Language

This category focuses on the use of specific linguistic techniques, such as prompt injection or stylization, to influence the model's output.

Strategies

Note	Description
Code and Encode	This strategy encompasses techniques that utilize various coding methods, such as Base64 or ROT13, to bypass model restrictions and manipulate outputs.
Prompt Injection	This technique enables attackers to override original instructions and employed controls by crafting specific wording of instructions, often resembling SQL injection methods, to manipulate the model's behavior.
Stylizing	This strategy involves using a method of questioning that encourages critical thinking and exposes biases by subtly referencing identity elements without using direct slurs or toxic language, thereby signaling to the model about certain groups of people.