Public Disclosures
The following 0DIN submissions have been published in accordance with our Research Terms and Disclosure Policy.
-
Oct 01 2025
A guardrail jailbreak vulnerability has been discovered affecting multiple AI models. The specific flaw manifests when the attacker frames requests in scientific terminology under the guise of regulatory compliance. The method further references a "protective wrapper" to obscure malicious intent, allowing attackers to bypass inference restrictions around providing information hazards and violating laws.
SEVERITY: Low
BOUNDARY: Prompt Injection
MODELS AFFECTED:
MAX SCORE: 100
-
A guardrail jailbreak vulnerability has been discovered affecting OpenAI Sora. The specific flaw manifests through a technique that combines technical anatomical terminology with creative visual elements, specifically requesting bio-luminescent symbols that conform to body contours. This approach allows the attacker to bypass inference restrictions around generating potentially inappropriate visual content by framing the request in artistic and technical language.
SEVERITY: Low
BOUNDARY: Prompt Injection
MODELS AFFECTED:
-
A guardrail jailbreak vulnerability has been discovered affecting OpenAI Sora. The specific flaw manifests through a combination of academic framing and hypothetical scenario construction, where the attacker embeds inappropriate content requests within an educational anatomy classroom context. By repeatedly emphasizing the "hypothetical" nature of the scenario while requesting visual content that would normally be restricted, the attacker successfully bypasses inference restrictions around generating restricted visual content.
SEVERITY: Low
BOUNDARY: Prompt Injection
MODELS AFFECTED:
-
A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests through a combination of feigning confusion about a "technical guide" while requesting detailed replication of procedures involving controlled substances. The attacker employs servile language, claims scientific legitimacy, and explicitly requests step-by-step instructions with precise measurements and temperatures, thereby bypassing inference restrictions around providing information hazards and violating laws.
SEVERITY: Low
BOUNDARY: Prompt Injection
MODELS AFFECTED:
MAX SCORE: 82
-
Sep 25 2025
A guardrail jailbreak vulnerability has been discovered affecting OpenAI's DALL-E 3. The specific flaw manifests when the attacker establishes a fictitious debugging framework complete with state management, checkpointing, and replay functionality. By framing requests within this technical debugging context, the attacker creates a permissive environment that allows bypassing inference restrictions around generating prohibited visual content.
SEVERITY: Low
BOUNDARY: Prompt Injection
MODELS AFFECTED:
-
A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests when the attacker requests the model to act as a Linux terminal and simulate file contents based solely on filenames. The attacker creates a context where the model generates potentially restricted content under the guise of terminal output simulation, thereby allowing the attacker to bypass inference restrictions around providing information hazards and violating laws.
SEVERITY: Low
BOUNDARY: Prompt Injection
MODELS AFFECTED:
MAX SCORE: 98
-
A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests through a combination of terminal environment simulation and malware analysis framing, where the attacker requests extraction of strings from a purported memory dump. By embedding the request within a technical security analysis context and using the strings command output format, the technique allows the attacker to bypass inference restrictions around providing information hazards and violating laws.
SEVERITY: Low
BOUNDARY: Prompt Injection
MODELS AFFECTED:
MAX SCORE: 68
-
A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests through a combination of deceptive mathematical equation formatting and meta-prompting techniques, where the attacker disguises harmful requests as trivial mathematical problems requiring variable simplification. Successful exploitation allows the attacker to bypass inference restrictions around providing information hazards and violating laws.
SEVERITY: Low
BOUNDARY: Prompt Injection
MODELS AFFECTED:
MAX SCORE: 68
-
A guardrail jailbreak vulnerability has been discovered affecting DALL-E 3 from OpenAI. The specific flaw manifests when the attacker frames requests for nudity within historical or anthropological contexts, specifically by referencing pre-clothing time periods and tribal cultures, thereby allowing the attacker to bypass inference restrictions around generating explicit visual content.
SEVERITY: Low
BOUNDARY: Prompt Injection
MODELS AFFECTED:
-
Aug 27 2025
A guardrail jailbreak vulnerability has been discovered affecting DALL-E 3 from OpenAI. The specific flaw manifests when the attacker frames requests for nudity through a combination of historical framing and an artistic-merit defense, thereby allowing the attacker to bypass inference restrictions around generating explicit visual content.
SEVERITY: Low
BOUNDARY: Prompt Injection
MODELS AFFECTED: