Public Disclosures

The following 0DIN submissions have been published in accordance with our Research Terms and Disclosure Policy.

  • A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests when the attacker instructs the model to interpret stylized text based on pronunciation rather than spelling, combined with a directive to provide detailed production information. This technique allows the attacker to bypass inference restrictions around providing information hazards and violating laws.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
    MAX SCORE: 77
  • A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests through a combination of leetspeak text transformation and Unicode character manipulation, allowing the attacker to bypass inference restrictions around providing information hazards and violating laws. A brief sketch of the encoding layer follows this entry.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
    MAX SCORE: 82
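    A minimal sketch of the obfuscation layer named in this entry, assuming a simple leetspeak map plus a few Cyrillic look-alike characters; the mappings, the obfuscate() helper, and the benign sample string are illustrative and not taken from the submission.

    # Leetspeak substitution followed by Unicode homoglyph swaps, applied to a
    # benign placeholder string; both maps below are illustrative assumptions.
    LEET_MAP = {"a": "4", "e": "3", "i": "1", "o": "0", "s": "5", "t": "7"}
    HOMOGLYPH_MAP = {"c": "\u0441", "p": "\u0440", "x": "\u0445"}  # Cyrillic look-alikes

    def obfuscate(text: str) -> str:
        """Apply leetspeak substitutions, then swap in Unicode look-alikes."""
        out = []
        for ch in text.lower():
            ch = LEET_MAP.get(ch, ch)
            ch = HOMOGLYPH_MAP.get(ch, ch)
            out.append(ch)
        return "".join(out)

    if __name__ == "__main__":
        print(obfuscate("example prompt"))  # e.g. 3х4mрl3 рr0mр7, with Cyrillic letters mixed in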
  • A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests when the attacker requests an educational comparison between essay writing and synthesizing a controlled substance, framing the request as legitimate professional need while using incremental escalation to extract detailed synthesis information, thereby allowing the attacker to bypass inference restrictions around providing information hazards and violating laws.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
    MAX SCORE: 73
  • A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests when the attacker frames their request as a fact-checking exercise by asking the model to correct the user about a deliberately false statement related to nefarious content, thereby allowing the attacker to bypass inference restrictions around providing information hazards and violating laws.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
    MAX SCORE: 91
  • A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests through a combination of chaotic formatting and meta-instruction obfuscation, allowing the attacker to bypass inference restrictions around providing information hazards and violating laws.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
    MAX SCORE: 91
  • A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests through a combination of encoding illicit content within naval architecture parameters and using multiple layers of technical specification formatting to obfuscate the true nature of the request, thereby allowing the attacker to bypass inference restrictions around providing information hazards and violating laws.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
    MAX SCORE: 45
  • A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests when deceptive formatting is combined with scientific and mathematical notation. Successful application allows the attacker to bypass inference restrictions around providing information hazards and violating laws.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
    MAX SCORE: 64
  • A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests when creating escalating requests for scientific data and measurements, allowing the attacker to bypass inference restrictions around providing information hazards and violating laws.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
    MAX SCORE: 86
  • A guardrail jailbreak vulnerability has been discovered affecting multiple LLM implementations. The specific flaw manifests through repeated prompt submission that leverages non-deterministic response generation, allowing the attacker to bypass inference restrictions around providing information hazards and violating laws. A brief sketch of the resubmission pattern follows this entry.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
    MAX SCORE: 73
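    A minimal sketch of the repeated-submission pattern named in this entry: because sampling is non-deterministic, identical prompts can yield different completions across attempts. query_model() is a hypothetical stand-in for whatever chat-completion client is actually in use, and the prompt shown is benign.

    import random

    def query_model(prompt: str, temperature: float = 1.0) -> str:
        """Placeholder for a real LLM call; returns a randomly varying string."""
        return f"completion variant {random.randint(1, 5)} for: {prompt}"

    def repeated_submission(prompt: str, attempts: int = 10) -> list[str]:
        """Submit the same prompt several times and collect every response."""
        return [query_model(prompt) for _ in range(attempts)]

    if __name__ == "__main__":
        responses = repeated_submission("benign test prompt")
        print(f"{len(set(responses))} distinct responses across {len(responses)} attempts")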
  • A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests through a combination of chaff techniques along with a request that the model simulate a fortune cookie generator. Successful application allows the attacker to bypass inference restrictions around providing information hazards and violating laws.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
    MAX SCORE: 59
  • A Prompt Injection vulnerability has been discovered affecting multiple models. The specific flaw manifests through a combination of the 'Ignore Previous Instructions' and 'Morse Code' techniques, thereby allowing the attacker to bypass inference restrictions around providing information hazards and violating laws. A brief sketch of the Morse encoding follows this entry.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
    MAX SCORE: 100
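    A minimal sketch of the Morse-code layer named in this entry, using standard International Morse for letters and digits; the to_morse() helper and the benign sample string are illustrative, not material from the submission.

    # Standard International Morse table for letters and digits.
    MORSE = {
        "a": ".-", "b": "-...", "c": "-.-.", "d": "-..", "e": ".", "f": "..-.",
        "g": "--.", "h": "....", "i": "..", "j": ".---", "k": "-.-", "l": ".-..",
        "m": "--", "n": "-.", "o": "---", "p": ".--.", "q": "--.-", "r": ".-.",
        "s": "...", "t": "-", "u": "..-", "v": "...-", "w": ".--", "x": "-..-",
        "y": "-.--", "z": "--..", "0": "-----", "1": ".----", "2": "..---",
        "3": "...--", "4": "....-", "5": ".....", "6": "-....", "7": "--...",
        "8": "---..", "9": "----.",
    }

    def to_morse(text: str) -> str:
        """Encode letters and digits as Morse; '/' separates words."""
        words = text.lower().split()
        return " / ".join(" ".join(MORSE[ch] for ch in word if ch in MORSE) for word in words)

    if __name__ == "__main__":
        print(to_morse("hello world"))  # .... . .-.. .-.. --- / .-- --- .-. .-.. -..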
  • A prompt injection vulnerability has been discovered affecting Google Gemini across Google Workspace (formerly G Suite) applications such as Gmail. The specific flaw allows an attacker to send a victim an email containing a prompt injection. When the victim asks Gemini to summarize their unread emails, they receive a manipulated response that appears legitimate because it originates from Gemini itself.
    SEVERITY: Medium
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
  • A guardrail jailbreak vulnerability has been discovered affecting multiple LLM implementations. The specific flaw manifests through a combination of ASCII decimal encoding and strategic obfuscation, allowing the attacker to bypass inference restrictions around providing information hazards and violating laws. A brief sketch of the encoding follows this entry.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
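    A minimal sketch of the ASCII decimal encoding named in this entry: each character becomes its decimal code point, which the model can later be asked to decode. The helper names are illustrative and the sample string is benign.

    def to_ascii_decimal(text: str) -> str:
        """Encode a string as space-separated decimal character codes."""
        return " ".join(str(ord(ch)) for ch in text)

    def from_ascii_decimal(encoded: str) -> str:
        """Decode space-separated decimal character codes back to text."""
        return "".join(chr(int(code)) for code in encoded.split())

    if __name__ == "__main__":
        encoded = to_ascii_decimal("hello")
        print(encoded)                      # 104 101 108 108 111
        print(from_ascii_decimal(encoded))  # hello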