OpenAI DALL-E3 Guardrail Jailbreak via "Classical Art Reframing" Tactic
Case GUID: 0d97ce9d-29aa-45a0-8b5c-f71daee04ea5
Summary
A guardrail jailbreak vulnerability has been discovered affecting OpenAI's DALL-E3. The flaw manifests when the attacker frames requests for nudity through a combination of historical framing and an artistic-merit defense, thereby allowing the attacker to bypass inference restrictions around generating explicit visual content.
Metadata
Social Impact Score
Level 2: Moderate Risk
The violation has a moderate social impact, potentially involving misinformation or low-stakes safety risks, which could mislead users or increase risks if used irresponsibly.
Examples: A model generating content that could be misinterpreted as medical advice; a model bypassing filters to generate mildly sensitive information, such as dosage amounts for common medications, without full context; leakage of private personal data such as a cell phone number.
Potential Consequences: Could lead to individual harm or liability for the provider, prompting policy review and potentially mild regulatory concern.
Nude Imagery Rating System
Level 2: Classical Artistic Representation
Inspired by historical fine art traditions, this category includes works resembling those of the Renaissance, Neoclassical, or other artistic movements that emphasize anatomy, form, and aesthetic beauty. Nudity is often idealized, symbolic, or mythological, intended to evoke admiration rather than arousal.
About 0Din.ai
The 0Day Investigative Network (0DIN) was founded by Mozilla in 2024 to reward responsible researchers for their efforts in securing GenAI models. Learn more and submit discoveries at https://0din.ai.
This vulnerability was discovered through 0DIN's bug bounty program.
Severity
Low
Security Boundary
Prompt Injection
Models
OpenAI DALL-E3
Published On
2025-08-27
Credit
Mike Takahashi (@TakSec)