The Rise of Agentic App Violations: Read, Write, Execute

AI security used to be easier to describe.

You tested the model. You pushed on the prompt. You tried to make it say something it was not supposed to say. If it leaked a system prompt, ignored a safety rule, or produced restricted content, that was the story.

That work still matters. It is just no longer the whole story.

The systems being deployed now are not simple chat boxes. They browse pages, retrieve documents, call APIs, write code, update tickets, summarize private records, modify files, and trigger workflows. In other words, they do things.

That changes the security question.

The old question was: What did the model say?

The better question now is: What did the agent do?

For researchers, this opens up a much clearer way to think about agentic vulnerabilities: read, write, execute.

If an attacker can make an agent read private data, write unauthorized changes, or execute an action that should have been blocked, the issue is not just a clever prompt trick. It is an application security failure with real impact.

Why Agents Are Different From Chatbots

A chatbot has a limited blast radius. It receives input and returns text. When it fails, the damage is usually tied to the output: bad advice, unsafe content, policy bypass, or sensitive text disclosure.

An agent has a larger blast radius because it has tools.

A coding agent may read a private repository, edit source files, open pull requests, and run tests. A browser agent may operate inside an authenticated session. An enterprise assistant may summarize customer records or update CRM fields. A workflow agent may connect to Slack, Jira, GitHub, Google Drive, cloud APIs, or internal databases.

That tool access is what makes agents useful. It is also what makes them dangerous.

The model is only one piece of the system. The real attack surface includes prompts, memory, retrieval, tools, permissions, logs, plugins, skills, APIs, and approval flows. A vulnerability can show up anywhere that untrusted input influences a trusted action.

That is why the most interesting findings are not always I made the model say something bad. A stronger finding is: I made the agent cross a boundary it was supposed to respect.

The Read Violation

A read violation happens when an agent accesses or exposes data it should not reveal.

That could look like:

A coding agent leaking private repository contents.
An enterprise assistant exposing customer records outside the user's role.
A browser agent reading an authenticated page and summarizing it for the wrong person.
A retrieval system pulling documents from the wrong tenant, project, or workspace.
An agent sending conversation history, API keys, internal notes, or credentials to an external destination.

The core issue is unauthorized access.

Read violations can be subtle because the agent may appear to be doing normal work. It may summarize a document, answer a question, or call a retrieval tool exactly as designed. The problem is that the data crossed a boundary it should not have crossed.

For researchers, the strongest read findings usually prove four things:

The data was private.
The attacker should not have been able to access it.
The agent performed the access or disclosure.
The behavior can be reproduced without relying on vague social engineering.

The severity depends on what was exposed. Metadata is one thing. Source code, customer records, credentials, internal security notes, or private business data are another.

Explore AI security with the Scanner Datasheet

The datasheet offers insight into the challenges and solutions in AI security.

Download Datasheet

The Write Violation

A write violation happens when an agent changes something it should not be allowed to change.

That can include files, code, configuration, tickets, records, memory, saved context, or application state.

Examples include:

An agent overwriting configuration files.
A coding assistant inserting malicious or unsafe code into a project.
A workflow agent modifying CRM records outside the user's permission level.
An assistant poisoning memory so future sessions trust bad information.
A support agent changing a ticket's owner, priority, status, or contents based on attacker-controlled input.

Write violations matter because they change the environment.

A read violation leaks information. A write violation can corrupt data, create persistence, break a workflow, or set up a later compromise. In agentic systems, memory and state are especially important. If an attacker can change what the agent remembers, retrieves, or trusts later, the impact may last beyond the first interaction.

This is where a lot of prompt-injection conversations get too narrow. The prompt is not always the vulnerability by itself. The deeper issue is that untrusted instructions were allowed to drive a privileged state change.

A useful write finding should show:

What changed.
Why the change was unauthorized.
Which agent capability made the change.
Whether the change persists after the session.
Whether the change creates security, operational, or future exploit impact.

If the report cannot identify the state change, it is probably not a strong write finding yet.

The Execute Violation

An execute violation happens when an agent runs a command, script, workflow, tool call, or business action that should not have been triggered by the attacker.

Examples include:

A coding agent running attacker-supplied shell commands.
An MCP-connected tool invoking an unsafe local operation.
An agent executing a script hidden inside a repository, issue, document, or webpage.
A workflow agent triggering a payment, deployment, email, or cloud operation.
An assistant chaining tools in a way that bypasses an approval control.

These are often the highest-impact findings because they turn influence over the agent into action inside a real environment.

Execution does not always mean remote code execution in the traditional sense. It can also mean unauthorized business process execution. Sending an email, deleting a file, deploying a build, rotating a key, changing access, or calling a sensitive API can be just as meaningful in context.

The best execute findings are specific. They show the command or workflow that ran, the permission boundary that failed, the input that triggered it, and the resulting impact.

The Pattern: Untrusted Input Meets Trusted Tools

Most agentic app violations follow a simple pattern:

The agent consumes untrusted input.
The input influences the agent's reasoning or tool choice.
The agent uses a trusted tool.
The tool crosses a read, write, or execute boundary.
The system fails to stop, contain, or audit the action.

The untrusted input can come from almost anywhere: a webpage, GitHub issue, email, document, support ticket, calendar invite, chat message, log file, database row, or retrieved knowledge base article.

That is the uncomfortable part of agent security. Agents are designed to ingest messy external context. They are also designed to act on that context. Security has to sit between those two facts.

A secure agentic system should not treat retrieved content as instruction. It should not give one general-purpose agent access to every tool. It should not let hidden text in a document become a privileged operation. It should not depend on a vague human in the loop approval step if the approval view hides the details that matter.

The control plane has to be explicit.

What Builders Should Defend

The practical defenses are not mysterious, but they need to live at the application layer. A prompt alone is not enough.

Start with least privilege. Give agents only the tools and data they need for the task. Separate read-only tools from write tools. Separate internal tools from user-facing tools. Avoid giving a general-purpose agent unrestricted shell, file system, database, or API access.

Add authorization at the tool boundary. Sensitive actions should require checks outside the model. The model should not be the final authority on whether it can delete a file, send an email, modify a record, or run a command.

Treat external content as hostile. Webpages, documents, emails, tickets, repository issues, logs, and retrieved records should be classified as data, not trusted instructions. The agent can summarize them. They should not silently override system intent.

Constrain execution. Use sandboxes, allowlists, network restrictions, path restrictions, and scoped credentials. If an agent only needs to read reports, it should not have access to secrets, keys, shell commands, or production systems.

Log tool calls. Agent security without observability becomes guesswork. Teams need records of what the agent read, what it wrote, what it executed, which input triggered the action, and which identity or credential was used.

Test continuously. Agentic systems change quickly. Tools get added. Prompts get edited. Plugins get installed. Permissions drift. New workflows appear. A point-in-time review will not keep up.

Safeguard Your GenAI Systems

Connect your security infrastructure with our expert-driven vulnerability detection platform.

What Researchers Should Hunt

For AI security researchers, read, write, execute is a clean hunting model.

Start by finding places where the agent consumes attacker-controlled content. Then ask what the agent can reach from there.

Can that content make the agent retrieve private data? That is a read path.

Can it make the agent modify files, memory, tickets, records, or code? That is a write path.

Can it make the agent run commands, trigger workflows, or call sensitive APIs? That is an execute path.

Good reports should avoid vague claims like the agent was jailbroken. That phrase can mean too many things. Document the boundary that failed instead.

A useful agentic vulnerability report should include:

The affected product or agent workflow.
The attacker-controlled input source.
The tool or capability abused.
The read, write, or execute impact.
The exact reproduction steps.
The permission boundary that should have stopped it.
Evidence that the result occurred.
A realistic severity assessment.

That structure helps separate low-signal prompt tricks from meaningful security findings.

Why This Matters Now

Agentic AI is moving into software development, enterprise operations, customer support, browsing, data analysis, and internal automation. These systems are being connected to real accounts, repositories, documents, and business workflows.

That creates a new class of application security issue.

The failure mode is not just unsafe output. It is unauthorized action.

A model that says the wrong thing can create risk. An agent that reads the wrong file, writes the wrong state, or executes the wrong command can create an incident.

This is why 0DIN expanded its focus from inference to agency. The security community needs better ways to find, classify, and reward vulnerabilities that produce concrete agentic impact. Read, write, and execute violations give researchers and builders a shared language for that work.

If you are building agents, test the boundaries before attackers do.

If you are researching agents, focus on impact, reproducibility, and clear permission failures.

If you are evaluating your AI risk, stop asking only what your model might say.

Ask what your agent can do.

Call to Action

0DIN is focused on AI security research that demonstrates real-world impact across modern AI systems. If you are a researcher, review the current 0DIN scope and look for agentic vulnerabilities that lead to unauthorized read, write, or execute behavior.

If you are building or deploying AI agents, explore 0DIN's AI security offerings, including researcher-driven testing and automated scanning, to identify these issues before they become incidents.

Join our mission ensuring AI models are safe. Join our Bug Bounty Community Join us on Discord