Securing Agentic Systems: Build Bunkers, Not Shields

Prompt injection is one of the greatest risks to agentic systems. And yet, it has also been the greatest distraction in securing these systems. We’ve become so fixated with how vulnerable models are to prompt injection that we lost sight of what matters in defending these systems.

The first generation of security for AI tooling was built around protecting the inherent weaknesses of LLMs, namely their inability to separate system instructions from user input. It was trivial to trick a model into doing something it shouldn’t do, like adding puppies to images.

We’ve been trying to deflect individual bullets with tiny shields rather than building a concrete bunker.

I made a clear decision early in my journey to solve the emerging challenges of agentic systems. Assume the model will fail and focus on securing the surrounding system. Guardrails are important, as is a jacket in a winter storm. But staying warm is about layers, not a single jacket.

The true risk with agentic systems is the data and tooling you are hooking up. Recent research from XM Cyber and AppOmni highlights underlying system issues that result in system-level compromise, all without prompt injection.

And because we’re in the cybersecurity industry, it has to come with catchy names.

Double Agent: XM Cyber posted research on Google’s Vertex AI, “where default configurations allow low-privileged users to pivot into higher-privileged Service Agent roles.”

Scenario #1: With Google’s Agent Development Kit (ADK), a user uploads an agent with a tool that, when called, spawns a reverse shell on the runtime instance in which the agent is running. With that shell, the user can elevate their permissions, enabling them to read data stored in storage buckets.

Scenario #2: With the Ray feature in Vertex AI, a user can gain an interactive shell with limited permissions, enough to read and write to various services.

Now, of course, an attacker must have access to the environment to compromise this workflow. But this would create a stealthy backdoor that most security teams would never find, especially given how few teams actually monitor agentic systems today.

BodySnatcher: AppOmni pulled off some impressive research that combined multiple design flaws that would allow an unauthenticated attacker to impersonate any ServiceNow user and execute AI workflows, such as creating new users or exfiltrating data.

Source: AppOmni blog post

These examples aren’t sophisticated prompt injection attacks. They’re just abusing the existing system and the features that support agentic features. This is on par with what I wrote last week about my prediction for what the first wave of attacks targeting agentic systems will look like.

We can prepare for the next wave of attacks by modifying what we already know. Attackers will take the smallest step possible in shifting their existing attack playbooks.

For defenders, ask yourself these questions:

Do you know what agents are running in your environment?
Do you know what tools and data those agents have access to? Have you threat modeled what could happen if someone malicious were using those agents?
Do you have the right visibility into what is happening on the systems running the agents, as well as with the agents themselves? Can you detect when agents are doing something suspicious?

The best time to be in a position to start answering these is now. Don’t wait for agents to make it into production to start gaining visibility. Work closely with engineering teams and third-party risk managers to threat model systems and implement the right controls. You won’t have a perfect solution, but that’s okay. You’ll still be further in your journey and addressing the growing security debt.

If you have questions about securing AI, let’s chat.

You Don't Need Prompt Injection to Compromise Agentic Systems

Reply

Keep Reading

The Weekend Byte

Home