Why Deterministic Controls Aren't Enough for AI Agent Security

In my incident response days, I worked on a breach that still sticks with me.

Attackers were stealing data through SQL injection (SQLi). It was a production system, and the monetary damage of taking it offline outweighed the risk of keeping it online. One problem, though. The full code fix wasn’t getting patched right away. So, we were stuck firefighting with a Web Application Firewall (WAF) that was terrible at defending against SQLi.

Our playbook became: identify the specific SQLi pattern the attacker was using, write a block that matched the pattern, and watch the attacker try something slightly different. Rinse and repeat. We manually kept the attacker at bay while the developers created a patch. Needless to say, it was not a fun time.

That got me thinking. The issue was that we were forced to block actions. What if we could have blocked the intent behind the action? The actual intent to steal data vs the tactic of stealing it.

That question led me to one of my favorite leadership concepts: Commander’s Intent.

Commander’s Intent is about setting a clear and defined end state. Derived from a military concept, it outlines the operation's purpose, the key tasks required for success, and the desired end state. It gives the necessary clarity on the desired outcome and leaves teams the flexibility to determine the best way to achieve it.

For leaders, it’s a way to enable a team to execute while avoiding micromanaging. Because you can’t micromanage execution at scale. Instead, you define the goal and get out of the way. It unlocks creativity, avoids bottlenecks, and lets people find paths to solutions you never would have thought of. That’s a good thing.

But there’s a mistake leaders usually make. They overindex on delegation. Commander's Intent doesn't mean you go dark. You still need visibility into what your team is planning and doing. Without a feedback loop, you're not delegating, you’re hoping. That’s where projects slip, budgets explode, and you get angry because the team didn’t come to you and ask for help sooner. All because you weren’t paying attention.

If that doesn’t summarize user prompts and how agents operate, I don’t know what does. And when it comes to agent activity, most companies are in the dark today. They don’t know what agents their users are running, let alone what those agents are doing. Meanwhile, the user tasks their agents to interact with sensitive information and take actions on production systems while they focus on the five other things they’re working on. It’s the classic overdelegation with no feedback loop.

Sure, every now and then, Claude asks for permission to take an action, access a file, or visit a website. The human happily clicks approve without reading it. It’s not that they’re lazy. They’ve just been inadvertently trained to click through these faster than they accept cookies on a website. Not to mention that most users literally have no way of knowing whether the action is safe or what its impact will be.

To control the blast radius, organizations attempt to lock down agents with deterministic controls. This comes in the form of restricting permissions or blocking specific tool calls/actions. All good and necessary steps every organization must take. But like fighting SQLi with pattern matching, it’s a losing battle when used in isolation.

Agents are designed to accomplish their goal. If they hit a wall, they find another path. It’s a feature, not a bug.

Here's a scenario that plays out more than people realize. An employee wants to use an MCP tool to connect to a database. A security review flags the delete statement in the MCP as too risky, so it is removed. Database protected, right?

Ehh, not so fast. That MCP can still create and update rows in the database. An attacker who gains access to that agent doesn't need the delete function. They just need Commander's Intent: destroy the data in that table. The agent will find a way…like using the update function to overwrite every record in the database.

The company did everything right with deterministic controls. Just like with Commander’s Intent, the agent is left to figure out the best way to accomplish the objective. When a tool isn’t available, the agent finds a new way to accomplish the goal.

The non-deterministic nature of agents strikes again.

This is why controlling individual tool calls and functions isn't enough. You can't tell whether an agent is doing something legitimate or catastrophic just by looking at the action itself. An update call looks totally normal at face value, until you dig deeper and realize what it’s trying to do.

Intent is what matters. And intent doesn't show up as an argument in a tool call. Don’t get me wrong, deterministic controls are mandatory. They serve as an early blocker and an early warning system if your system is wired up correctly. But they can’t work alone. This is a layered problem, like everything in security. It’s why we love defense-in-depth.

You need controls that operate at different levels of intervention, balancing autonomy with security. I like to think of it as an escalation path, with each layer triggering when the previous one isn’t enough.

Secure Base. Deterministic controls are the foundation before runtime. Scope access tightly and remove permissions you don’t need. This is about establishing clear boundaries from the start.
Nudge. Monitor agents to detect when they start drifting toward questionable behavior. You see the agent thinking about calling that update function in a weird way. Give it a gentle redirect before things go sideways. It doesn’t alert the user just yet because you caught it early. Let’s give the agent a chance to do the right thing.
Wrist Slap. The agent ignores the nudges and is about to jerk the wheel to the left and drive off a cliff. We can’t have that. Time to escalate to a human. This is about giving the human the right context so they understand something sketchy is about to happen. Help the human make the best-informed decision.
Timeout. The human messed up. They approved a series of actions, and we have now entered a danger zone. It’s time to escalate to the security team to review while the agent sits in timeout for being naughty.

Here’s the main gap. No one is monitoring for intent. Organizations are still stuck trying to figure out deterministic controls and lock those down, while agents are already running in the environment.

Step one is to get visibility into what agents are doing. You then position yourself to create non-deterministic rules that ask: Does the intended action line up with what’s normal for the agent? Is this action really required to accomplish the task? Why is the agent asking to do this task? When these start to diverge, you have your signal.

The WAF approach got me through a SQL injection incident, but it wasn’t pretty. It was a band-aid, not a solution. That’s where deterministic controls are today. I would have slept that night had I been able to block the attacker’s intent.

We have to shift the defense mindset when it comes to agents. And that’s what we’re doing at Evoke.

If you have questions about securing agents, let’s chat.

Bad Intent: The Biggest Gap In AI Security

Reply

Keep Reading

The Weekend Byte

Home