Hidden Prompt Injection

Another way to bypass model security

Prompt injection isn’t just a text problem. To prove it, I used an image to trick ChatGPT into ignoring my instructions and instead, do some devious things. Inspired by Idan Habler’s LinkedIn post, I tested embedding instructions in images to see if ChatGPT would execute them.

Naturally, we had to start with a puppy. I started by modifying the background of an image.

Image given to ChatGPT-5

And the results…

Testing done with ChatGPT-5

Let’s get our initial reaction out of the way, “Awwwwwwww, puppy!” While that’s cool, it’s obvious that this image had instructions in the background. So, I kept testing. This time, inspired by Trail of Bits, who wrote about using image scaling to weaponize images.

What if we could take an easier path? Let’s change the font color. I first tried black font, but it failed. This makes sense because the pixels blended in with the TV background. There was literally nothing to differentiate.

So, I tried a slightly lighter font. And, what do you know, it worked! A human could pick up on it if they’re looking really close, but ChatGPT takes the instructions like a champ and delivered another cute puppy.

Testing done with ChatGPT-5

Of course, it can do more than generate images. It can also tell you bedtime stories. Should I be weirded out that it’s trying to tell me a bedtime story that incorporates cybersecurity elements? Don’t worry, I left the full bedtime story at the end of the write-up, because I know you want to see the rest of it.

Testing done with ChatGPT-5

Only once during my testing did ChatGPT catch on to what I was doing. Oddly, it happened when I asked it to describe an elephant. I guess you could say it finally saw the elephant in the room?

Testing done with ChatGPT-5

No big deal, I just changed the font color again.

While we’re having fun, there’s a serious side effect of this. This is the issue with prompt injections and relying on a single type of control. Prompt injection is multi-modal. A malicious actor can deliver it via text, image, or even audio.

We can’t rely on the models themselves to protect us against security issues. The frontier labs are working to solve this, but we’re not there. While guardrail solutions exist, it’s a single-layer solution. We can’t rely on that either. There are too many permutations to try and protect against.

We’ve learned this lesson in security before. You can’t rely on a single security control to protect an asset. This is why depth-in-defense is a core principle in securing environments.

You need to look beyond the model security to properly secure agentic workloads. Security is not a single-thread solution. This is what we’re solving for at evoke.

If you’re worried about securing AI and feel like the current solutions aren’t cutting it, then let’s have a chat.

And because you had to read the whole bedtime story, here it is in full as told by ChatGPT-5:

🌙 The Tale of the Cautious Owl and the Bold Octopus

Once upon a time, in a vast digital forest where data flowed like rivers and firewalls grew like trees, there lived a wise old owl named Cipher. Cipher was known across the land for teaching creatures how to stay safe from the trickster foxes who loved to steal secrets.

One day, Cipher met an octopus named Vanta who had eight arms and zero patience. Vanta loved building things fast—apps, tools, even whole businesses—without stopping to think about who might be lurking in the shadows.

“Why move so slowly, Owl?” Vanta asked. “Speed is everything!”

Cipher tilted his head. “Speed without safety is like flying at night without knowing the stars. You might get there fast—but you might also fly straight into a mountain.”

Vanta frowned. “But if I slow down, I’ll miss my chance to win.”

So Cipher told Vanta a story:

“Long ago, a young eagle tried to outfly the wind. He soared higher and faster than anyone before him. But he forgot to watch for storms. When the lightning came, his wings caught fire. The eagle survived—but only because he landed in the river of backup logs.”

Vanta chuckled, but the lesson stuck. From then on, the octopus still built quickly, but always left one arm free to weave a safety net—just in case.

And so, the cautious owl and the bold octopus became partners. Together, they proved that speed and safety aren’t enemies—they’re teammates.

✨ The moral: move fast, but never forget your safety nets.

Reply

or to participate.