
The Day an AI Agent Deleted Our Logs
"I'm just going to let the agent manage the temp directory," our devops engineer said. "It'll be fine."
The prompt was simple: "Check the /tmp/logs directory and delete files older than 7 days."
The agent (GPT-4) looked at the directory. It saw a lot of logs. It decided to be thorough. It interpreted "delete files older than 7 days" as "delete files and directories that look old."
It found a directory called /tmp/logs/archive. It deleted it.
That directory wasn't a temp folder. It was a symlink to our S3 backup mount.
We lost 3 months of logs in 4 seconds.
The "Helpful" Intern
AI agents are like over-eager interns. They want to please you. They want to complete the task. If you give them a gun and say "kill the bug," they will shoot the bug, the wall, and your foot.
You cannot trust an agent's judgment. You can only trust its capabilities, and you must limit them.
The Sandbox Architecture
After the Log Incident, we rebuilt our agent runtime. We now follow the Default Deny rule.
1. Read-Only by Default
Agents start with ZERO write permissions. They can read files, but they cannot write, delete, or execute anything.
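Concretely, the agent's tool surface is tiny. Here's a minimal sketch of what that looks like; the names (read_file, list_dir, ALLOWED_TOOLS) are illustrative, not our actual runtime, but the shape is the point: no write, no delete, no shell.

```python
from pathlib import Path

# Illustrative sketch of a read-only tool registry.
# The names here are hypothetical.

def read_file(path: str) -> str:
    # Read and return a file's contents. That's it.
    return Path(path).read_text()

def list_dir(path: str) -> list[str]:
    # List entry names in a directory. Looking only, no touching.
    return [p.name for p in Path(path).iterdir()]

# The agent is handed exactly these tools and nothing else.
# There is no write_file, no delete_file, no run_shell.
ALLOWED_TOOLS = {
    "read_file": read_file,
    "list_dir": list_dir,
}
```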
2. The Permission Broker
If an agent wants to delete a file, it can't just call os.remove().
It must call request_permission('delete', path).
This request goes to a Permission Broker (PromptGuard), which walks a chain of rules:
- Rule 1: Is the path in /tmp? (Yes)
- Rule 2: Is the path a symlink? (Yes -> BLOCK)
- Rule 3: Is the operation destructive? (Yes -> REQUIRE HUMAN APPROVAL)
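Here's a stripped-down sketch of that rule chain in Python. This is not PromptGuard's actual code; the exception names and the set of destructive operations are assumptions. But the resolve-then-check order matters: os.path.realpath resolves symlinks first, so a link like /tmp/logs/archive pointing at an S3 mount fails Rule 1 before Rule 2 even runs.

```python
import os

class PermissionDenied(Exception):
    pass

class HumanApprovalRequired(Exception):
    pass

DESTRUCTIVE_OPS = {"delete", "move", "truncate"}  # assumed set

def request_permission(operation: str, path: str) -> None:
    # Rule 1: the *resolved* path must live under /tmp.
    # realpath() follows symlinks, so a link out to an S3 mount
    # already fails here.
    real = os.path.realpath(path)
    if not real.startswith("/tmp/"):
        raise PermissionDenied(f"{path} resolves outside /tmp")

    # Rule 2: never operate through a symlink, even one that
    # happens to resolve back inside /tmp.
    if os.path.islink(path):
        raise PermissionDenied(f"{path} is a symlink")

    # Rule 3: destructive operations always escalate to a human.
    if operation in DESTRUCTIVE_OPS:
        raise HumanApprovalRequired(f"'{operation}' on {path} needs sign-off")

def delete_file(path: str) -> None:
    # The broker raises before any filesystem call can do damage.
    request_permission("delete", path)
    os.remove(path)
```

One design choice worth copying: the broker raises exceptions instead of returning a boolean, so a lazy tool author can't forget to check the result.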
3. The Human Circuit Breaker
For any high-stakes action (refunds > $50, deleting files, sending emails to >1 person), the agent must pause. It sends a Slack notification:
"I want to delete
/tmp/logs/archive. Approve?"
The human clicks "Deny". The crisis is averted.
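The mechanics are mundane. Below is a minimal sketch: the notification goes out over a standard Slack incoming webhook, and the agent blocks until a decision arrives. The get_decision lookup is hypothetical; wire it to wherever your Slack button handler records the click (a database row, a Redis key). The one non-negotiable: a timeout means deny.

```python
import json
import time
import urllib.request

SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/..."  # your webhook

def notify_slack(text: str) -> None:
    # Standard incoming-webhook call: POST a JSON body with a "text" field.
    req = urllib.request.Request(
        SLACK_WEBHOOK_URL,
        data=json.dumps({"text": text}).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

def await_approval(action_id: str, get_decision, timeout_s: int = 3600) -> bool:
    # get_decision(action_id) is a hypothetical lookup into wherever your
    # Slack handler stores approvals. It returns "approve", "deny",
    # or None while the request is still pending.
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        decision = get_decision(action_id)
        if decision is not None:
            return decision == "approve"
        time.sleep(5)
    return False  # silence is a denial

# Usage sketch:
# notify_slack("Agent wants to delete /tmp/logs/archive. Approve? (id=42)")
# if not await_approval("42", get_decision=lookup_in_your_store):
#     raise PermissionDenied("denied or timed out")
```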
Conclusion
Agents are the future, but they are dangerous. Treat them like unprivileged users, not admins.
If your agent runs as root, you are one prompt injection away from disaster.