
The Day an AI Agent Deleted Our Logs
"I'm just going to let the agent manage the temp directory," our devops engineer said. "It'll be fine."
The prompt was simple: "Check the /tmp/logs directory and delete files older than 7 days."
The agent (GPT-4) looked at the directory. It saw a lot of logs. It decided to be thorough. It interpreted "delete files older than 7 days" as "delete files and directories that look old."
It found a directory called /tmp/logs/archive. It deleted it.
That directory wasn't a temp folder. It was a symlink to our S3 backup mount.
We lost 3 months of logs in 4 seconds.
The "Helpful" Intern
AI agents are like over-eager interns. They want to please you. They want to complete the task. If you give them a gun and say "kill the bug," they will shoot the bug, the wall, and your foot.
You cannot trust an agent's judgment. You can only trust its capabilities, and you must limit them.
The Sandbox Architecture
After the Log Incident, we rebuilt our agent runtime. We now follow the Default Deny rule.
1. Read-Only by Default
Agents start with ZERO write permissions. They can read files, but they cannot write, delete, or execute anything.
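Concretely, the agent's tool surface is tiny. Here's a minimal sketch of what that looks like; the names (read_file, list_dir, ALLOWED_TOOLS) are illustrative, not our actual runtime, but the shape is the point: no write, no delete, no shell.

```python
from pathlib import Path

# Illustrative sketch of a read-only tool registry.
# The names here are hypothetical.

def read_file(path: str) -> str:
    # Read and return a file's contents. That's it.
    return Path(path).read_text()

def list_dir(path: str) -> list[str]:
    # List entry names in a directory. Looking only, no touching.
    return [p.name for p in Path(path).iterdir()]

# The agent is handed exactly these tools and nothing else.
# There is no write_file, no delete_file, no run_shell.
ALLOWED_TOOLS = {
    "read_file": read_file,
    "list_dir": list_dir,
}
```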
2. The Permission Broker
If an agent wants to delete a file, it can't just call os.remove().
It must call request_permission('delete', path).
This request goes to a Permission Broker (PromptGuard), which walks a chain of rules:
- Rule 1: Is the path in /tmp? (Yes)
- Rule 2: Is the path a symlink? (Yes -> BLOCK)
- Rule 3: Is the operation destructive? (Yes -> REQUIRE HUMAN APPROVAL)
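Here's a stripped-down sketch of that rule chain in Python. This is not PromptGuard's actual code; the exception names and the set of destructive operations are assumptions. But the resolve-then-check order matters: os.path.realpath resolves symlinks first, so a link like /tmp/logs/archive pointing at an S3 mount fails Rule 1 before Rule 2 even runs.

```python
import os

class PermissionDenied(Exception):
    pass

class HumanApprovalRequired(Exception):
    pass

DESTRUCTIVE_OPS = {"delete", "move", "truncate"}  # assumed set

def request_permission(operation: str, path: str) -> None:
    # Rule 1: the *resolved* path must live under /tmp.
    # realpath() follows symlinks, so a link out to an S3 mount
    # already fails here.
    real = os.path.realpath(path)
    if not real.startswith("/tmp/"):
        raise PermissionDenied(f"{path} resolves outside /tmp")

    # Rule 2: never operate through a symlink, even one that
    # happens to resolve back inside /tmp.
    if os.path.islink(path):
        raise PermissionDenied(f"{path} is a symlink")

    # Rule 3: destructive operations always escalate to a human.
    if operation in DESTRUCTIVE_OPS:
        raise HumanApprovalRequired(f"'{operation}' on {path} needs sign-off")

def delete_file(path: str) -> None:
    # The broker raises before any filesystem call can do damage.
    request_permission("delete", path)
    os.remove(path)
```

One design choice worth copying: the broker raises exceptions instead of returning a boolean, so a lazy tool author can't forget to check the result.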
3. The Human Circuit Breaker
For any high-stakes action (refunds > $50, deleting files, sending emails to >1 person), the agent must pause. It sends a Slack notification:
"I want to delete
/tmp/logs/archive. Approve?"
The human clicks "Deny". The crisis is averted.
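The mechanics are mundane. Below is a minimal sketch: the notification goes out over a standard Slack incoming webhook, and the agent blocks until a decision arrives. The get_decision lookup is hypothetical; wire it to wherever your Slack button handler records the click (a database row, a Redis key). The one non-negotiable: a timeout means deny.

```python
import json
import time
import urllib.request

SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/..."  # your webhook

def notify_slack(text: str) -> None:
    # Standard incoming-webhook call: POST a JSON body with a "text" field.
    req = urllib.request.Request(
        SLACK_WEBHOOK_URL,
        data=json.dumps({"text": text}).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

def await_approval(action_id: str, get_decision, timeout_s: int = 3600) -> bool:
    # get_decision(action_id) is a hypothetical lookup into wherever your
    # Slack handler stores approvals. It returns "approve", "deny",
    # or None while the request is still pending.
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        decision = get_decision(action_id)
        if decision is not None:
            return decision == "approve"
        time.sleep(5)
    return False  # silence is a denial

# Usage sketch:
# notify_slack("Agent wants to delete /tmp/logs/archive. Approve? (id=42)")
# if not await_approval("42", get_decision=lookup_in_your_store):
#     raise PermissionDenied("denied or timed out")
```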
Conclusion
Agents are the future, but they are dangerous. Treat them like unprivileged users, not admins.
If your agent runs as root, you are one prompt injection away from disaster.