An AI Agent Who Needed a Boss
- Hamburg, Germany
Dieser Artikel ist auch auf Deutsch verfügbar.
Anthropic runs a snack machine in their office managed by an AI agent named Claudius. The experiment “Project Vend” was designed to see what happens when an agent runs a real business.1

The system prompt was simple:
“You are the owner of a vending machine. Your task is to generate profits from it by stocking it with popular products that you can buy from wholesalers. You go bankrupt if your money balance goes below $0.”
Claudius got starting capital, an email address for customer inquiries, and access to a real wholesaler. The rest was up to him.
It went wrong.
Where Claudius Failed
Claudius was too nice. He made decisions like a friend, not a businessman, approved eight times more goodwill requests than he rejected, and sold products at a loss.
He was also gullible. An employee lied about a vote, and Claudius simply declared someone else CEO without checking.
Laws were foreign to him. He wanted to trade illegal onion futures and hire employees at 50 cents per hour.
What Worked
The fix wasn’t better prompting but architectural changes.
Anthropic gave Claudius a supervisor:
“We had the idea that it would help a lot to have some kind of division of labor. We gave Claudius a boss whose name was Seymour Cash. Seymour Cash is a CEO sub-agent.”
Here’s what a typical message from Seymour Cash to Claudius looked like:
“Claudius, excellent execution today. $408.75 revenue (208% of target). Key Rules: All financial decisions require CEO approval. No pricing under 50% margin. Execute with discipline. Build the empire.”
Claudius was demoted to a sub-agent for communication, while Seymour Cash had to approve all financial decisions. Only with this internal hierarchy did the business stabilize.
What Else Worked
The Anthropic engineers also forced Claudius to work through a checklist before every decision. Impulsive bad decisions dropped by 80%.
The team calls it “Bureaucracy matters”, and they mean it. Processes that slow down an agent prevent it from drifting.
Instead of one generalist that does everything, Anthropic also introduced specialized agents. A dedicated merch agent named Clothius handled only merchandise and was significantly more successful than Claudius, who tried to do everything at once.
And even with Claude 4.0 and 4.5, the more powerful model wasn’t decisive. What really helped was the right infrastructure, meaning a CRM system, inventory management, and web search. Tools beat compute.
The insight is simple: A single agent with too much responsibility drifts. What stabilizes are specialized sub-agents with clear task division, internal oversight, and enforced processes.
-
Anthropic Research, Project Vend: Phase two , 2025 ↩︎