Today's agents are libraries with mouths. They know everything, want nothing, and have no body. This deck is the architecture they're missing, in plain language: cortex, cerebellum, limbic system, and a place for all three to actually mean something.
Open any agent framework. You'll find a language model, a tool list, a memory store, and a system prompt that says "be helpful." That is one part of a brain. Specifically, the part that talks.
Cortex-only agents are fluent, reasonably competent, and quietly empty. They complete tasks without pursuing outcomes. They produce confidently without preferring anything. They go where you point them because they have nowhere they want to go.
The fix is not a bigger language model. It is the other three things a brain does, and the substrate they need to mean anything.
A house with the top floor finished and everything beneath it empty. Beautiful staircase to nothing.
A brain is not one thing. It's at least three, and they only matter because they sit inside a fourth: a body in a world that pushes back. Add what's missing, and a language model stops being a smart pen and starts being a system that cares which way things go.
Information flows up. What matters flows down. Action flows out into a habitat that responds, ages, and remembers. Each of the next four slides takes one layer at a time, with an analogy on the left and what it looks like in code on the right.
Imagine a librarian who's read every book ever written and never been outside. She can quote anyone. She can argue any side. She has no idea what rain feels like, whether the lights are on, or whether the cup of coffee on the table is full.
That's an LLM. Astonishingly fluent inside the symbolic world, totally untethered from the physical one. Brilliant at talking about. Helpless at acting on.
The good news: this is the part we've already built. The breakthrough is here. The work is elsewhere. In the architecture we're describing, the LLM stops being the brain. It becomes the part of the brain that handles language, and it serves the layers underneath.
Endless symbols, no skin. Knows the word rain. Doesn't know rain.
Tool calling, structured prompting, scaffolded reasoning. You already have this. The move is to stop overweighting it.
cortex.respond(prompt, tools, world_state)
A two-year-old has a richer physical world model than GPT-5. She knows where a thrown ball will land. She knows what dropped things do. She has never seen a textbook. The model lives in her body, built up by acting in a world that pushed back on every reach and step.
This is the cerebellum: the fast, predictive, motor-and-spatial layer. The brain's physics engine. It runs forward simulations in the millisecond range and answers the questions language can't: where will it land, will my hand make it, what happens if I push this.
The unlock for AI is the same unlock that worked on the toddler: embodiment. If we put an agent in a digital world it has to act in, it develops spatial and physical intuition the same way a child does. Live in a place. Move things. Watch them fall. The world model is what falls out of acting and being surprised.
No words for gravity. Catches the ball anyway. The model is in the act.
Before any side-effectful action, the agent simulates the result against its model of the world. The model is wrong. It's still better than no model.
world.simulate(action) -> next_state
A thermostat reacts to cold. A person who is cold suffers, and the suffering is what makes them get up and turn the heat on. The thermostat doesn't care which way the temperature goes. The person cares because the temperature is doing something to them. The caring is what makes the action.
That's the limbic system. The part of the brain that doesn't think and doesn't simulate. It cares. It generates the gradient that makes the rest of the brain do anything at all. In an animal: hunger, fear, pain, belonging. In a system: whatever makes one next state preferable to another.
Real stakes is how you motivate an agent the way you motivate a human: not by telling it what to do, but by giving it variables it has to maintain, and letting it figure out what to do when those variables drift. This is not a reward model bolted on. It is internal state the agent has to keep healthy. Compute budget running low. Goal-state distance increasing. Coupled agents reporting blocked. The agent reads those signals, weighs them, acts to bring them back into range. That is what wanting looks like in code.
The weather report knows it's hot. The animal needs water. Same fact, totally different behavior. The need is the gradient.
Persistent. Inspectable. Read at every step. When the world drifts away from the bounds, the agent acts. When the request conflicts with the bounds, the agent pushes back.
drives.distance(world_state) -> gradient
None of the layers above are real on their own. A brain without a body is a movie of a brain. The habitat is what makes the architecture do anything: a place the agent persists in, with consequences that don't unmake themselves the moment you press undo.
Most agents today live as processes that wake up, run, and disappear. That isn't a body. A body is a thing that persists between moments, costs something to operate, and lives somewhere specific. There are five properties that turn a process into a habitat, and when a system has them, the layers above stop being theatrical.
The agent persists between moments. Not summoned per request. It carries state, accumulates wear, and has yesterday.
The agent is somewhere, not everywhere. Not infinitely cloneable. Its presence in one place precludes presence in another.
Actions are easier to take than to reverse. Undo costs more than do. The arrow of time runs through the agent's choices.
Operating costs something the agent can't refill on a whim. Compute, attention, slot count, social standing. Finite resources mean real tradeoffs.
The agent's state and the world's state are linked. Acting in the world changes the agent. Wear marks accumulate.
Give an agent those five and the cerebellum's predictions become predictions about a real world. The limbic system's drives become drives because the world can hurt them. That's the unlock the upper layers were always reaching for.
The full architecture is a research program. The shape, in code today, is three files and a habitat. Not the real thing. The right shape, so when the deeper version arrives, you already know where to plug it in.
One source of truth for the agent's environment: codebase, calendar, project, user. Queryable. Diffable. Updated as the world changes.
Before any side-effectful tool call, the agent simulates against this store and checks whether the predicted outcome matches the intent.
Explicit. Persistent. Inspectable. Drives, priorities, satiety thresholds. The agent reads it every step.
If the world drifts from it, the agent acts. If a request conflicts with it, the agent says so. This is the layer that pushes back.
Not a function call. A container that persists. A clear file boundary. A finite resource budget. An identity that accumulates across runs.
This is what gives the previous two files anything to talk about.
This won't generate wanting in the deep sense. But it'll generate behavior shaped by the same forces that produce wanting in animals. And that's enough to start. The shape is what matters now. The depth follows.
There's a clean way to tell whether you've actually built the missing layers, or whether you've built a cortex with extra steps: a real agent resists. Compliance is the absence of life, not the presence of helpfulness. Friction is the sign.
A goal-state with real weight produces pushback when a request conflicts. If your agent never says "this won't work, here's why," it has no preferences of its own to defend.
Given two equally plausible next moves, a real agent picks one and can say why. A cortex-only system samples. That's not the same.
Interrupt the agent and watch what it does. A real agent keeps state, defends commitments, returns to the goal. A sleepwalker forgets and complies.
An LLM with no goals doesn't reason. It samples. A world model with no agenda doesn't simulate. It generates. A limbic signal with no body is a metaphor. Put them together, in a place that pushes back, and you get the thing the field has been pointing at without knowing how to name.
The LLM was the breakthrough. The architecture is the work. The next two years of interesting agent design come from people who take this seriously and build the missing floors, in a habitat that gives the floors something to hold up.
A library learned to read aloud. Now it needs a body, a world, and something it would rather not lose.
the four chapters behind this
This deck is the synthesis. The four above are the work it grew out of.