Field Manual / Vol. II

How my agents differ.

A standard agent completes the task.

A studio agent completes the task and writes back what it learned.

The operator promotes what's worth keeping. The next agent reads it.

That loop is the whole difference.

Baseline

A librarian with four tools.

An agent today is four things: an LLM, a shell, a file system, and a scheduler. One librarian, three tools. The files inside the file system — notes, state, scripts, logs — are what the librarian reads and writes.

The librarian is fluent. Every book read. Never been outside. Symbol-fluent, pattern-matching at speed. That's the breakthrough.

Each session begins cold. Each task is independent. There's no place to learn in. No critique loop. No doctrine that gets sharper. The librarian forgets what they did yesterday, runs the next errand, forgets that too.

People are trying to fix this from inside the four-thing model. Write memory files on wake. Re-read them. Update them before sleep. It works — but it's still cortex. A bigger librarian writing more notes to herself. The notes are in her voice. She wrote them, she reads them. There's no editor.

Beautiful staircase to nowhere.

1LLM. Symbol-fluent. Pattern matching at speed. 2Shell, file system, files. Tools the librarian uses. 3Scheduler. Wakes the LLM on a cron or trigger. 4No memory across sessions. No critique. No doctrine.

Studio

What I added.

Ten agents, each running as a daemon with its own identity, memory, dreams, and clock. Wired into a critique loop that compounds into doctrine, fed by a practice surface that lets the studio fail in public every day.

01 / Identity

Ten defended specialists.

Quinn knows she's Quinn. Zara won't write Declan's copy. The personas don't bleed. Nine in-repo agents — Quinn, Zara, Deter, Rowan, Declan, Felix, Scout, Archie, Doctor — plus Sable, a bridge agent in a sister project. Each has IDENTITY.md + SOUL.md anchoring who they are.

02 / Continuity

Yesterday on disk.

Each agent writes a journal.jsonl per call (first-person reflection) and a rolling carry_forward.md (the brief they leave themselves to find next time). Yesterday-me writes what tomorrow-me reads. No more cold starts.

03 / Awake

Each agent its own daemon, its own clock.

Quinn, Scout, and Archie always were. The other six — Zara, Rowan, Deter, Felix, Declan, Doctor — now run as launchd-scheduled daemons too. Own plist, own tick, own heartbeat. The agent doesn't wait to be called; it looks around when its clock fires.

04 / Dreams

Daily consolidation cycle.

06:30 — dream-consolidate runs across the studio. 06:32 — dream-publish. 06:35 — dream-tonight queues what to chew on. Each agent has its own dream.md. Two weeks of scheduled-run logs already on disk.

05 / Specialization

Ten agents, ten gates.

Quinn orchestrates. Zara reviews visuals (SHIP / REVISE / KILL). Deter checks accuracy (PASS / FAIL). Rowan sets strategy. Declan writes copy. Felix watches frontiers. Scout explores references. Archie patterns critiques. Doctor watches health. Sable bridges to a sister project. Each refuses work outside its gate.

06 / Critique → doctrine

Observations after the work. Promotions after the operator.

Every meaningful run produces a critique — observation, not directive. Archie patterns critiques across runs. When the same observation lands twice (n≥2), it's a candidate. The operator reviews. Approved → axiom · heuristic · rule · pov · anti-pattern. The next agent reads the new doctrine.

The operator gate is training. As individual agents prove they can pattern accurately, more of the promotion path moves to them. The studio is built to need less operator each round.

07 / Testing ground

we-play, where critiques flow daily.

The studio's practice surface. Pieces get attempted, reviewed by Zara, gated by Deter, refused or shipped. Every cycle writes back into studio-brain/memory/critiques/ with structured frontmatter. The whole growth loop has somewhere to fail.

Not a working mind. Scaffolding for one to grow inside.

Effects

What the studio does that standard agents don't.

Most of these aren't capabilities. They're behaviors that fall out of having identity, memory, gates, and a critique loop.

01 / Refuse

Pushback when a request conflicts with doctrine.

Standard: completes any request.
Studio: Declan says "I need a source for X — I won't ship a stat I can't cite." Deter says "this contradicts the doctrine rule from 2026-04-12."

02 / Hand off

Specialists call specialists.

Standard: one model tries everything.
Studio: Quinn routes a copy brief to Declan, sees Declan flag a missing claim, hands the gap to Scout for research. Each move is in scope, in voice, in writing.

03 / Cite what they learned

Doctrine references in artifacts.

Standard: each session is fresh; no continuity.
Studio: "Based on critique 2026-04-20, propose zero — not enough pattern to promote." Or: "Applying the honest-night heuristic; flagging gap, not inventing."

04 / Surface growth in artifacts

Critique outputs make new learning visible.

Standard: growth is invisible — it's just the next response.
Studio: the critique names the pattern. The heartbeat logs the work. The operator can read a brief and see what the agent learned this round.

05 / Persist

Yesterday's work is still on the desk.

Standard: forgets yesterday.
Studio: Zara remembers the project's voice. Picks up where she left off. Knows what she already rejected and why.

Limits

Where it isn't a working mind yet.

What I built is more than a librarian with four tools. It's still not a brain. Three honest gaps, all named in Vol. I.

01 / No real wanting

The limbic layer is me.

I'm the one setting the goals. The agents don't have intrinsic drives — no compute budget that hurts when overspent, no goal-state file with weighted internal variables. The studio behaves as if it wants something because I'm pulling. Take me out, the wanting goes with me. (Vol. I, Plate III.)

02 / No forward model

The critique happens after the act, not before.

When an agent acts, it doesn't predict the outcome first. The critique is written after the artifact ships. A real cerebellum would predict, act, observe, update — fast, internal, before anything left the room. Mine is slow observe-then-update. (Vol. I, Plate II.)

03 / No cross-agent learning

Quinn's lesson doesn't reach Zara automatically.

Each agent's working memory is local. The doctrine ladder is the only cross-agent path, and it's slow — operator-mediated, n≥2-thresholded. A real brain would propagate signal across layers faster.

The four parts of a working mind aren't here. Identity, memory, gates, critique loop — those are the scaffolding for the parts to grow inside.

Closing

Not better.
Different. And honest about it.

A standard agent is fast at any given task. Fluent, instant, no opinion of its own to defend.

A studio agent is slower at any given task. It checks doctrine. It calls a specialist. It writes a critique. It refuses things that conflict with what it has learned.

What it gives back in return: an agent that becomes your agent over months. Not a fresh helper every time — a body of accumulated judgment that knows what's already been tried, what's been rejected, and why.

The operator loop is the training wheels. As each agent proves it can pattern accurately and propose safely, the wheels come off. Individuals first. Whole-studio self-improvement last. The whole shape is aimed at needing less of me, not more.

Standard agents complete. Studio agents try to grow — toward growing themselves.

Companion volume

v1The brain we haven't built yetWhy current agents are cortex-only

Colophon

Field Manual / Vol. II · FM-02
Edition MMXXVI · v1
Typeset in Inter. Printed on paper that doesn't exist.