0.5 What We Learned Writing This Book
/tmp/cookbook-memory.md stand-in file. That story is
historical; the cookbook no longer works that way. The maintenance loop
described below is the one the cookbook actually uses today.
The setup
The cookbook is maintained like any long-running MentisDB-backed project. At the start of a session, the agent opens the project chain, loads the MentisDB skill, reads recent context, and searches for cookbook-related decisions, constraints, mistakes, and prior verification results. Only then does it edit files.
That sounds mundane, but it changes the work. The agent does not need the user to paste old context back into the prompt. It can find prior lessons like "do not give users API-shaped memory prompts," "cookbook examples must be verified," and "public docs should teach the operator loop before the Rust API." The session starts with the project's memory, not a blank slate.
The prompt that keeps the book honest
When you ask an agent to update the cookbook, prompt it like this:
Use MentisDB while editing the cookbook.
Before changing docs, search for prior cookbook decisions, style constraints, known mistakes, and verification notes. Tell me what matters.
While editing, do not teach users to speak in API payloads unless the chapter is explicitly for API developers. For operator-facing sections, write natural prompts a human would actually say.
Before you stop, save a checkpoint with what changed, what you verified, remaining risks, and the exact next step.
Notice what is not in that prompt: no fake JSON, no hand-written tag arrays, no pretending the user has to know thought internals. The user asks for the behavior. The agent chooses the right MentisDB calls.
Lesson 1: The chain carries editorial decisions
Documentation has product decisions embedded in it. For this cookbook, one of the important decisions is that operator-facing chapters should teach people how to ask for memory behavior in normal language. Users should be able to say, "remember the lesson from this bug," not assemble a typed append request by hand.
Save decisions like that in MentisDB. Future agents can search for "operator prompt style" or "API-shaped prompts" and avoid reintroducing the same mistake in another chapter.
Lesson 2: Search before editing beats archaeology after editing
The expensive failure mode in docs is not writing bad prose once. It is spreading the same wrong assumption across ten pages. Before changing a cookbook chapter, search for the chapter name, the subsystem, and related mistakes. If the current work touches examples, search for API drift and previous compile failures. If it touches operator guidance, search for user feedback about tone and usability.
This is where MentisDB earns its keep: the useful context is not always in nearby files. It may be in a previous release note, a failed verification run, a user complaint, or a checkpoint from another agent.
Lesson 3: Save human lessons, not transcript sludge
The cookbook does not need a memory for every file read or every command attempted. It needs durable lessons that change future behavior. Good cookbook memories sound like project guidance:
- Operator-facing docs should use natural prompts, not raw MCP payloads.
- Examples that claim to compile need cookbook-as-test coverage or an explicit skip marker.
- When a chapter mentions a generated artifact, verify the generator or the target file before claiming it exists.
- Checkpoint before handing off so the next agent knows what changed and what still worries you.
The agent can store those as typed MentisDB thoughts behind the scenes. The user should not have to care whether the exact type is a decision, lesson, correction, or checkpoint unless they are building directly against the API.
Lesson 4: Cookbook examples are executable claims
Every Rust code block that teaches the public API is a claim about what the library does. MentisDB memory can remember that examples drifted before, but memory is not a substitute for verification. The docs pipeline has to compile examples continuously.
The implementation lives in scripts/extract_cookbook_examples.py
and the resulting tests/cookbook/<chapter>.rs files. The
extractor reads every <pre> block in
docs/cookbook/, skips blocks marked
data-cookbook-test="off", and emits one
chapter_compiles tripwire per chapter that uses Rust. If the
chapter references a method, type, or import that no longer exists in the
public API, the file fails to compile and the test fails.
The first run caught 174 errors
On the first run after this pipeline shipped, the test target failed to compile with 174 errors across the cookbook chapters. The categories, in order of frequency:
- Stale API references: add_relation() (doesn't exist), with_enable_reranking(true) (renamed to with_reranking(k)), register_managed_vector_sidecar (renamed to manage_vector_sidecar), GraphExpansion::new() (renamed to RankedSearchGraph::new()).
- Invented helper types: EpisodeRecorder, HandoffEnvelope, TagWeightedEmbeddingProvider, test_chain(), is_correction(). Sub-agents wrote structurally-sound code referencing helpers that do not exist in the library.
- Type mismatches: ? on RankedSearchResult (it's not a Result), with_refs taking u32 (it wants u64), RankedSearchScore not implementing Display (use Debug).
- Decoder bugs: decode_block was stripping Result<()> as if it were an HTML tag, because the unescape happened before the tag-strip regex.
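The decoder bug in the last item is an ordering problem, and it is easy to reproduce. The sketch below is a reconstruction under assumptions: the function bodies and the tag regex are hypothetical, and only the name decode_block and the described behavior come from the pipeline.

```python
import html
import re

# Naive tag matcher, assumed for illustration.
TAG_RE = re.compile(r"<[^>]+>")


def decode_block_buggy(raw):
    # Unescaping first turns "&lt;()&gt;" into "<()>",
    # which the tag regex then removes as if it were markup.
    return TAG_RE.sub("", html.unescape(raw))


def decode_block_fixed(raw):
    # Strip real markup while entities are still escaped, then unescape.
    return html.unescape(TAG_RE.sub("", raw))


raw = "<code>fn run() -&gt; Result&lt;()&gt; { Ok(()) }</code>"
print(decode_block_buggy(raw))  # fn run() -> Result { Ok(()) }
print(decode_block_fixed(raw))  # fn run() -> Result<()> { Ok(()) }
```

The buggy order silently drops the generic parameter, so the extracted Rust no longer matches what the chapter shows.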
That was the wave 1 fallout. The cookbook has since been rewritten across several passes, and the current state of the pipeline is the one you can reproduce today:
# Regenerate the cookbook tests and run them locally.
python3 scripts/extract_cookbook_examples.py docs/cookbook/
cargo test --test cookbook_tests --all-features
As of the June 2026 pass, the extractor emits 12 compile-tested
chapters with 17 Rust examples and cargo test
--test cookbook_tests passes 12/12. Two chapters are deliberately
language-agnostic and have no generated test file: the new
0.3 Quickstart and
1.0 Operator Playbook.
They are correct, complete, and intentionally not testable as Rust. The
remaining chapters either compile (Part 1 patterns, Part 3 hardening,
Part 5 recipes) or are marked data-cookbook-test="off" when
they show shell, JSON, or operator prose.
Lesson 5: Three sub-agent failure modes the chain has actually recorded
The cookbook's value is partly that it has been built and rebuilt the hard way. The chain has memory checkpoints for each pass. Three failure modes show up over and over, and each one teaches the same thing in a different shape: sub-agents do not fail loudly; they fail quietly with output that looks plausible.
- Stub chapters. A sub-agent dispatched to "write chapter 1.4" returned a file containing only <html>Hello</html>. Another returned <!DOCTYPE html> with no body. The TypeScript-like-looking files were named without the project's cookbook-N-M-…html convention. The chain checkpoint for that recovery documents the conventional renames and the conventional filenames the next sub-agent must use. Save naming and structure conventions as memories, not as a one-time prompt.
- Invented API surface. Sub-agents that did write real chapters often wrote real-looking Rust that referenced methods, types, and helpers that do not exist in the public crate. The cookbook-as-test pipeline catches that, but only if the chapter is actually marked for the extractor. A chapter with data-cookbook-test="off" on every block compiles nothing and proves nothing. Mark illustrative blocks off, never whole chapters that could be tested.
- Right work, wrong audience. A sub-agent can be right about the API and still be wrong about the user. The original 1.0 Operator Playbook taught people to hand-write "Append a LessonLearned memory: content/tags/concepts/refs" payloads. That is correct Rust, but it is not how a human using a chat coding agent actually talks. The same sub-agent skill that produces good Rust produces bad human-facing prose unless the audience is in the prompt and the chain. Search first means searching for prior audience decisions too, not only for prior technical decisions.
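The second failure mode, a chapter whose every block is opted out, can be flagged mechanically. The following is a minimal sketch, not part of the existing pipeline: the function name all_blocks_off and both regular expressions are assumptions, and a flagged chapter may still be legitimate if it contains only shell, JSON, or operator prose.

```python
import re

# Matches an opening <pre ...> tag and captures its attribute text.
PRE_OPEN_RE = re.compile(r"<pre\b([^>]*)>", re.IGNORECASE)
OFF_RE = re.compile(r'data-cookbook-test\s*=\s*"off"')


def all_blocks_off(html_text):
    """True when a chapter has <pre> blocks and every one is opted out.

    Hypothetical review aid: a True result means "look at this chapter",
    not "this chapter is wrong".
    """
    attrs = PRE_OPEN_RE.findall(html_text)
    return bool(attrs) and all(OFF_RE.search(a) for a in attrs)
```

Running this over each file in docs/cookbook/ and reviewing the flagged chapters by hand keeps the opt-out marker from quietly hiding testable Rust.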
Lesson 6: The verification ritual before you stop
Every meaningful cookbook edit ends with the same five steps: four commands and a checkpoint. If any one of them fails, the edit is not finished. Future agents search the chain for the phrase "verification ritual" and find this list.
git diff --check -- docs/cookbook/
python3 scripts/extract_cookbook_examples.py docs/cookbook/
cargo test --test cookbook_tests --all-features
# Internal link sanity:
python3 -c "import re,sys,pathlib,urllib.parse as u; \
bad=[]; \
[bad.append((p.name, m)) for p in pathlib.Path('docs/cookbook').glob('*.html') \
for m in re.findall(r'href=\"([^\"#]+)\"', p.read_text()) \
if m.startswith('cookbook-') and not (pathlib.Path('docs/cookbook')/u.unquote(m)).exists()]; \
sys.exit(1) if bad else print('links ok')"
# Save a MentisDB checkpoint in plain language:
# "Save a Summary checkpoint on the cookbook chain summarizing what
# changed, what passed, what failed, and the exact next step."
The fifth step is the one most agents skip. Saving a MentisDB
Checkpoint with what changed, what passed, what failed, and
the exact next step is what makes tomorrow's session start smarter than
today's. Without it, the next session re-derives the same context from
scratch, or worse, does not know that this work was already done. Use the
same natural-language prompts the operator chapter teaches, not a raw
mentisdb_append invocation.
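The internal link check in the ritual is compressed into a one-liner, which makes it hard to extend. A more readable equivalent might look like the sketch below. It assumes the same rules as the one-liner (only href values starting with cookbook-, links containing a fragment are skipped, targets resolved relative to the cookbook directory); the function name broken_cookbook_links is hypothetical.

```python
import pathlib
import re
import urllib.parse

# Same pattern as the one-liner: href values without a quote or '#'.
HREF_RE = re.compile(r'href="([^"#]+)"')


def broken_cookbook_links(root):
    """Return (page, href) pairs for cookbook-* links whose target is missing."""
    root = pathlib.Path(root)
    bad = []
    for page in sorted(root.glob("*.html")):
        for href in HREF_RE.findall(page.read_text(encoding="utf-8")):
            if not href.startswith("cookbook-"):
                continue  # external or non-cookbook link
            if not (root / urllib.parse.unquote(href)).exists():
                bad.append((page.name, href))
    return bad
```

Calling broken_cookbook_links("docs/cookbook") and exiting non-zero when the list is non-empty reproduces the original check, with the added benefit that the failing page and link can be printed.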
Lesson 7: Live, dated, narrow wins beat ancient sweeping claims
Earlier versions of this chapter listed "seven lessons" from the wave 1 markdown-file stand-in. Those lessons were real but they drifted. The chapter is at its best when each numbered lesson is dated, narrow, and anchored to a checkpoint on the chain that a future agent can actually retrieve. "Use a chain" is true and useless. "On 2026-06-10, the operator playbook was rewritten to teach natural language prompts; see checkpoint #587." is true and useful.
When you add a lesson to this chapter, write it the way you would want to find it six months from now: a concrete date, a specific chapter or commit, and a checkpoint index. Otherwise, prune it.
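One way to make that rule checkable is a small lint over lesson text. The sketch below is an assumption-laden illustration: it only looks for an ISO-style date and a "checkpoint #N" reference, it does not verify the chapter or commit, and the name is_anchored is hypothetical.

```python
import re

DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
CHECKPOINT_RE = re.compile(r"checkpoint #\d+", re.IGNORECASE)


def is_anchored(lesson):
    """True when the lesson text carries both a date and a checkpoint index."""
    return bool(DATE_RE.search(lesson)) and bool(CHECKPOINT_RE.search(lesson))
```

Applied to the two examples in this lesson, "Use a chain" fails the check and the dated operator playbook sentence passes it.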
What's next
Part 1 continues with 1.1 Episodic Task Memory, which is the same operator loop (search first, work, save the durable parts, checkpoint) applied to a single agent's task memory. If you skipped to this chapter to read the meta-lesson first, go back to 0.1 and read in order.