0.4 The Search-First Discipline

Read this chapter before you write another line of agent code. It is the most important pattern in this book. It is also the one most teams skip, then regret.

The anti-pattern: "blind appends"

A new agent wakes up. It thinks: "I should record what I just learned so I remember it next time." It appends a Decision thought with a verbose explanation. Done.

The agent repeats this dozens of times per session. The chain grows. Retrieval quality degrades. The agent starts contradicting itself because no one checked whether a Constraint already existed. The user asks "didn't we decide this last week?" and the agent answers "no, I have no record of that" — even though there are three near-duplicate Decision thoughts scattered across the chain.

The fix is one extra step before every append: search for what's already there. If the prior memory is good enough, don't write a duplicate. If it's close, write a Correction or Supersedes relation. If it's a new thought, write it — and link to whatever it's related to.

The three-question rule

Before you append a thought, answer these three questions:

1. Does this already exist? Run a search. If a thought with similar content exists and is still valid, do not append a duplicate. Reference it instead.
2. Does this contradict or supersede something? If yes, append with a Supersedes or Corrects relation. Do not silently overwrite.
3. Is this connected to other thoughts? If it's derived from a prior decision, add a DerivedFrom ref. If an earlier thought or action caused it, add CausedBy. A thought without any backlinks is a dead end in the graph.

The search-before-write loop

Every agent that writes to memory should follow this loop on every turn:

async fn maybe_record(
    chain: &MentisDb,
    agent_id: &str,
    candidate: ThoughtInput,
) -> Result<()> {
    // 1. Search for existing similar memories
    let existing = chain.query_ranked(
        &RankedSearchQuery::new()
            .with_text(&candidate.content_summary())
            .with_limit(5)
            .with_min_score(0.6)
    )?;

    // 2. Decide what to do
    if existing.hits.is_empty() {
        // Truly new — just append
        chain.append_thought(agent_id, candidate)?;
        return Ok(());
    }

    let top = &existing.hits[0];
    if top.score > 0.9 {
        // Near-duplicate — log a LessonLearned instead
        chain.append_thought(agent_id,
            ThoughtInput::new(ThoughtType::LessonLearned,
                format!("I was about to record '{}' but already have \
                         a similar memory: '{}'. Will reference it \
                         instead of duplicating.",
                        candidate.content, top.thought.content)
            ).with_refs(vec![top.thought.index as u32])
             .with_role(ThoughtRole::Retrospective)
        )?;
        return Ok(());
    }

    if is_correction(&candidate, top) {
        // Newer version of an old thought: supersede it. There is no
        // add_relation call, so the relation must be attached to the
        // ThoughtInput before it is appended.
        // Assumption: the relation-kind enum is named ThoughtRelationKind and
        // the target UUID is exposed as `top.thought.id`; check both against
        // your MentisDB version.
        let new_thought = candidate
            .with_refs(vec![top.thought.index as u32])
            .with_relations(vec![ThoughtRelation::new(
                ThoughtRelationKind::Supersedes,
                top.thought.id,
            )]);
        chain.append_thought(agent_id, new_thought)?;
        return Ok(());
    }

    // Genuinely new, but related to existing memory
    let mut new_thought = candidate;
    new_thought = new_thought
        .with_concepts(top.thought.concepts.clone())
        .with_refs(vec![top.thought.index as u32]);
    chain.append_thought(agent_id, new_thought)?;
    Ok(())
}
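
The branching in maybe_record can also be pulled out into a pure function, which makes the thresholds testable without a live chain. The following is a minimal sketch, not part of the MentisDB API: `WriteAction`, `TopHit`, `decide`, and `DUPLICATE_SCORE` are hypothetical names introduced for illustration. Only the 0.9 cutoff and the order of the branches are taken from the loop above.

```rust
/// What the agent should do with a candidate thought.
/// Hypothetical type for illustration; not a MentisDB type.
#[derive(Debug, PartialEq)]
enum WriteAction {
    /// No similar memory was found: append as-is.
    AppendNew,
    /// A near-duplicate exists: reference it instead of duplicating.
    ReferenceExisting { target: u64 },
    /// The candidate replaces an older thought: append with a Supersedes relation.
    Supersede { target: u64 },
    /// New but related: append with a backlink to the closest thought.
    AppendLinked { target: u64 },
}

/// Hypothetical summary of the best search hit (chain index plus score).
struct TopHit {
    index: u64,
    score: f32,
}

/// Scores above this are treated as near-duplicates (value from the loop above).
const DUPLICATE_SCORE: f32 = 0.9;

/// Maps the search result onto an action. `top` is None when the search
/// returned no hits at or above the query's minimum score.
fn decide(top: Option<&TopHit>, is_correction: bool) -> WriteAction {
    match top {
        None => WriteAction::AppendNew,
        Some(hit) if hit.score > DUPLICATE_SCORE => {
            WriteAction::ReferenceExisting { target: hit.index }
        }
        Some(hit) if is_correction => WriteAction::Supersede { target: hit.index },
        Some(hit) => WriteAction::AppendLinked { target: hit.index },
    }
}
```

Keeping the decision separate from the chain calls means the duplicate check always wins over the correction check, as in the loop, and that ordering can be asserted in a unit test.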

Why this matters

A memory system that grows without bound becomes a junk drawer. Search quality is a function of three things: signal (are the right thoughts findable?), signal-to-noise (are wrong thoughts filtered out?), and graph structure (can the agent follow chains of reasoning?). The search-first discipline maintains all three.

Empirically, agents that follow this discipline:

Produce chains that are 40-60% smaller for the same retrieval quality
Self-correct earlier in multi-step tasks
Have higher user trust because contradictions are visible and resolved
Recover faster after context window resets because the chain is denser, not just longer

Common pitfalls

"I'll just dump everything into the LLM's context"

This works for the first 20 thoughts. By 200, retrieval will dominate latency, context bloat will dilute the prompt, and you'll start seeing the agent ignore relevant memories because they're lost in the noise. The whole point of MentisDB is to make search good enough that you don't need to dump everything.

"I'll let an LLM decide what to write"

LLMs are great at extraction, terrible at deduplication. If you let the LLM append raw extractions, you'll get 50 Insight thoughts that say the same thing in slightly different words. Always pair extraction with the search-first loop.
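
As a rough illustration of pairing extraction with a duplicate check, the sketch below filters a batch of extracted strings against existing memory before anything is appended. It is an assumption-heavy stand-in: word-set Jaccard similarity replaces MentisDB's ranked search, and `tokens`, `jaccard`, and `filter_extractions` are hypothetical helpers, not library functions.

```rust
use std::collections::HashSet;

/// Lowercased word set used as a crude similarity fingerprint.
fn tokens(text: &str) -> HashSet<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect()
}

/// Jaccard similarity of the two word sets: |A ∩ B| / |A ∪ B|, in [0, 1].
fn jaccard(a: &str, b: &str) -> f32 {
    let (ta, tb) = (tokens(a), tokens(b));
    let union = ta.union(&tb).count();
    if union == 0 {
        return 0.0;
    }
    ta.intersection(&tb).count() as f32 / union as f32
}

/// Keeps only extractions that are not near-duplicates of existing memory
/// or of an extraction already accepted earlier in the same batch.
fn filter_extractions(
    existing: &[String],
    extracted: Vec<String>,
    threshold: f32,
) -> Vec<String> {
    let mut accepted: Vec<String> = Vec::new();
    for item in extracted {
        let is_duplicate = existing
            .iter()
            .chain(accepted.iter())
            .any(|prior| jaccard(prior, &item) >= threshold);
        if !is_duplicate {
            accepted.push(item);
        }
    }
    accepted
}
```

Checking each item against the batch as well as the chain matters here, because a single extraction pass can restate the same insight several times.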

"I'll write a cron job that periodically consolidates"

Consolidation is valuable, but it's a complement, not a substitute. By the time consolidation runs, you've already polluted the chain with duplicates, contradictions, and orphaned thoughts. The agent has been giving wrong answers for hours. Search-first prevents the pollution; consolidation cleans up the remainder.
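
For comparison, a consolidation pass might look something like the sketch below. It is deliberately simplified and makes several assumptions: thoughts are given as hypothetical `(index, content)` pairs in chain order, duplicates are detected by exact match after whitespace and case normalization rather than by ranked search, and `normalize` and `find_duplicates` are illustrative names. It only reports pairs; what to do with them (for example, appending Supersedes relations) is left to the caller.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Collapses whitespace and lowercases, so trivially different copies compare equal.
fn normalize(content: &str) -> String {
    content
        .split_whitespace()
        .map(|w| w.to_lowercase())
        .collect::<Vec<_>>()
        .join(" ")
}

/// Returns (duplicate_index, canonical_index) pairs, where the canonical
/// thought is the earliest one seen with the same normalized content.
fn find_duplicates(thoughts: &[(u64, &str)]) -> Vec<(u64, u64)> {
    let mut first_seen: HashMap<String, u64> = HashMap::new();
    let mut pairs = Vec::new();
    for &(index, content) in thoughts {
        match first_seen.entry(normalize(content)) {
            // Seen before: record this thought as a duplicate of the first copy.
            Entry::Occupied(entry) => pairs.push((index, *entry.get())),
            // First occurrence: remember it as the canonical thought.
            Entry::Vacant(entry) => {
                entry.insert(index);
            }
        }
    }
    pairs
}
```

Every pair this pass reports is a write that a search before the append could have avoided.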

Integration with agents

The skill file (MENTISDB_SKILL.md) distributed with MentisDB bakes this discipline into the agent's operating instructions. When an MCP client loads the skill, it sees:

## Search-First Discipline

Before every append, run this routine:

1. `mentisdb_recent_context` — what did I just do?
2. `mentisdb_ranked_search` with the candidate content as text — is this already there?
3. If yes: don't append. Reference the existing thought.
4. If similar but different: append with a relation (`Supersedes`, `Corrects`, `DerivedFrom`).
5. If new: append with at least one `refs` backlink to the most-related prior thought.

**Anti-pattern: Blind appends.** Writing a thought without searching first
guarantees duplication, contradiction, and retrieval rot.

This makes the discipline automatic for any agent that follows its skill file — which is most of them, if they're properly primed.

What's next

If you are using OpenCode, Codex, Claude Code, Cursor, or another MCP harness, continue to the Operator Playbook. It turns this search-first discipline into daily prompts, timing rules, and checkpoint habits for regular coding-agent sessions.