1.1 Episodic Task Memory

The problem

Your agent just finished a 12-step task: it refactored the authentication module, ran the test suite, fixed three failing tests, updated the docs, and opened a PR. Tomorrow, the user asks "what did you do yesterday?" and the agent has no idea. Worse, next month, when the user asks for another refactor of the same module, the agent doesn't remember that the previous attempt took 12 steps and hit three test failures.

The fix is episodic task memory: every time the agent starts a multi-step task, it records the plan. After every step, it records what it did and why. At the end, it records a summary with what worked, what failed, and what to remember next time.

Why it's hard

Episodes are long by definition: a 12-step task produces at least 12 thoughts.
Episodes are nested: a step might itself be a multi-step sub-task.
Episodes are noisy: most steps are routine. Recording everything pollutes the chain.
Episodes are time-sensitive: a "what did I do yesterday" query needs temporal filtering.
Episodes need provenance: which session, which user, which task ID?

The pattern

Use four thought types in a structured way, plus a Mistake thought to anchor each lesson:

Plan — at task start, the high-level plan
Subgoal — one per sub-step, with a CausedBy relation to the plan
TaskComplete — at task end, the final result with a summary
LessonLearned — only when something surprising failed; cross-link to the Mistake or Subgoal that produced it

Skip Insight and Decision unless the step produced a durable fact. Most steps don't.

Implementation

1. Define an episode helper

use std::io;

use mentisdb::{
    MentisDb, ThoughtInput, ThoughtQuery, ThoughtType, ThoughtRole,
    ThoughtRelation, ThoughtRelationKind, RankedSearchQuery, RankedSearchGraph,
};
use uuid::Uuid;

pub struct EpisodeRecorder<'a> {
    chain: &'a mut MentisDb,
    agent_id: String,
    task_id: String,
    plan_uuid: Uuid,
    plan_index: u64,
    step_count: u32,
}

#[derive(Debug, Clone, Copy)]
pub struct AppendedThought {
    pub thought_id: Uuid,
    pub index: u64,
}

impl<'a> EpisodeRecorder<'a> {
    pub fn start(
        chain: &'a mut MentisDb,
        agent_id: impl Into<String>,
        task_id: impl Into<String>,
        plan_summary: &str,
    ) -> io::Result<Self> {
        let agent_id = agent_id.into();
        let task_id = task_id.into();
        let appended = append_with_id(chain, &agent_id,
            ThoughtInput::new(ThoughtType::Plan, plan_summary)
                .with_concepts(["task-plan"])
                .with_importance(0.8)
                .with_tags(["episode", &task_id])
                .with_role(ThoughtRole::Memory)
        )?;
        Ok(Self {
            chain, agent_id, task_id,
            plan_uuid: appended.thought_id,
            plan_index: appended.index,
            step_count: 0,
        })
    }

    pub fn record_step(
        &mut self,
        action: &str,
        outcome: &str,
    ) -> io::Result<AppendedThought> {
        self.step_count += 1;
        let plan_uuid = self.plan_uuid;
        let plan_index = self.plan_index;
        append_with_id(&mut *self.chain, &self.agent_id,
            ThoughtInput::new(ThoughtType::Subgoal,
                format!("Step {}: {}\nOutcome: {}",
                        self.step_count, action, outcome)
            )
            .with_concepts(["task-step"])
            .with_importance(0.4)
            .with_tags(["episode", &self.task_id])
            .with_refs(vec![plan_index])
            .with_relations(vec![ThoughtRelation::new(
                ThoughtRelationKind::CausedBy, plan_uuid
            )])
        )
    }

    pub fn record_mistake(
        &mut self,
        step: AppendedThought,
        mistake: &str,
        lesson: &str,
    ) -> io::Result<()> {
        let mistake_appended = append_with_id(&mut *self.chain, &self.agent_id,
            ThoughtInput::new(ThoughtType::Mistake, mistake)
                .with_concepts(["task-error"])
                .with_importance(0.7)
                .with_tags(["episode", &self.task_id])
                .with_refs(vec![step.index])
        )?;
        append(&mut *self.chain, &self.agent_id,
            ThoughtInput::new(ThoughtType::LessonLearned, lesson)
                .with_role(ThoughtRole::Retrospective)
                .with_importance(0.8)
                .with_tags(["episode", &self.task_id])
                .with_refs(vec![mistake_appended.index, step.index])
                .with_relations(vec![ThoughtRelation::new(
                    ThoughtRelationKind::Corrects, step.thought_id
                )])
        )?;
        Ok(())
    }

    pub fn finish(
        mut self,
        result: &str,
    ) -> io::Result<AppendedThought> {
        let plan_uuid = self.plan_uuid;
        let plan_index = self.plan_index;
        append_with_id(&mut *self.chain, &self.agent_id,
            ThoughtInput::new(ThoughtType::TaskComplete,
                format!("Task {} complete after {} steps.\n\n\
                         Result: {}",
                        self.task_id, self.step_count, result)
            )
            .with_concepts(["task-summary"])
            .with_importance(0.6)
            .with_tags(["episode", &self.task_id])
            .with_refs(vec![plan_index])
            .with_relations(vec![ThoughtRelation::new(
                ThoughtRelationKind::Summarizes, plan_uuid
            )])
        )
    }
}

// Helpers: bridge the &Thought-returning `append_thought` with
// the UUID-addressed `ThoughtRelation` API. `append_with_id` is a
// small wrapper you'll typically add to your own crate once.
fn append(
    chain: &mut MentisDb,
    agent_id: &str,
    input: ThoughtInput,
) -> io::Result<u64> {
    // append_thought returns &Thought, not an index, so copy the
    // index out of the returned reference. Indexes are assigned
    // monotonically at append time.
    let thought = chain.append_thought(agent_id, input)?;
    Ok(thought.index)
}

fn append_with_id(
    chain: &mut MentisDb,
    agent_id: &str,
    input: ThoughtInput,
) -> io::Result<AppendedThought> {
    // `append_thought` hands back a reference to the stored thought,
    // so the UUID and index can be copied out directly.
    let thought = chain.append_thought(agent_id, input)?;
    Ok(AppendedThought { thought_id: thought.id, index: thought.index })
}

2. Use it in your agent

use std::io;

use mentisdb::{BinaryStorageAdapter, MentisDb};
use mentisdb::search::LocalTextEmbeddingProvider;

fn main() -> io::Result<()> {
    // Setup
    let dir = tempfile::tempdir()?;
    let adapter = BinaryStorageAdapter::for_chain_key(
        dir.path(), "episodes"
    );
    let mut chain = MentisDb::open_with_storage(Box::new(adapter))?;

    chain.upsert_agent(
        "executor",
        Some("Code Executor"),
        Some("engineering"),
        Some("Runs multi-step coding tasks"),
        None,
    )?;

    // Optional: enable semantic search.
    // `manage_vector_sidecar` registers a provider, loads any existing
    // sidecar for its metadata, and rebuilds it if stale.
    // (Returns Result<_, VectorSearchError<_>>, not io::Error.)
    chain.manage_vector_sidecar(LocalTextEmbeddingProvider::new())
        .expect("vector sidecar should register");

    // Run an episode
    let mut episode = EpisodeRecorder::start(
        &mut chain,
        "executor",
        "auth-refactor-2026-06-08",
        "Refactor auth module: extract JWT validation \
         into middleware, add rate limiting, update tests."
    )?;

    let step1 = episode.record_step(
        "Read current auth module (340 lines)",
        "Identified 3 functions to extract: validate_token, \
         extract_claims, check_scope."
    )?;

    let step2 = episode.record_step(
        "Extract validate_token into middleware",
        "Created src/middleware/jwt.rs; updated 4 callers."
    )?;

    let step3 = episode.record_step(
        "Run test suite",
        "3 tests failing in test_auth.py:73, 91, 105. \
         All expect old function signature."
    )?;

    // Record the failure as a lesson. `step3` is an AppendedThought,
    // which carries both the chain index and the UUID: refs use the
    // index, while the LessonLearned's ThoughtRelation targets the
    // UUID of the failing step.
    episode.record_mistake(
        step3,
        "Broke 3 tests by changing function signature \
         without updating test fixtures first.",
        "When refactoring public functions, update test \
         fixtures BEFORE changing the function. Run tests \
         after each fixture update."
    )?;

    let step4 = episode.record_step(
        "Update test fixtures and re-run",
        "All 47 tests pass."
    )?;

    episode.finish(
        "Auth refactored. 47/47 tests pass. \
         Open PR #1234 with summary."
    )?;

    // Use the returned values (silence unused warnings):
    let _ = (step1, step2, step4);

    Ok(())
}

Querying episodes

"What did I do yesterday?"

// The query helpers below take the chain as a parameter, so they
// can be called from the main() in step 2 once the episode has
// been recorded.
fn what_did_i_do_yesterday(chain: &MentisDb) {
    use chrono::{Duration, Utc};
    let yesterday = Utc::now() - Duration::days(1);

    let results = chain.query_ranked(
        &RankedSearchQuery::new()
            .with_text("what did I work on")
            .with_limit(20)
            .with_filter(ThoughtQuery::new()
                .with_since(yesterday)
                .with_types(vec![
                    ThoughtType::Plan,
                    ThoughtType::TaskComplete,
                    ThoughtType::Subgoal,
                ]))
    );

    for hit in results.hits {
        println!("[{:?}] {}",
            hit.thought.thought_type,
            hit.thought.content.lines().next().unwrap_or(""));
    }
}

"What mistakes have I made in auth tasks?"

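// Note: the concept filter only matches lessons recorded with an
// "auth" or "authentication" concept. The EpisodeRecorder in step 1
// attaches no concepts to LessonLearned thoughts, so add them there
// (or drop the filter) before relying on this query.
fn auth_mistakes(chain: &MentisDb) {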
    let results = chain.query_ranked(
        &RankedSearchQuery::new()
            .with_text("auth mistakes")
            .with_limit(10)
            .with_filter(ThoughtQuery::new()
                .with_types(vec![ThoughtType::LessonLearned])
                .with_concepts_any(["auth", "authentication"]))
            .with_reranking(50)
    );
    for hit in results.hits {
        println!("[{:?}] {}", hit.score, hit.thought.content);
    }
}

"Show me the full episode for task X"

fn show_full_episode(chain: &MentisDb) {
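    // 1. Find the plan. Restrict the search to Plan thoughts;
    //    otherwise the top hit may be the TaskComplete summary,
    //    whose text contains the task ID.
    let plan = chain.query_ranked(
        &RankedSearchQuery::new()
            .with_text("auth-refactor-2026-06-08")
            .with_limit(1)
            .with_filter(ThoughtQuery::new()
                .with_types(vec![ThoughtType::Plan]))
    )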
    .hits
    .first()
    .map(|h| h.thought.clone());

    let Some(plan) = plan else {
        println!("Plan not found");
        return;
    };

    // 2. Graph-expand to 2-hop neighborhood
    let related = chain.query_ranked(
        &RankedSearchQuery::new()
            .with_text(&plan.content)
            .with_graph(RankedSearchGraph::new()
                .with_max_depth(2)
                .with_max_visited(100)
            )
    );

    // 3. Sort by append-order index for chronological replay
    let mut by_index: Vec<_> = related.hits.iter().collect();
    by_index.sort_by_key(|h| h.thought.index);

    println!("=== Episode: auth-refactor ===");
    for hit in by_index {
        println!("[{:?}] {}",
            hit.thought.thought_type,
            hit.thought.content);
    }
}

Production notes

When to skip a step

Routine steps that produce no new information should not be recorded. If the agent just ran the test suite and all 47 tests passed (and this is the 30th time this week), skip the Subgoal. Record only:

First-time actions (establishes the pattern)
Failures (the lesson is the point)
Decisions (durable, not routine)
Surprising successes (insight material)
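
One way to encode these rules is a small predicate the agent calls before record_step. This is a sketch under assumptions: StepInfo, RecordFilter, and should_record are hypothetical names, not part of mentisdb, and deciding whether an outcome counts as "surprising" is left to the caller.

```rust
use std::collections::HashSet;

/// Hypothetical description of a finished step; not a mentisdb type.
pub struct StepInfo<'a> {
    /// Normalized action key, e.g. "run-tests".
    pub action_key: &'a str,
    /// The step failed.
    pub failed: bool,
    /// The step made a durable decision rather than a routine move.
    pub is_decision: bool,
    /// The caller judged the outcome unexpected.
    pub surprising: bool,
}

/// Remembers which action keys have already been seen.
#[derive(Default)]
pub struct RecordFilter {
    seen: HashSet<String>,
}

impl RecordFilter {
    /// Returns true when the step is worth a Subgoal thought:
    /// a failure, a decision, a surprise, or a first-time action.
    pub fn should_record(&mut self, step: &StepInfo) -> bool {
        // `insert` returns true only if the key was not present yet.
        // It runs on every call so repeat actions are always tracked.
        let first_time = self.seen.insert(step.action_key.to_string());
        step.failed || step.is_decision || step.surprising || first_time
    }
}
```

With this filter, the first "run-tests" step is recorded, later passing runs are skipped, and a failing run is recorded again. To persist the "seen" set across sessions you would need to store it somewhere; this sketch keeps it in memory only.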

When to use scopes

If multiple users or sessions share a chain, mark routine steps as scope:session and only promote the LessonLearned to scope:user:

ThoughtInput::new(ThoughtType::LessonLearned, lesson)
    .with_tags(["scope:user", "episode", &self.task_id])
    .with_role(ThoughtRole::Retrospective)
    .with_importance(0.8)
    .with_refs(vec![mistake_index, step_index])
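
The promotion rule can be kept in one place so every call site tags thoughts the same way. A minimal sketch, assuming the scope:session and scope:user tag strings from the text. Kind is a hypothetical local enum standing in for ThoughtType, and putting Plan, Mistake, and TaskComplete in the session scope is an assumption, since the text only specifies routine steps and lessons.

```rust
/// Local stand-in for the thought types used in this chapter.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    Plan,
    Subgoal,
    Mistake,
    LessonLearned,
    TaskComplete,
}

/// Returns the scope tag for a thought. Lessons are promoted to the
/// user scope; everything else stays in the session scope (assumed).
pub fn scope_tag(kind: Kind) -> &'static str {
    match kind {
        Kind::LessonLearned => "scope:user",
        _ => "scope:session",
    }
}

/// Builds the full tag list for an episode thought.
pub fn episode_tags(kind: Kind, task_id: &str) -> Vec<String> {
    vec![
        scope_tag(kind).to_string(),
        "episode".to_string(),
        task_id.to_string(),
    ]
}
```

The resulting vector can be passed wherever the recorder currently builds its tag array by hand.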

When to compact

After 50+ thoughts for a single task, the chain gets unwieldy. Write a compacted milestone as a Summary thought with the Checkpoint role:

chain.append_thought(&self.agent_id,
    ThoughtInput::new(ThoughtType::Summary,
        "Mid-task checkpoint: 8 steps done, JWT middleware \
         extracted, rate limiting in progress, 2 tests fail."
    )
    .with_role(ThoughtRole::Checkpoint)
    .with_importance(0.7)
    .with_tags(["episode", &self.task_id, "checkpoint"])
    .with_refs(vec![plan_idx, last_step_idx])
)?;
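
Deciding when to write the checkpoint can be a simple counter. The following is a sketch, not a mentisdb feature: CheckpointPolicy and checkpoint_text are hypothetical helpers, and the interval is whatever threshold you choose (the text suggests roughly 50 thoughts).

```rust
/// Hypothetical helper: signals when an episode is due for a checkpoint.
pub struct CheckpointPolicy {
    every: u32,
    since_last: u32,
}

impl CheckpointPolicy {
    /// `every` is the number of recorded thoughts between checkpoints.
    pub fn new(every: u32) -> Self {
        assert!(every > 0, "checkpoint interval must be positive");
        Self { every, since_last: 0 }
    }

    /// Call once per recorded thought. Returns true when a checkpoint
    /// Summary should be appended, and resets the counter.
    pub fn on_thought(&mut self) -> bool {
        self.since_last += 1;
        if self.since_last >= self.every {
            self.since_last = 0;
            true
        } else {
            false
        }
    }
}

/// Formats the checkpoint text from caller-supplied progress notes.
pub fn checkpoint_text(steps_done: u32, notes: &[&str]) -> String {
    format!(
        "Mid-task checkpoint: {} steps done, {}.",
        steps_done,
        notes.join(", ")
    )
}
```

The recorder would call on_thought after each append and, when it returns true, append the Summary thought with the text from checkpoint_text.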

Cross-episode learning

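If the agent does the same task twice, the second episode can build on the first's LessonLearned. The example below cites the earlier lesson by index through refs; if you also have the lesson's UUID, a DerivedFrom relation makes the link a typed edge.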

// First episode recorded:
// "When refactoring public functions, update test fixtures first."

// Second episode (3 weeks later):
fn cross_episode_learned(
    chain: &mut MentisDb,
    agent_id: &str,
    previous_lesson_index: u64,
) -> io::Result<()> {
    let step = ThoughtInput::new(ThoughtType::Subgoal,
        "Updated test fixtures BEFORE changing function signature. \
         (Following lesson from auth-refactor-2026-06-08.)"
    ).with_refs(vec![previous_lesson_index]);
    chain.append_thought(agent_id, step)?;
    Ok(())
}

The graph now has a backreference: the second episode explicitly cites the first. When the user asks "have we done this before?", graph expansion will surface the pattern.
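
The second episode still needs a way to find the earlier lesson's index. A minimal in-memory sketch, assuming task IDs end in a -YYYY-MM-DD date as in this chapter's example; task_family and LessonIndex are hypothetical helpers, not mentisdb APIs. In a real agent the lookup would more likely be a ranked query filtered to LessonLearned, as in the querying section.

```rust
use std::collections::HashMap;

/// Strips a trailing "-YYYY-MM-DD" from a task ID, if present.
/// IDs without a date suffix are returned unchanged.
pub fn task_family(task_id: &str) -> &str {
    // rsplitn yields the pieces right to left: DD, MM, YYYY, rest.
    let parts: Vec<&str> = task_id.rsplitn(4, '-').collect();
    let is_digits =
        |s: &str, n: usize| s.len() == n && s.bytes().all(|b| b.is_ascii_digit());
    if parts.len() == 4
        && is_digits(parts[0], 2)
        && is_digits(parts[1], 2)
        && is_digits(parts[2], 4)
    {
        parts[3]
    } else {
        task_id
    }
}

/// Maps a task family to the chain indexes of its recorded lessons.
#[derive(Default)]
pub struct LessonIndex {
    by_family: HashMap<String, Vec<u64>>,
}

impl LessonIndex {
    /// Registers a LessonLearned index under the task's family.
    pub fn remember(&mut self, task_id: &str, lesson_index: u64) {
        self.by_family
            .entry(task_family(task_id).to_string())
            .or_default()
            .push(lesson_index);
    }

    /// Returns lesson indexes recorded for earlier runs of this task
    /// family, or an empty slice if there are none.
    pub fn prior_lessons(&self, task_id: &str) -> &[u64] {
        self.by_family
            .get(task_family(task_id))
            .map(Vec::as_slice)
            .unwrap_or(&[])
    }
}
```

The slice returned by prior_lessons can be passed as the refs of the first step in the new episode, which is what cross_episode_learned does with a single index.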

Testing this pattern

A minimal test that verifies the recorder produces a connected graph:

// Illustrative test — depends on EpisodeRecorder + test_chain
// helpers defined earlier in this chapter. See the chapter
// body for the full implementation; the cookbook-as-test
// extractor skips this block because it crosses chapter-block
// boundaries.
#[test]
fn episode_recorder_creates_connected_graph() {
    let mut chain = test_chain();
    let mut ep = EpisodeRecorder::start(
        &mut chain, "executor", "test-task",
        "Test plan"
    ).unwrap();
    let s1 = ep.record_step("do A", "A done").unwrap();
    ep.record_mistake(s1, "broke something", "fix it").unwrap();
    ep.finish("done").unwrap();

    // The plan should have at least one incoming CausedBy (from a
    // step) and one incoming Summarizes (from finish). Relations are
    // stored on the source thought, so scan every thought for
    // relations whose target is the plan's UUID.
    let plan = chain.get_thought_by_index(0).unwrap();
    let all = chain.query(&ThoughtQuery::new());
    let has_caused_by = all.iter().any(|t| {
        t.thought_type == ThoughtType::Subgoal
            && t.relations.iter().any(|r|
                r.kind == ThoughtRelationKind::CausedBy
                    && r.target_id == plan.id)
    });
    let has_summarizes = all.iter().any(|t| {
        t.thought_type == ThoughtType::TaskComplete
            && t.relations.iter().any(|r|
                r.kind == ThoughtRelationKind::Summarizes
                    && r.target_id == plan.id)
    });
    assert!(has_caused_by, "expected a Subgoal with CausedBy pointing at the plan");
    assert!(has_summarizes, "expected a TaskComplete with Summarizes pointing at the plan");
}

What's next

Episodic memory captures what the agent did. The next pattern, Semantic Fact Extraction, captures what the agent learned from raw input using an LLM — but with the same search-first discipline.