April 14, 2026

MentisDB 0.9.1 — The 0.9.x Journey: From Competitive Analysis to 74% R@10

Four days ago we published a competitive analysis and discovered MentisDB was missing temporal facts, memory dedup, a CLI, webhooks, federated search, and an official Python client. Today, after 11 releases (0.8.2 → 0.9.1), all of those gaps are closed and we've run our first full 10-persona LoCoMo benchmark: 74.0% R@10 on 1,977 queries.

This post is the full story: what we found, what we shipped, how we compare to the competitive field today, and what the benchmark tells us about where to improve next.


The Starting Point: April 10 Competitive Analysis

On April 10, we published a detailed competitive analysis of six agentic memory systems. The conclusion was honest: MentisDB had unique strengths (Rust, embedded storage, cryptographic hash chain, no-LLM-required core) but was missing most of the features users expected — temporal facts, memory dedup, multi-level scopes, custom ontology, episode provenance, and an MCP server.

The competitive landscape we surveyed:

| System | Language | Storage | LLM Required | Local-First | Crypto Integrity | Hybrid Retrieval |
|---|---|---|---|---|---|---|
| MentisDB | Rust | Embedded (sled) | No (opt-in) | Yes | Hash chain | BM25+vec+graph |
| Mem0 | Python | External DB | Yes | Self-host option | No | vec+keyword |
| Graphiti/Zep | Python | External DB | Yes | Self-host only | No | semantic+kw+graph |
| Letta/MemGPT | Python/TS | External DB | Yes | Self-host option | No | No |
| Neo4j LLM KB | Python | Neo4j | Yes | No | No | Multi-mode |
| Cognee | Python | External DB | Yes | Partial | No | vec+graph |

The analysis identified six major gaps we needed to close before 1.0. We set a roadmap targeting 0.8.2 for temporal facts, dedup, and CLI; 0.9.0 for ecosystem features. We shipped all of it — and more.


What We Shipped: 0.8.2 → 0.9.1

In 11 releases, we closed every feature gap identified in the April 10 analysis:

| Feature | Version | Description |
|---|---|---|
| Temporal Facts | 0.8.2 | valid_at / invalid_at on thoughts; as_of query parameter for point-in-time retrieval |
| Memory Dedup | 0.8.2 | Jaccard similarity threshold on append; auto-Supersedes relation for near-duplicate thoughts |
| Multi-Level Scopes | 0.8.2 | MemoryScope enum (User, Session, Agent) on thoughts; scoped search filters |
| CLI Tool | 0.8.2 | mentisdb CLI: add, search, list, agents, chain subcommands |
| Reciprocal Rank Fusion | 0.8.6 | RRF reranking merges BM25, vector, and graph signals; --reranking flag on benchmark |
| Memory Branching | 0.8.6 | BranchesFrom relation; POST /v1/chains/branch creates divergent chains from any checkpoint |
| Per-Field BM25 DF Cutoffs | 0.8.6 | Document-frequency-based field weighting improves precision on high-signal fields |
| Custom Ontology | 0.8.7 | entity_type field on thoughts; per-chain entity type registry; schema validation at API layer |
| Episode Provenance | 0.8.8 | source_episode field; DerivedFrom relation kind; full lineage from derived fact to source |
| LLM Reranking | 0.8.8 | Optional cross-encoder reranking on candidate lists; pluggable reranker interface |
| Federated Cross-Chain Search | 0.9.1 | ancestor_chain_keys() walks BranchesFrom; ranked search transparently queries ancestors |
| Webhooks | 0.9.1 | HTTP POST callbacks on thought append; mentisdb_register_webhook MCP tool + REST endpoint |
| Opt-in LLM Extraction | 0.9.1 | GPT-4o (or any OpenAI-compatible endpoint) extracts structured ThoughtInput records from raw text; review-before-append workflow |
| Python Client (pymentisdb) | 0.9.1 | Full MentisDbClient on PyPI; LangChain MentisDbMemory; typed enums and relations |
| Wizard Brew-First Setup | 0.9.1 | Interactive setup wizard detects Homebrew mcp-remote and writes correct Claude Desktop config automatically |
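The RRF reranking shipped in 0.8.6 merges ranked lists from the BM25, vector, and graph signals by summed reciprocal rank. A minimal sketch of the idea — the function name, toy data, and the k=60 smoothing constant are illustrative, not MentisDB internals:

```python
def rrf_merge(ranked_lists, k=60):
    """Reciprocal Rank Fusion: score each id by the sum of
    1 / (k + rank) over every ranked list it appears in."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

bm25   = ["t3", "t1", "t7"]   # top hits from the BM25 signal
vector = ["t1", "t3", "t9"]   # top hits from the vector signal
graph  = ["t1", "t4"]         # top hits from graph traversal

fused = rrf_merge([bm25, vector, graph])
# "t1" fuses to the top: it appears near the head of all three lists
```

The appeal of RRF is that it needs no score normalization across heterogeneous signals — only ranks — which is why it works well for merging BM25, cosine similarity, and graph-distance orderings.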

LoCoMo Benchmark Results: 74.0% R@10

We ran the full LoCoMo 10-persona benchmark against MentisDB 0.9.1: 1,977 queries across 10 personas (~197 queries each), over conversations of up to 300 turns, ingested into a single chain with ContinuesFrom relations.

LoCoMo 10-Persona Results (1,977 queries)

R@10: 74.0%  |  R@20: 80.8%  |  R@50: 88.5%

Single-hop: 78.0%  |  Multi-hop: 59.1%

Evaluation: 94 seconds  |  20.9 queries/second

By Question Type

| Type | R@10 | R@20 | R@50 | Correct / Total |
|---|---|---|---|---|
| Single-hop | 78.0% | 84.0% | 90.2% | 1,212 / 1,554 |
| Multi-hop | 59.1% | 69.0% | 82.0% | 250 / 423 |
| Overall | 74.0% | 80.8% | 88.5% | 1,462 / 1,977 |
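For reference, R@k here means the fraction of queries whose gold evidence appears in the top k retrieved results. A minimal scorer under that definition (toy data; not the benchmark harness itself):

```python
def recall_at_k(results, gold, k):
    """Fraction of queries whose gold evidence id appears
    among the first k retrieved ids."""
    hits = sum(1 for q in gold if gold[q] in results[q][:k])
    return hits / len(gold)

results = {"q1": ["t4", "t9", "t2"], "q2": ["t7", "t1"]}  # ranked ids per query
gold    = {"q1": "t9", "q2": "t5"}                        # gold evidence per query

print(recall_at_k(results, gold, 2))  # q1 hits at rank 2, q2 misses -> 0.5
```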

Near-Miss Analysis

Of the 515 missed queries, 44.3% do not appear in the top-50 results at all. That is a coverage gap (the evidence never enters the candidate set), not a ranking problem:

Sample Misses (Single-Hop)

Missed queries consistently show high lexical scores but near-zero vector scores. The lexical matcher finds related content using surface-term overlap, but vector similarity fails to connect semantically related content with different wording:

Q: what did caroline research?
Evidence: "researching adoption agencies — it's been a dream to have a family..."
Retrieved: "that's great news. what did you do?"  (lexical=9.9, vector=0.0)

Q: when did caroline have a picnic?
Evidence: "...picnic last week...talked about my transition journey..."
Retrieved: "sounds good, when did you have in mind?"  (lexical=9.7, vector=0.0)

This is the core improvement opportunity: closing the multi-hop gap and improving semantic matching when vocabulary diverges between query and memory.
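One low-cost way to attack this vocabulary divergence, referenced later under "add query expansion," is to expand query terms before lexical scoring. The synonym table and the substring-overlap scorer below are illustrative, not MentisDB internals:

```python
# Hypothetical synonym table; in practice this could come from a
# lexical resource or an LLM suggestion pass.
SYNONYMS = {
    "research": {"researching", "studying"},
    "picnic": {"outing"},
}

def expand(query_terms):
    """Union the query terms with their known synonyms."""
    expanded = set(query_terms)
    for term in query_terms:
        expanded |= SYNONYMS.get(term, set())
    return expanded

def overlap_score(query_terms, memory_text):
    """Count expanded terms that occur in the memory text."""
    text = memory_text.lower()
    return sum(1 for t in expand(query_terms) if t in text)

memory = "researching adoption agencies, a dream to have a family"
print(overlap_score({"caroline", "research"}, memory))
```

Expansion lets the lexical signal reach evidence like "researching" from the query term "research" even when the vector signal contributes nothing.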


The Competitive Field Today

Since April 10, three significant new entrants have emerged that change the landscape:

Hindsight (vectorize-io/hindsight)

9.2k GitHub stars. Server mode with external PostgreSQL, or embedded Python mode. LLM required for core operations (retain/recall/reflect). Claims SOTA on LongMemEval with independently verified scores from Virginia Tech. Four-signal retrieval (semantic, keyword, graph, temporal) merged via RRF + cross-encoder reranking. Unique: Mental Models — reflected higher-order understanding generated by the LLM.

Cognee (topoteretes/cognee)

Crossed 15k GitHub stars and shipped v1.0.0 on April 11. Native Hermes Agent integration as memory provider, MCP client package, and Cognee Cloud managed service. Still requires external databases and an LLM for the cognify pipeline.

LangMem (langchain-ai/langmem)

1.4k stars. Memory primitives library integrated natively into LangGraph Platform. Pluggable storage backend (in-memory → Postgres). Being the default memory layer in LangGraph Platform deployments gives it a massive distribution advantage. LLM required; no graph traversal or temporal facts.

Updated Feature Comparison (April 14)

| Feature | MentisDB | Hindsight | Cognee | LangMem | Mem0 | Graphiti |
|---|---|---|---|---|---|---|
| Language | Rust | Python | Python | Python | Python | Python |
| Storage | Embedded (sled) | External (PG) | External | External | External DB | External DB |
| LLM Required | No (opt-in) | Yes | Yes | Yes | Yes | Yes |
| Local-First | Yes | No | No | No | Partial | No |
| Crypto Integrity | Hash chain | No | No | No | No | No |
| Hybrid Retrieval | BM25+vec+graph | 4-signal RRF | vec+graph | vec only | vec+keyword | sem+kw+graph |
| MCP Server | Built-in | No | MCP client | No | No | Yes |
| Agent Registry | Yes | No | No | No | No | No |
| Federated Search | Cross-chain | No | No | No | No | No |
| Skills/Extensions | Skill registry | No | No | No | No | No |
| Webhooks | Yes | No | No | No | No | No |
| Temporal Facts | 0.8.2 | Via metadata | No | No | Updates | valid_at |
| Memory Dedup | 0.8.2 | No | Merge | No | Yes | Merge |
| Benchmark R@10 | 74.0% | SOTA (indep.) | N/A | N/A | N/A | N/A |

What Makes MentisDB Different Today

The combination of properties that no competitor shares has grown since April 10:

  1. Rust + embedded storage + no-LLM-required + cryptographic hash chain. Still unique. Every competitor requires external databases and/or an LLM for core operations.
  2. Federated cross-chain search (0.9.1). Walk BranchesFrom relations to query parent chains from a branch. No competitor has this.
  3. Skill registry with versioning, deprecation, and revocation (0.9.x). Signed thought support. No competitor has anything comparable.
  4. Webhooks (0.9.1). HTTP POST callbacks on thought append. Fire-and-forget delivery via tokio spawn. No competitor has this.
  5. Opt-in LLM extraction (0.9.1). Keep the no-LLM core for pure structural retrieval; add LLM-powered extraction only when needed. All competitors require LLM for core.
  6. pymentisdb Python client (0.9.1). Full MentisDbClient with LangChain integration, complete enum coverage, typed relations. Enables Python ecosystem adoption.
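The federated search in point 2 is easiest to picture as a parent-pointer walk: each branched chain records the chain it branched from, and ranked search unions results over the resulting ancestor set. A toy sketch of that walk (data shapes are illustrative, not MentisDB's internal representation):

```python
def ancestor_chain_keys(chain_key, branches_from):
    """Walk BranchesFrom links from a chain up to its root,
    returning the chain itself plus every ancestor in order."""
    keys = [chain_key]
    while chain_key in branches_from:
        chain_key = branches_from[chain_key]
        keys.append(chain_key)
    return keys

# child -> parent map derived from BranchesFrom relations (toy data)
branches_from = {"experiment-b": "experiment-a", "experiment-a": "main"}

print(ancestor_chain_keys("experiment-b", branches_from))
# ['experiment-b', 'experiment-a', 'main']
```

Searching from "experiment-b" then transparently covers "experiment-a" and "main" as well, which is what makes divergent chains usable without duplicating their shared history.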

What We're Still Missing

Honest gaps — what we'd need to win in specific competitive situations:

| Gap | Impact | Path to Close |
|---|---|---|
| Academic benchmark verification | Hindsight's scores are independently verified by Virginia Tech; ours are self-reported | Partner with an academic group (Sanghani Center at VT or similar) |
| Native LangChain store | LangMem is the default in LangGraph Platform; massive distribution advantage | Build langchain-mentisdb pip package with BaseStore implementation |
| Multi-hop recall (59.1% R@10) | Multi-hop is 19pp behind single-hop; biggest single improvement opportunity | Improve graph traversal depth, add entity coreference, expand ContinuesFrom chains |
| Vector sidecar contribution | Near-zero vector scores on misses; semantic layer not firing in many cases | Debug FastEmbed loading; improve embedding coverage; add query expansion |
| Managed cloud service | Mem0, Cognee, Hindsight, Fast.io all offer hosted versions | MentisDB Cloud: managed service with zero infrastructure setup |
| Memory consolidation tiers | agentmemory uses Ebbinghaus decay; Hindsight uses Mental Models reflection | Implement automatic memory consolidation (working → episodic → semantic) |
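On the consolidation gap: the Ebbinghaus model referenced above scores retention as exponential decay, R = exp(-t/S), where stability S grows with reinforcement. A sketch of how tiered consolidation could use it — the thresholds and tier names are hypothetical, not a shipped MentisDB feature:

```python
import math

def retention(hours_since_access, stability):
    """Ebbinghaus forgetting curve: R = exp(-t / S)."""
    return math.exp(-hours_since_access / stability)

def tier(hours_since_access, stability):
    """Hypothetical promotion rule: recently reinforced memories
    (high retention) stay in working memory; stale ones consolidate
    down through episodic into semantic storage."""
    r = retention(hours_since_access, stability)
    if r > 0.5:
        return "working"
    if r > 0.1:
        return "episodic"
    return "semantic"

print(tier(1, 24))    # accessed an hour ago -> still working memory
print(tier(240, 24))  # ten days stale, low stability -> consolidated
```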

What's Next

The 0.9.x release establishes the feature foundation. The next push is on three fronts:

  1. Benchmark quality: Multi-hop is the primary gap (59.1% vs 78.0%). Improving ContinuesFrom chain density, adding entity coreference resolution, and fixing the vector sidecar contribution should close most of the 19pp gap.
  2. Ecosystem distribution: Native LangChain store, LlamaIndex connector, and explicit Claude Code / Cursor plugins. The skill registry gives us a hook for agent-specific memory capture.
  3. Academic credibility: Independent benchmark verification would put us on equal footing with Hindsight's independently verified claims.

See the ROADMAP.md for the full 1.0 pipeline.


Upgrade Instructions

Server and CLI:

cargo install mentisdb --force

Or download the binary from GitHub Releases.

Python client:

pip install pymentisdb --upgrade