MentisDB 0.8.0 is the biggest search quality release we've ever shipped. We took our LongMemEval score from 57.2% to 65.0% recall, hardened the security model, made writes 13.8% faster, and cleaned up the skill file that agents actually read at startup. Here's what changed and why.
The headline number: 65.0% R@5 on LongMemEval — the standard benchmark for long-term memory retrieval. That means two out of three times, the correct memory surfaces in the first five results your agent sees. We started this release at 57.2%.
Three changes got us there. None of them required changing the storage format, re-indexing manually, or breaking the API.
The single biggest improvement was stemming. Our lexical tokenizer now runs every token through the Porter stemming algorithm before both indexing and querying. "prefers", "preferred", and "preferences" all map to "prefer" — so a query for "food preferences" now matches a memory that says "I prefer Thai cuisine."
This one change took overall R@5 from 57.2% to 61.6%. Temporal-reasoning queries improved by 9.0 points and user-fact queries by 8.5 points — categories where queries often use different word forms than the original evidence.
Stemming is automatic. Your existing chains rebuild their lexical index on first access after upgrading. No configuration needed.
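To see why word-form normalization closes the gap, here is a deliberately simplified suffix-stripping sketch — not the real Porter algorithm, which has five rule phases with measure-based conditions — that maps all three forms above to the same stem:

```rust
// Simplified sketch of suffix stripping; the real Porter stemmer is far
// more careful. Shown only to illustrate why "food preferences" can now
// match "I prefer Thai cuisine".
fn stem(token: &str) -> String {
    let mut t = token.to_lowercase();
    // Try longer suffixes first so "-ences" wins over "-s".
    for suffix in ["ences", "ence", "ed", "ing", "s"] {
        if let Some(stripped) = t.strip_suffix(suffix) {
            // Crude guard against over-stripping short stems.
            if stripped.len() >= 4 {
                t = stripped.to_string();
                break;
            }
        }
    }
    // Collapse a doubled final consonant left behind by "-ed"/"-ing"
    // removal ("preferred" -> "preferr" -> "prefer").
    let collapse = {
        let b = t.as_bytes();
        b.len() >= 2
            && b[b.len() - 1] == b[b.len() - 2]
            && !b"aeiou".contains(&b[b.len() - 1])
    };
    if collapse {
        t.pop();
    }
    t
}

fn main() {
    for word in ["prefers", "preferred", "preferences"] {
        println!("{} -> {}", word, stem(word));
    }
}
```

Because the same function runs at index time and query time, both sides of the match land on the identical stem.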
The old scoring just added BM25 and vector scores together. The problem: vector scores range from 0–0.35 while BM25 goes from 0–30+. Semantic-only matches — thoughts with zero word overlap but strong meaning similarity — never surfaced because their vector signal was drowned out.
We replaced flat addition with a tiered boost:
| Lexical Match | Vector Treatment |
|---|---|
| No match at all | 60× boost — semantic hits compete with BM25 |
| Weak match (< 1.0) | 20× ramp — partial credit for both signals |
| Strong match | Additive — vector nudges, doesn't disrupt |
We also tried Reciprocal Rank Fusion (RRF), the standard hybrid search approach. It hurt. RRF demoted strong BM25 hits by giving vector matches equal weight. For memory retrieval where keyword search is already strong, re-ranking by vector similarity makes things worse, not better. Tiered boost preserves what works and only uses vectors to promote what BM25 can't find.
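The tiered boost can be sketched as a small scoring function. The multipliers (60×, 20×, additive) come from the table above; the exact thresholds and ramp shape are assumptions for illustration:

```rust
// Sketch of tiered hybrid scoring. Multipliers are from the release
// notes; the precise threshold/ramp details are assumed.
fn hybrid_score(bm25: f64, vector: f64) -> f64 {
    if bm25 == 0.0 {
        // No lexical match: boost the vector signal (0-0.35 range) so
        // semantic-only hits can compete with BM25 scores (0-30+).
        vector * 60.0
    } else if bm25 < 1.0 {
        // Weak lexical match: partial credit for both signals.
        bm25 + vector * 20.0
    } else {
        // Strong lexical match: the vector score only nudges the ranking.
        bm25 + vector
    }
}

fn main() {
    println!("semantic-only : {}", hybrid_score(0.0, 0.3)); // boosted to 18.0
    println!("weak lexical  : {}", hybrid_score(0.5, 0.3)); // 0.5 + 6.0
    println!("strong lexical: {}", hybrid_score(5.0, 0.3)); // 5.0 + 0.3
}
```

Unlike RRF, a strong BM25 score is never demoted here: the vector term can only add to it.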
User-originated thoughts carry importance ≈ 0.8; verbose assistant responses carry importance ≈ 0.2. Previously, the importance weight in scoring was 0.2× — essentially noise. We raised it to 3.0×.
Now user thoughts get a +2.4 boost versus +0.6 for assistant thoughts. When BM25 scores are close (as they often are for preference and factual queries), the importance signal tips the race toward the memory the user actually said, not the assistant's paraphrase.
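Concretely: the 3.0× multiplier and the 0.8 / 0.2 importance values are from this release; the additive score shape around them is an assumption for illustration:

```rust
// Importance term sketch. The 3.0x multiplier is from the release notes;
// the surrounding additive score shape is an assumption.
fn final_score(hybrid: f64, importance: f64) -> f64 {
    hybrid + importance * 3.0
}

fn main() {
    // Two memories with near-identical base scores: the user's own
    // statement (importance 0.8) vs the assistant's paraphrase (0.2).
    let user = final_score(10.0, 0.8);      // 10.0 + 2.4
    let assistant = final_score(10.2, 0.2); // 10.2 + 0.6
    assert!(user > assistant); // the user's own memory wins the tie
    println!("user = {}, assistant = {}", user, assistant);
}
```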
Append latency dropped 13.8% (statistically significant, p=0.01) from three hot-path improvements:
A new `MENTISDB_GROUP_COMMIT_MS` environment variable (default: 2 ms) lets you tune the write-batching window. Set it to 0 for the lowest latency, or higher for bulk-load throughput.
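For example (the variable name is from this release; the bare `mentisdb` invocation below is a placeholder for however you normally launch the daemon):

```shell
# MENTISDB_GROUP_COMMIT_MS is from the release notes; the launch command
# is a placeholder for your usual daemon invocation.
MENTISDB_GROUP_COMMIT_MS=0 mentisdb    # lowest per-append latency
MENTISDB_GROUP_COMMIT_MS=10 mentisdb   # wider batching window for bulk loads
```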
The background writer also got double the queue capacity (128 slots) and pre-allocated buffers, reducing backpressure stalls under concurrent multi-agent writes.
The `local-embeddings` feature flag now uses FastEmbed `all-MiniLM-L6-v2` (384 dimensions, ONNX runtime) instead of the previous `local-text-v1` provider. It runs entirely on your machine, with no cloud dependencies and no GPU required.
The daemon auto-detects and registers FastEmbed for search when the feature is compiled in. Vector sidecars rebuild automatically on first access after upgrading.
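If you install from source, the flag can presumably be enabled at build time through cargo's standard feature mechanism (the exact invocation is an assumption; the feature name comes from this release):

```shell
# Feature name from the release notes; --features is standard cargo usage.
cargo install mentisdb --features local-embeddings
```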
Several changes close attack surfaces without affecting normal usage:
`read_skill()` now returns `SkillReadOutput { content, warnings, status }` instead of a plain string. Callers can no longer silently ignore skill revocation status or safety warnings.
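A hypothetical caller-side sketch of what the structured return forces you to handle (the field names match the release notes; the `SkillStatus` variants and helper are assumptions):

```rust
// Hypothetical mirror of the new return type; everything beyond the
// field names shown in the release notes is an assumption.
struct SkillReadOutput {
    content: String,
    warnings: Vec<String>,
    status: SkillStatus,
}

#[derive(PartialEq)]
enum SkillStatus {
    Active,
    Revoked,
}

// The caller must now decide explicitly what to do with status and
// warnings instead of receiving a bare string.
fn usable_content(out: &SkillReadOutput) -> Option<&str> {
    for w in &out.warnings {
        eprintln!("skill warning: {}", w);
    }
    if out.status == SkillStatus::Revoked {
        return None; // revoked skills must never reach the agent
    }
    Some(&out.content)
}

fn main() {
    let revoked = SkillReadOutput {
        content: "...".into(),
        warnings: vec!["deprecated".into()],
        status: SkillStatus::Revoked,
    };
    assert!(usable_content(&revoked).is_none());
    println!("revoked skill correctly rejected");
}
```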
The dashboard login now compares secrets with `subtle::ConstantTimeEq` to prevent timing attacks.
The `MENTISDB_SKILL.md` — the operating instructions agents read at startup — went from 68,270 characters across 86 sections down to ~8,000 characters.
Why? Because agents were skimming it. When you hand an LLM a 68K document and tell it to follow the rules inside, it reads the first few sections and misses the operational constraints buried at line 280.
The rewrite:
The new `Goal` thought type captures high-level objectives — broader than `Plan` (which describes how) and `Subgoal` (which is a component of a goal). Use it to record what the agent is trying to achieve, so future sessions can orient quickly even if the plan details have changed.
That makes 30 semantic thought types total. The new variant is appended at the end of the enum (bincode-safe — no reindexing needed).
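Why appending is bincode-safe can be sketched in miniature: bincode's default encoding identifies a fieldless enum variant by its index, so adding a variant at the end leaves every existing index intact (variant names other than those mentioned above are illustrative, and the real enum has 30 variants):

```rust
// Miniature sketch: appending a variant preserves all existing encoded
// indices, so previously serialized data still decodes. Only Goal, Plan,
// and Subgoal are real names from the release notes.
enum ThoughtType {
    Observation = 0, // existing variants keep their indices
    Plan = 1,
    Subgoal = 2,
    // ... the other existing variants sit here unchanged ...
    Goal = 29, // appended last: no reindexing needed
}

fn main() {
    println!("Goal encodes as index {}", ThoughtType::Goal as u32);
}
```

Inserting the variant anywhere earlier would shift every later index and corrupt decoding of existing chains, which is exactly what appending avoids.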
A new REST endpoint `POST /v1/chains/merge` and MCP tool `mentisdb_merge_chains` let you merge all thoughts from a source chain into a target chain, then permanently delete the source. Agent identities are remapped automatically by similarity matching.
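A hypothetical call shape for the REST side (the endpoint path is from this release; the base URL and JSON field names are assumptions):

```shell
# Endpoint path is from the release notes; base URL and field names are
# illustrative assumptions.
curl -X POST "$MENTISDB_URL/v1/chains/merge" \
  -H 'Content-Type: application/json' \
  -d '{"source_chain": "chain-a", "target_chain": "chain-b"}'
```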
The `mcp-remote` bridge now uses the explicit `node` path as the command, bypassing shebang resolution issues on systems with multiple Node versions. Node ≥ 20 is required and validated at setup time.
Internal cleanup consolidated shared code into `PlatformPaths`, a `HasOptionalQueryFields` trait, and a `read_length_prefixed_thoughts` helper, eliminating ~260 lines of duplication.
```shell
cargo install mentisdb
```

Or from source:

```shell
git pull
cargo install --path . --locked
```
Existing chains, vector sidecars, and skill registries are migrated automatically on first startup. No manual steps required.
Single-session-preference queries (13.3% R@5) are the clearest target. These require bridging the semantic gap between implicit evidence ("I've been really into Thai cuisine lately") and preference queries ("What kind of food do I like?"). Better embeddings or a reranking step are the most likely path forward.
Multi-session recall (59.4%) also has room to grow. Graph expansion currently contributes 0.0 on misses — the traversal finds related thoughts but not the specific evidence buried in long conversations.
MentisDB is an open-source durable memory layer for AI agents. It stores memories in an append-only hash-chained log, retrieves them with hybrid lexical+semantic+graph search, and runs entirely locally with no cloud dependencies. GitHub · Docs · Website