0.8.6 adds three significant features to the retrieval engine: Reciprocal Rank Fusion (RRF) reranking, memory chain branching, and irregular verb lemma expansion — plus a critical fix to the BM25 document-frequency cutoff logic.
| Benchmark | Metric | 0.8.5 | 0.8.6 | Notes |
|---|---|---|---|---|
| LoCoMo 10-persona | R@10 | 73.0% | 73.0% | Stable; no regression |
| LongMemEval | R@5 | — | 57.6% | First baseline |
| LoCoMo w/ RRF | R@10 | — | 73.0% | Neutral; multi-type +0.5% |
The headline LoCoMo number is 73.0%, measured on a fresh chain with a rebuilt vector sidecar. The previously reported 74.6% came from a different chain state; the 0.8.5 baseline binary scores 72.9% on the same fresh chain, confirming that the 0.8.6 code changes introduce no regression.
Reciprocal Rank Fusion merges multiple ranked lists by assigning each document a score of 1/(k + rank) from each list and summing the contributions. Because it uses only ranks, never raw scores, it is robust to score-scale differences between signals.
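The fusion step is small enough to sketch directly. The following is an illustrative standalone function, not mentisdb's actual implementation; the name rrf_fuse and the string document ids are assumptions for the example:

```rust
use std::collections::HashMap;

/// Fuse ranked lists with Reciprocal Rank Fusion: each list contributes
/// 1/(k + rank) per document (rank is 1-based), contributions are summed,
/// and documents are sorted by fused score.
fn rrf_fuse(lists: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in lists {
        for (i, doc) in list.iter().enumerate() {
            *scores.entry(doc.to_string()).or_insert(0.0) += 1.0 / (k + (i + 1) as f64);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Sort descending by fused score; tie-break on id for determinism.
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap().then(a.0.cmp(&b.0)));
    fused
}

fn main() {
    let lexical = vec!["a", "b", "c"];
    let vector = vec!["b", "a", "d"];
    let graph = vec!["c", "b", "a"];
    // "b" is near the top of all three lists, so it fuses highest even
    // though it is first in only one of them.
    let fused = rrf_fuse(&[lexical, vector, graph], 60.0);
    println!("{:?}", fused);
}
```

Note that with k=60 the per-rank differences are small (1/61 vs 1/62), which is what makes RRF a gentle consensus signal rather than a hard override.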
When enable_reranking is set on a RankedSearchQuery, the engine produces three independent rankings — lexical-only, vector-only, and graph-only — over the top rerank_k candidates, then merges them via RRF with k=60. Non-rankable signals (importance, confidence, recency, session cohesion) are added back as small additive adjustments.
RRF is opt-in (default off). On LoCoMo it's neutral overall but improves multi-type queries by +0.5%. It may help more on datasets where lexical and vector signals disagree on top candidates.
The new ThoughtRelationKind::BranchesFrom relation enables cross-chain divergence. MentisDb::branch_from() creates a new chain with a genesis thought pointing back to the branch-point on the source chain.

When searching a branch chain, the server transparently searches ancestor chains (walking BranchesFrom relations) and merges the results, annotating each hit with its chain_key so callers know where it came from.
```rust
// Create a branch from an existing thought
let branch = MentisDb::branch_from(
    &chain_dir,
    "main-chain",
    thought_id,
    "experiment-1",
)?;

// Searches on experiment-1 automatically include main-chain results
let results = branch.query_ranked(&query);
```
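The ancestor-merge behavior can be sketched as follows. Hit, search_with_ancestors, and the child-to-parent map are illustrative stand-ins, not mentisdb's actual types:

```rust
use std::collections::HashMap;

/// A search hit annotated with the chain it came from.
#[derive(Debug)]
struct Hit {
    chain_key: String,
    thought: String,
    score: f64,
}

/// Walk BranchesFrom links from the queried chain up through its ancestors,
/// search each chain, and tag every hit with its origin chain_key.
fn search_with_ancestors(
    chain: &str,
    parents: &HashMap<String, String>, // child -> parent (BranchesFrom)
    search_one: impl Fn(&str) -> Vec<(String, f64)>,
) -> Vec<Hit> {
    let mut hits = Vec::new();
    let mut current = Some(chain.to_string());
    while let Some(key) = current {
        for (thought, score) in search_one(&key) {
            hits.push(Hit { chain_key: key.clone(), thought, score });
        }
        current = parents.get(&key).cloned();
    }
    // Merge by score across every chain in the ancestry.
    hits.sort_by(|a, b| b.score.partial_cmp(&a.score).unwrap());
    hits
}

fn main() {
    let parents: HashMap<String, String> =
        [("experiment-1".to_string(), "main-chain".to_string())].into_iter().collect();
    let search_one = |key: &str| match key {
        "experiment-1" => vec![("branch thought".to_string(), 0.8)],
        "main-chain" => vec![("ancestor thought".to_string(), 0.9)],
        _ => vec![],
    };
    let hits = search_with_ancestors("experiment-1", &parents, search_one);
    println!("{:?}", hits); // the main-chain hit ranks first on score
}
```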
Query tokenization now expands irregular English verbs to their base form: "went" → "go", "gave" → "give", "ran" → "run". About 170 mappings are included. The expansion is query-time only — indexed content is not modified.
This helps when a user asks "when did Caroline go to the museum?" and the evidence contains "went to the museum yesterday."
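For that example to match, the lookup plausibly works in both directions at query time ("go" also matching indexed "went"). This sketch expands both ways under that assumption; the function name and the three mappings shown are illustrative, a tiny subset of the ~170 described above:

```rust
use std::collections::HashMap;

/// Query-time expansion: keep every token, and for tokens with an
/// irregular-verb mapping, append the paired form as an extra search term.
/// Indexed content is never touched.
fn expand_query_tokens(tokens: &[&str], pairs: &[(&str, &str)]) -> Vec<String> {
    // Build the lookup in both directions: past -> base and base -> past.
    let mut map: HashMap<&str, &str> = HashMap::new();
    for &(past, base) in pairs {
        map.insert(past, base);
        map.insert(base, past);
    }
    let mut out = Vec::new();
    for t in tokens {
        out.push(t.to_string());
        if let Some(alt) = map.get(t) {
            out.push(alt.to_string());
        }
    }
    out
}

fn main() {
    let pairs = [("went", "go"), ("gave", "give"), ("ran", "run")];
    let expanded = expand_query_tokens(&["did", "caroline", "go"], &pairs);
    println!("{:?}", expanded); // ["did", "caroline", "go", "went"]
}
```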
The per-field BM25 document-frequency cutoffs introduced in an earlier commit compared against per-field DF. Since a term's per-field DF is always ≤ its global DF, too many high-frequency terms passed through the filter, adding noise to search results.
The fix: Bm25DfCutoffs now uses global DF (postings list length) for the cutoff comparison. A term is filtered from a specific field if its global document frequency exceeds that field's cutoff ratio. This preserves the original stop-word suppression behavior while still allowing per-field tuning via the cutoff configuration.
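The corrected check reduces to a single comparison. This is a minimal sketch of the logic described above, with an assumed function name and a ratio-of-corpus interpretation of the cutoff, not the actual Bm25DfCutoffs code:

```rust
/// A term is kept for a field's scoring only if its *global* document
/// frequency (postings list length) stays within that field's cutoff
/// ratio of the corpus size. The old bug compared per-field DF here,
/// which is always <= global DF, so too many terms survived.
fn term_passes_cutoff(global_df: usize, total_docs: usize, field_cutoff_ratio: f64) -> bool {
    (global_df as f64) <= field_cutoff_ratio * (total_docs as f64)
}

fn main() {
    let total_docs = 10_000;
    // A stop-word-like term appearing in 60% of documents is filtered
    // from a field with a 0.5 cutoff ratio...
    assert!(!term_passes_cutoff(6_000, total_docs, 0.5));
    // ...while a rarer term passes and contributes to BM25 scoring.
    assert!(term_passes_cutoff(300, total_docs, 0.5));
    println!("cutoff checks passed");
}
```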
| Config | R@10 | Single | Multi |
|---|---|---|---|
| 0.8.6 (default) | 73.0% | 77.1% | 57.7% |
| 0.8.6 + RRF (k=50) | 73.0% | 76.3% | 58.2% |
| Question Type | R@5 | R@10 | R@20 |
|---|---|---|---|
| Overall | 57.6% | 62.6% | 69.2% |
| single-session-assistant | 83.9% | 85.7% | 89.3% |
| knowledge-update | 82.1% | 88.5% | 89.7% |
| single-session-user | 68.6% | 77.1% | 84.3% |
| temporal-reasoning | 67.7% | 71.4% | 76.7% |
| multi-session | 26.3% | 27.1% | 33.8% |
| single-session-preference | 13.3% | 20.0% | 40.0% |
LongMemEval is significantly harder for mentisdb because multi-session and preference questions require cross-session reasoning that retrieval alone cannot solve. Future work on LLM-based reranking (as shown by MemPalace's 100% R@5 with hybrid retrieval plus Haiku reranking) should narrow this gap.
```shell
cargo install mentisdb --force
```