0.8.6 adds three significant features to the retrieval engine: Reciprocal Rank Fusion (RRF) reranking, memory chain branching, and irregular verb lemma expansion — plus a critical fix to the BM25 document-frequency cutoff logic.
| Benchmark | Metric | 0.8.5 | 0.8.6 | Notes |
|---|---|---|---|---|
| LoCoMo 10-persona | R@10 | 73.0% | 73.0% | Stable; no regression |
| LongMemEval | R@5 | — | 57.6% | First baseline |
| LoCoMo w/ RRF | R@10 | — | 73.0% | Neutral; multi-type +0.5% |
The headline LoCoMo number is 73.0%, measured on a fresh chain with a rebuilt vector sidecar. The previously reported 74.6% came from a different chain state; the 0.8.5 baseline binary scores 72.9% on the same fresh chain, confirming that the 0.8.6 code changes introduce no regression.
Reciprocal Rank Fusion merges multiple ranked lists by assigning each document a score of 1/(k + rank) from each list and summing the contributions. Because it uses only ranks, never raw scores, it is robust to score-scale differences between signals.
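The fusion step is small enough to sketch directly. The following is an illustrative standalone function, not mentisdb's actual implementation; the name rrf_fuse and the string document ids are assumptions for the example:

```rust
use std::collections::HashMap;

/// Fuse ranked lists with Reciprocal Rank Fusion: each list contributes
/// 1/(k + rank) per document (rank is 1-based), contributions are summed,
/// and documents are sorted by fused score.
fn rrf_fuse(lists: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in lists {
        for (i, doc) in list.iter().enumerate() {
            *scores.entry(doc.to_string()).or_insert(0.0) += 1.0 / (k + (i + 1) as f64);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Sort descending by fused score; tie-break on id for determinism.
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap().then(a.0.cmp(&b.0)));
    fused
}

fn main() {
    let lexical = vec!["a", "b", "c"];
    let vector = vec!["b", "a", "d"];
    let graph = vec!["c", "b", "a"];
    // "b" is near the top of all three lists, so it fuses highest even
    // though it is first in only one of them.
    let fused = rrf_fuse(&[lexical, vector, graph], 60.0);
    println!("{:?}", fused);
}
```

Note that with k=60 the per-rank differences are small (1/61 vs 1/62), which is what makes RRF a gentle consensus signal rather than a hard override.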
When enable_reranking is set on a RankedSearchQuery, the engine produces three independent rankings — lexical-only, vector-only, and graph-only — over the top rerank_k candidates, then merges them via RRF with k=60. Non-rankable signals (importance, confidence, recency, session cohesion) are added back as small additive adjustments.
RRF is opt-in (default off). On LoCoMo it's neutral overall but improves multi-type queries by +0.5%. It may help more on datasets where lexical and vector signals disagree on top candidates.
The new ThoughtRelationKind::BranchesFrom relation enables cross-chain divergence. MentisDb::branch_from() creates a new chain with a genesis thought pointing back to the branch-point on the source chain.

When searching a branch chain, the server transparently searches ancestor chains (walking BranchesFrom relations) and merges the results, annotating each hit with its chain_key so callers know where it came from.
```rust
// Create a branch from an existing thought
let branch = MentisDb::branch_from(
    &chain_dir,
    "main-chain",
    thought_id,
    "experiment-1",
)?;

// Searches on experiment-1 automatically include main-chain results
let results = branch.query_ranked(&query);
```
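The ancestor-merge behavior can be sketched as follows. Hit, search_with_ancestors, and the child-to-parent map are illustrative stand-ins, not mentisdb's actual types:

```rust
use std::collections::HashMap;

/// A search hit annotated with the chain it came from.
#[derive(Debug)]
struct Hit {
    chain_key: String,
    thought: String,
    score: f64,
}

/// Walk BranchesFrom links from the queried chain up through its ancestors,
/// search each chain, and tag every hit with its origin chain_key.
fn search_with_ancestors(
    chain: &str,
    parents: &HashMap<String, String>, // child -> parent (BranchesFrom)
    search_one: impl Fn(&str) -> Vec<(String, f64)>,
) -> Vec<Hit> {
    let mut hits = Vec::new();
    let mut current = Some(chain.to_string());
    while let Some(key) = current {
        for (thought, score) in search_one(&key) {
            hits.push(Hit { chain_key: key.clone(), thought, score });
        }
        current = parents.get(&key).cloned();
    }
    // Merge by score across every chain in the ancestry.
    hits.sort_by(|a, b| b.score.partial_cmp(&a.score).unwrap());
    hits
}

fn main() {
    let parents: HashMap<String, String> =
        [("experiment-1".to_string(), "main-chain".to_string())].into_iter().collect();
    let search_one = |key: &str| match key {
        "experiment-1" => vec![("branch thought".to_string(), 0.8)],
        "main-chain" => vec![("ancestor thought".to_string(), 0.9)],
        _ => vec![],
    };
    let hits = search_with_ancestors("experiment-1", &parents, search_one);
    println!("{:?}", hits); // the main-chain hit ranks first on score
}
```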
Query tokenization now expands irregular English verbs to their base form: "went" → "go", "gave" → "give", "ran" → "run". About 170 mappings are included. The expansion is query-time only — indexed content is not modified.
This helps when a user asks "when did Caroline go to the museum?" and the evidence contains "went to the museum yesterday."
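For that example to match, the lookup plausibly works in both directions at query time ("go" also matching indexed "went"). This sketch expands both ways under that assumption; the function name and the three mappings shown are illustrative, a tiny subset of the ~170 described above:

```rust
use std::collections::HashMap;

/// Query-time expansion: keep every token, and for tokens with an
/// irregular-verb mapping, append the paired form as an extra search term.
/// Indexed content is never touched.
fn expand_query_tokens(tokens: &[&str], pairs: &[(&str, &str)]) -> Vec<String> {
    // Build the lookup in both directions: past -> base and base -> past.
    let mut map: HashMap<&str, &str> = HashMap::new();
    for &(past, base) in pairs {
        map.insert(past, base);
        map.insert(base, past);
    }
    let mut out = Vec::new();
    for t in tokens {
        out.push(t.to_string());
        if let Some(alt) = map.get(t) {
            out.push(alt.to_string());
        }
    }
    out
}

fn main() {
    let pairs = [("went", "go"), ("gave", "give"), ("ran", "run")];
    let expanded = expand_query_tokens(&["did", "caroline", "go"], &pairs);
    println!("{:?}", expanded); // ["did", "caroline", "go", "went"]
}
```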
The per-field BM25 document-frequency cutoffs introduced in an earlier commit compared against per-field DF. Since a term's per-field DF is always ≤ its global DF, too many high-frequency terms passed through the filter, adding noise to search results.
The fix: Bm25DfCutoffs now uses global DF (postings list length) for the cutoff comparison. A term is filtered from a specific field if its global document frequency exceeds that field's cutoff ratio. This preserves the original stop-word suppression behavior while still allowing per-field tuning via the cutoff configuration.
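The corrected check reduces to a single comparison. This is a minimal sketch of the logic described above, with an assumed function name and a ratio-of-corpus interpretation of the cutoff, not the actual Bm25DfCutoffs code:

```rust
/// A term is kept for a field's scoring only if its *global* document
/// frequency (postings list length) stays within that field's cutoff
/// ratio of the corpus size. The old bug compared per-field DF here,
/// which is always <= global DF, so too many terms survived.
fn term_passes_cutoff(global_df: usize, total_docs: usize, field_cutoff_ratio: f64) -> bool {
    (global_df as f64) <= field_cutoff_ratio * (total_docs as f64)
}

fn main() {
    let total_docs = 10_000;
    // A stop-word-like term appearing in 60% of documents is filtered
    // from a field with a 0.5 cutoff ratio...
    assert!(!term_passes_cutoff(6_000, total_docs, 0.5));
    // ...while a rarer term passes and contributes to BM25 scoring.
    assert!(term_passes_cutoff(300, total_docs, 0.5));
    println!("cutoff checks passed");
}
```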
| Config | R@10 | Single | Multi |
|---|---|---|---|
| 0.8.6 (default) | 73.0% | 77.1% | 57.7% |
| 0.8.6 + RRF (k=50) | 73.0% | 76.3% | 58.2% |
| Question Type | R@5 | R@10 | R@20 |
|---|---|---|---|
| Overall | 57.6% | 62.6% | 69.2% |
| single-session-assistant | 83.9% | 85.7% | 89.3% |
| knowledge-update | 82.1% | 88.5% | 89.7% |
| single-session-user | 68.6% | 77.1% | 84.3% |
| temporal-reasoning | 67.7% | 71.4% | 76.7% |
| multi-session | 26.3% | 27.1% | 33.8% |
| single-session-preference | 13.3% | 20.0% | 40.0% |
LongMemEval is significantly harder for mentisdb because multi-session and preference questions require cross-session reasoning that retrieval alone cannot solve. Future work on LLM-based reranking (as shown by MemPalace's 100% R@5 with hybrid retrieval plus Haiku reranking) should narrow this gap.
```shell
cargo install mentisdb --force
```