# Eviction & TTL
Entries can leave the cache through four distinct paths. All of them write durable WAL tombstones, so a restarted node never re-materialises a removed entry.
| Path | Trigger |
|---|---|
| LRU eviction (capacity) | `max_entries_per_namespace` cap; flush task evicts after each insert batch |
| TTL expiry (age) | Per-entry `ttl_seconds`; reaper sweeps every `expire_scan_interval_secs` |
| Explicit deletion (UUID) | `DELETE /entry/:uuid`; cluster fan-out to all live peers |
| Semantic invalidation (radius) | `POST /admin/invalidate {embedding, threshold, ...}` |
## LRU eviction
Set `max_entries_per_namespace` to cap a namespace's size. After each insert batch, the flush task checks each namespace the batch touched; if a namespace is over the cap, its least-recently-accessed entries are evicted until it fits.
- **Tie-break order:** `(last_accessed_at, inserted_at, internal_id)` — deterministic FIFO when access timestamps are identical.
- **Per-namespace, not global.** Each scoped namespace (tenant, conversation) gets its own cap — a noisy tenant cannot evict a quiet tenant's entries.
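The eviction order can be sketched as follows. This is an illustrative model, not FerroCache's actual types: `Entry` and `evict_over_cap` are hypothetical names, and only the sort key comes from the tie-break rule above.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    internal_id: int
    inserted_at: float       # epoch seconds
    last_accessed_at: float  # epoch seconds (soft state, rebuilt after restart)

def evict_over_cap(entries: list, max_entries: int) -> list:
    """Return the entries to evict so the namespace fits under the cap.

    Least-recently-accessed first; ties broken by inserted_at, then
    internal_id — a deterministic FIFO when access timestamps match.
    """
    if len(entries) <= max_entries:
        return []
    ranked = sorted(
        entries,
        key=lambda e: (e.last_accessed_at, e.inserted_at, e.internal_id),
    )
    return ranked[: len(entries) - max_entries]
```

Because the key is a total order, two nodes holding the same entries always evict the same victims.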
### Access metadata is soft state
`last_accessed_at` and `access_count` are not WAL-durable. They live in memory only and are rebuilt from traffic after a restart. Writing them to the WAL on every hit would defeat the point of a cache (one fsync per read).
## HNSW lazy deletion (ghosts)
HNSW has no native deletion API. When you evict an entry:
- Its UUID is removed from the side-table.
- Its internal HNSW id goes into a per-namespace `evicted_ids: HashSet<usize>` (a "ghost" set).
- A tombstone is written to the WAL.
Queries oversample by `1 + min(ghosts, 8)` and filter ghost IDs out before applying the threshold.

When `ghost_ratio = ghosts / total` exceeds 0.20, the namespace is rebuilt from scratch on the next flush. The rebuild is synchronous and runs under the index write lock — see the benchmark page for the throughput trade-off.
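The oversample-then-filter step can be sketched like this. `index_search` is a stand-in for the raw HNSW k-NN call, not a real FerroCache function; only the `1 + min(ghosts, 8)` factor and the ghost filter come from the text above.

```python
def query_with_ghosts(index_search, embedding, k: int, threshold: float,
                      evicted_ids: set) -> list:
    """Oversample the HNSW search to compensate for lazily-deleted ghosts,
    then drop ghost ids before applying the similarity threshold.

    index_search(embedding, k) -> list of (internal_id, similarity) pairs.
    """
    oversample = 1 + min(len(evicted_ids), 8)     # factor from the rule above
    candidates = index_search(embedding, k * oversample)
    live = [(i, s) for i, s in candidates if i not in evicted_ids]
    return [(i, s) for i, s in live if s >= threshold][:k]
```

Without oversampling, a query whose top-k neighbours are all ghosts would return nothing even when live matches exist just past them.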
## TTL expiry
Set `ttl_seconds` on insert:
```python
client.insert(
    embedding=emb,
    response=ans,
    query_text=q,
    model_id="...",
    ttl_seconds=3600,  # 1 hour
)
```
**Inline expiry on read.** The query path checks `expires_at` and treats expired entries as misses without mutating state. The actual cleanup happens in the reaper.
**Reaper loop.** Every `expire_scan_interval_secs` (default 60s):

1. `collect_expired` — scan every namespace for entries past `expires_at`.
2. Write tombstones to the WAL (one per expired entry).
3. After the WAL flushes, `replay_entry` removes the entries from the in-memory index.
4. `rebuild_dirty_namespaces` rebuilds any namespace whose ghost ratio crossed 20%.
5. `prune_empty_namespaces` drops conversation namespaces (those with `::conv_` in the name) that are now empty. Base namespaces survive being empty.
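The collect step (1–2 above) can be sketched as a pure function. `reaper_pass` and the dict-of-dicts store are illustrative, not FerroCache internals; the point is that collection only *identifies* expired entries — index removal waits for the WAL flush, mirroring step 3.

```python
import time
from typing import Optional

def reaper_pass(namespaces: dict, now: Optional[float] = None) -> list:
    """One simplified reaper sweep: return the uuids past expires_at,
    across every namespace, that need WAL tombstones written.

    namespaces: {namespace_name: {uuid: expires_at_epoch_seconds}}
    """
    now = time.time() if now is None else now
    expired = []
    for ns, entries in namespaces.items():
        for uuid, expires_at in entries.items():
            if expires_at <= now:
                expired.append(uuid)
    return expired
```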
### Auto-TTL for conversations
When a conversation-scoped insert (`conversation_id=...`) carries no explicit `ttl_seconds`, FerroCache stamps the configured auto-TTL on it. An explicit `ttl_seconds` always wins.
Without auto-TTL, abandoned conversation namespaces would linger forever. Enable it for any deployment that uses conversation scoping.
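The precedence rule is small enough to state as code. This is a sketch of the rule, not FerroCache's implementation; `effective_ttl` and its parameters are hypothetical names:

```python
from typing import Optional

def effective_ttl(explicit_ttl: Optional[int], auto_ttl: Optional[int],
                  conversation_id: Optional[str]) -> Optional[int]:
    """TTL precedence for an insert: an explicit ttl_seconds always wins;
    otherwise conversation-scoped inserts fall back to the configured
    auto-TTL; everything else gets no TTL at all.
    """
    if explicit_ttl is not None:
        return explicit_ttl
    if conversation_id is not None:
        return auto_ttl
    return None
```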
## Explicit deletion
In cluster mode, the request fans out to every live peer. A `404` from a peer is treated as idempotent success — if the entry never existed there, deleting it is a no-op.
The reverse `uuid → namespace` map resolves the namespace server-side, so you don't need to pass `model_id` or `cache_scope`.
## Semantic invalidation (radius delete)
Delete every entry within a similarity radius of a target embedding:
```bash
curl -X POST http://localhost:3000/admin/invalidate \
  -H 'Content-Type: application/json' \
  -d '{
    "embedding": [/* ... */],
    "threshold": 0.95,
    "model_id": "all-MiniLM-L6-v2::384",
    "cache_scope": "tenant_abc"
  }'
# {"deleted_count": 7}
```
Use cases:
- A factual answer changed — invalidate everything semantically close to "what is the refund policy?".
- A document was retracted — invalidate everything semantically close to its embedding.
"Compute, not copy." In cluster mode, each replica computes its own match set against the same (embedding, threshold) pair. No UUID list ships across the wire — replicas might temporarily diverge on which entries match (e.g., due to ghost-set drift), but the next read-repair pass heals that.
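The per-replica match computation can be sketched as a cosine-similarity scan. `radius_match` and the flat dict store are illustrative (a real replica would query the HNSW index, not scan linearly); only the "each replica computes locally against (embedding, threshold)" contract comes from the text above.

```python
import math

def radius_match(store: dict, target: list, threshold: float) -> set:
    """Compute the invalidation match set locally: every uuid whose
    embedding's cosine similarity to `target` meets the threshold.
    Each replica runs this against its own store — the match set
    itself never ships over the wire.
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)

    return {uuid for uuid, emb in store.items()
            if cosine(emb, target) >= threshold}
```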
## Durability guarantee
Every removal path writes a WAL tombstone before mutating the index. On restart:
- `replay_entry` branches on `entry.tombstone`.
- Tombstones call `remove_by_uuid` instead of inserting.
- Phantom entries can never come back — even if the original insert and its tombstone are both in the WAL, replay processes them in order and ends with the correct state.
This is the M29 fix: previously the runtime tombstone path (reaper → WAL → flush → `replay_entry`) was re-materialising phantom entries because `replay_entry` had no tombstone branch. Now the startup and runtime paths share one code path.
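The ordering argument can be made concrete with a toy replay. `WalEntry` and `replay` are illustrative stand-ins for the real structures; the tombstone branch is the one described above.

```python
from dataclasses import dataclass

@dataclass
class WalEntry:
    uuid: str
    tombstone: bool

def replay(wal: list) -> set:
    """Replay the WAL in order, branching on the tombstone flag — one
    code path for both startup replay and the runtime reaper. An insert
    followed by its tombstone nets out to absence, so a phantom entry
    cannot re-materialise.
    """
    live = set()
    for entry in wal:
        if entry.tombstone:
            live.discard(entry.uuid)   # remove_by_uuid analogue
        else:
            live.add(entry.uuid)
    return live
```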
## Observability
| Metric | Meaning |
|---|---|
| `ferrocache_evictions_total` | LRU evictions |
| `ferrocache_expirations_total` | TTL expiries |
| `ferrocache_deletions_total` | Explicit DELETE fan-outs |
| `ferrocache_invalidations_total` | Radius-invalidate counts |
| `ferrocache_index_rebuilds_total` | HNSW rebuilds (ghost ratio threshold) |
All five are exported both top-level and per-namespace.