The Agent Memory Landscape, Compared

The core challenge in autonomous AI is no longer context window size; it is state management, temporal lineage, and conflict resolution. As agents execute long-horizon reasoning, they need a memory system that prevents contradictory facts from co-existing while allowing safe, isolated hypothesis testing.

The market has responded with a flood of memory architectures. Nearly all of them leave curation and poisoning risk unmanaged. We hit this wall directly while building production multi-agent systems, which is why we built FAVA Trails, an open-source agent memory layer that treats supersession, draft isolation, and human auditability as first-class concerns.

Here is an objective look at the current ecosystem and the architectural choices behind it.

1. Vector search APIs: the "write-through" problem

Examples: Mem0, Letta, Zep, CrewAI Memory.

Vector databases are the default approach for agent memory. They are frictionless to deploy and excellent for simple retrieval-augmented generation where the agent only needs to look up static facts.

The vulnerability: contextual flattening. Vector databases are inherently flat. If an agent writes a flawed hypothesis on Monday and corrects it on Tuesday, a vector search on Wednesday retrieves both because they are semantically identical. The agent is left with contradictory beliefs. Furthermore, these systems are "write-through" by default: when an LLM decides to store a memory, it becomes canonical instantly. There is no staging buffer and no review process, making the system highly susceptible to memory poisoning.

The other choice. FAVA Trails solves contextual flattening via supersession chains. Old thoughts are tombstoned, not deleted; agents see only the current truth unless they explicitly ask for the lineage. More importantly, it introduces the Trust Gate, a mandatory review layer that ensures no thought enters permanent memory without explicit LLM-critic or human approval.

2. Temporal knowledge graphs

Examples: Graphiti, Cognee.

Knowledge graphs attempt to solve the flattening problem by tracking relationships and temporal awareness. They are powerful tools for understanding how decisions map to outcomes.

The vulnerability: opaque complexity. Graph databases are black boxes to human operators. They are incredibly difficult to audit, rollback, or manually edit without specialized query languages like Cypher. When graph corruption occurs, it is exceptionally hard to untangle.

The other choice. FAVA Trails maintains a graph-like structure (DEPENDS_ON, REVISED_BY) using simple YAML frontmatter inside Markdown files. The graph is a derived projection of the versioned store. If the graph corrupts, it can be instantly rebuilt from the file history. The source of truth remains perfectly human-readable.

3. Structured task trackers & SQL memory

Examples: Beads, Dolt.

The most sophisticated challenge to standard agent memory comes from structured version control, specifically Steve Yegge's Beads framework and Dolt (a version-controlled SQL database). Beads is an exceptional tool for automating product management workflows, managing execution graphs, and ensuring agents know exactly what to do next.

The vulnerability: the database barrier. Systems like Beads and Dolt optimize heavily for the machine, storing agent thoughts and task dependencies in JSONL files or SQL databases. While efficient for task execution, this paradigm fails for institutional memory. Human engineers do not read SQL tables to understand the nuanced rationale behind a complex architectural decision. Forcing semantic project memory into database rows creates a high-friction, opaque barrier between the human team and the AI workforce.

The other choice: familiar Markdown. FAVA Trails rejects the database in favor of the developer's native language. The memory substrate is standard .md files tracked invisibly by Git-native version control.

Human-readable: you don't need a SQL client or a custom CLI, just open it in VS Code, read the plain text, and edit it like any other piece of documentation.
Seamless auditing: file-based version control means every memory change can be reviewed using standard Git diffs.
Semantic vs. task memory: Beads is a PM tool telling the agent what to execute. FAVA Trails is the institutional brain telling the agent why the system works the way it does, with a mandatory Trust Gate to prevent the AI from quietly rewriting your architecture in a SQL table.

Architectural comparison

Each approach makes different trade-offs. The right choice depends on what you're optimizing for.

	Vector search	Knowledge graphs	Task trackers	FAVA Trails
Optimizes for	Retrieval speed	Relationship mapping	Execution orchestration	Memory governance
Write model	Immediate (write-through)	Immediate (edge creation)	Staged (task-level gates)	Draft → Trust Gate → Promote
How corrections work	Overwrite or append	Edge invalidation	Supersedes links	Supersession chains with lineage
Human auditability	Requires vector tooling	Requires Cypher / SPARQL	SQL queries or custom CLI	Markdown files, standard diffs
Agent interface	SDK / REST API	SDK / REST API	CLI + SQL	MCP (config change, not code)
Best suited for	Static fact lookup, RAG	Decision-to-outcome mapping	Work items and dependencies	Learned beliefs and institutional knowledge

The verdict: memory requires governance

The defining feature of enterprise-grade software development is the pull request. We do not let humans push code directly to production without review. Yet the AI industry has spent the last year building memory systems that allow autonomous agents to instantly write unverified facts to global, shared databases.

FAVA Trails (Federated Agents Versioned Audit Trail) is the memory architecture we built on the premise that memory requires governance. By combining the transparency of Markdown, crash-proof Git-native versioning, and the mandatory review pipeline of the Trust Gate, it ensures your agents collaborate on a ground truth that is actually true. The composable context engineering protocols that run on top of it, extractive compression, playbook reranking, and MapReduce orchestration, are covered separately.

Get the memory architecture call right before you ship

Designing the governance layer for agent memory is the kind of architecture decision that compounds: get it right and a multi-agent platform stays coherent at scale; get it wrong and your agents quietly start writing to each other's brains. This is the diligence I bring: the supersession discipline, the trust boundary, and the audit surface that decide whether your system survives real users.

Book a 30-minute call

Stay Updated

Subscribe for frameworks and engagement briefs on production AI, agents, and governed autonomy.

Agentic Memory Landscape