Linked Memory for LLM Agents

7 January 2026 - Hugo O'Connor


I've been burning LLM credits like there is no tomorrow, experimenting and building with LLM agents. Each session, the agent discovers things about my codebase. Patterns. Preferences. Architectural decisions. Then the session ends, and that knowledge evaporates. The next agent starts from zero.

The problem with agent memory

Most LLM memory solutions treat the problem as a vector database query: embed everything, retrieve by similarity. This works for simple recall but misses how knowledge connects.

When I debug an authentication bug, I don't just learn "the bug was in token refresh." I learn that this service talks to that database. That the previous developer preferred a certain pattern. That similar bugs appeared in the payment module last month. Isolated facts are less useful than the relationships between them.

The other problem is trust. If Claude Code stores knowledge, should GPT-4 be able to read it? What about a local LLaMA instance? Memory systems I've seen either give full access or no access. There's no way to say "you can read my research notes but not my private journal."

xm: linked memory with capabilities

xm (Linked Memory) stores knowledge as a graph. Nodes and links, like a personal wiki that agents can query. Built on RDF triples and SPARQL, the technologies from Tim Berners-Lee's Semantic Web vision. That vision imagined software agents exchanging structured knowledge. Twenty-five years later, we have the agents.

Here's an agent storing what it learned while debugging:

# Store the bug finding
xm node create -t fact -p content="Auth bug was in token refresh" \
  -p source="claude-code" -p discovered="2026-01-07"

# Link it to the affected service
xm link create auth-bug auth-service -t "affects"

# Link it to the module where the fix went
xm link create auth-bug token-module -t "located-in"
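
Under the hood those commands are writing RDF triples, so the same relationships are also reachable through SPARQL. A minimal sketch, assuming link types map onto xm: predicates the way the SPARQL examples further down suggest:

# The "affects" link above, read back as a triple (sketch)
xm query sparql "SELECT ?bug ?service WHERE { ?bug xm:affects ?service }"
# Should return something like: auth-bug, auth-service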

Later, a different agent can discover these relationships:

# What do we know about the auth service?
xm query backlinks auth-service

# Returns: auth-bug affects auth-service
#          payment-integration uses auth-service
#          deploy-incident-042 involved auth-service

The backlinks idea comes from Org-roam. When you query what links to a node, you discover context that wasn't explicitly requested.

Here's another example. An agent building a knowledge graph about a codebase:

# Create nodes for components
xm node create -t entity -p name="user-service" -p language="python"
xm node create -t entity -p name="postgres-db" -p type="database"
xm node create -t entity -p name="redis-cache" -p type="cache"

# Create links showing dependencies
xm link create user-service postgres-db -t "reads-from"
xm link create user-service postgres-db -t "writes-to"
xm link create user-service redis-cache -t "caches-in"

# Query: what depends on postgres?
xm query sparql "SELECT ?service WHERE { ?service xm:reads-from ?db . ?db xm:name 'postgres-db' }"

And storing user preferences that persist across sessions:

# Store preferences
xm node create -t preference -p user="hugo" -p key="code-style" \
  -p value="functional, minimal comments, short functions"

xm node create -t preference -p user="hugo" -p key="testing" \
  -p value="prefers integration tests over unit tests"

# Any agent can query these later
xm query nodes --type preference --filter "user=hugo"

Capabilities instead of permissions

The security model comes from Spritely Goblins. Object capabilities instead of access control lists. A capability is a reference that carries its own permissions. If you have the reference, you can use it. If you don't, you can't even ask.

Here's Alice sharing her research with Bob:

# Alice starts her daemon
xm daemon start --listen tcp://0.0.0.0:8555

# Alice has a research graph
xm node create -t source -p title="IPCC Report 2024" -p domain="climate"
xm node create -t source -p title="Nature Paper on Coral" -p domain="climate"
xm node create -t note -p content="Coral bleaching accelerating since 2020"
xm link create "Coral bleaching note" "Nature Paper on Coral" -t "cites"

# Alice creates a read-only capability for Bob
xm cap create --permissions read,query --name "bob-access"
xm cap export bob-access --format sturdyref
# Output: ocapn://tcp-tls/alice.local:8555/s/a3f8c2...

Alice sends that sturdyref to Bob out of band. Email, chat, QR code, whatever. It's a cryptographic reference that works across the network.

# Bob connects from his machine
xm --remote "ocapn://tcp-tls/alice.local:8555/s/a3f8c2..." \
   query nodes --type source

# Bob can read Alice's sources
# source: IPCC Report 2024 (climate)
# source: Nature Paper on Coral (climate)

# Bob queries with SPARQL
xm --remote "ocapn://tcp-tls/alice.local:8555/s/a3f8c2..." \
   query sparql "SELECT ?note ?source WHERE {
     ?n a xm:note ; xm:content ?note ; xm:cites ?s .
     ?s xm:title ?source
   }"

# Bob tries to add his own note to Alice's graph
xm --remote "ocapn://..." node create -t note -p content="My thoughts"
# Error: capability does not permit 'create' operation

Bob can query but not modify. The capability itself enforces this. Goblins hybridizes the actor model with the lambda calculus, building on ideas from Jonathan Rees' A Security Kernel Based on the Lambda Calculus; security analysis reduces to tracking which references have been passed to whom. The Spritely papers explain this better than I can.

If Alice wants to give Bob write access, she can scope it to a specific named graph:

# Alice creates a named graph for collaboration
xm graph create shared-notes

# Alice creates a capability scoped to that graph
xm cap create --permissions read,query,create --graph "shared-notes" --name "bob-collab"
xm cap export bob-collab --format sturdyref

# Bob can now write to shared-notes but not to Alice's other graphs
xm --remote "ocapn://..." node create -t note -p content="My analysis" --graph shared-notes

Named graphs are the security boundary. Each capability is scoped to one or more graphs. Bob's collaboration capability lets him write to shared-notes, but he still can't touch Alice's private research graph.
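
To make the boundary concrete, here's roughly what it looks like from Bob's side. A sketch; "research" stands in for one of Alice's private graphs and isn't a name from the examples above:

# Bob points the same capability at a graph it doesn't cover
xm --remote "ocapn://..." node create -t note -p content="My analysis" --graph research
# Expected: a capability error, like the read-only case earlier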

Why a CLI?

Every LLM agent can run shell commands. Not every one can call a Python library or use a REST API without scaffolding.

xm query sparql "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10" --json

JSON output means any agent can parse the response. The agent's context window stays focused on the knowledge rather than the mechanics of retrieval.
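
In practice an agent can pipe that straight into jq. A sketch, assuming --json is also available on node queries and that each result carries its properties as plain fields (names illustrative):

# Pull just the preference values out of the JSON response
xm query nodes --type preference --filter "user=hugo" --json | jq -r '.[].value'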

I built xm in Guile Scheme with a Rust FFI for the SPARQL engine (Oxigraph). The interface is simple on purpose. Agents run commands, get JSON, move on.

What I'm still figuring out

The LoCoMo benchmark tests long-term conversational memory for LLM agents. My current results hover around 60% accuracy using gpt-4o-mini as the query generator. The variance between runs is high because the LLM's query generation isn't deterministic.

The failure modes are interesting. The agent stores "Melanie painted a sunset" but retrieves "nature-inspired artworks" when asked what she painted recently. The fact is there. The query missed it.

Here's what a failed retrieval looks like:

# Stored fact
xm node create -t fact -p subject="Melanie" -p action="painted" -p object="sunset" \
  -p date="2023-07-08"

# Agent generates this query for "What did Melanie paint recently?"
xm query sparql "SELECT ?art WHERE { ?s xm:subject 'Melanie' ; xm:type 'artwork' ; xm:name ?art }"

# Returns nothing because the stored fact uses "action" and "object", not "type" and "name"

This suggests the bottleneck isn't storage but retrieval. Specifically, how agents translate natural questions into graph queries.
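
For comparison, a query shaped the way the fact was actually stored does find it. A sketch reusing the properties from the node above, assuming they map to xm: predicates the same way the failed query did:

# A query that matches the stored shape
xm query sparql "SELECT ?obj ?date WHERE { ?s xm:subject 'Melanie' ; xm:action 'painted' ; xm:object ?obj ; xm:date ?date }"
# Should return: sunset, 2023-07-08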

I added introspection tools so the agent can discover the schema at runtime:

# What types of nodes exist?
xm schema classes
# Returns: fact (134), event (98), person (2), preference (1)

# What predicates are used?
xm schema predicates
# Returns: content, subject, date, source, ...

The idea was that the agent could learn the schema before querying. In practice, this hasn't helped much yet. The agent still generates queries with the wrong property names even after seeing the schema. More work needed.

I'm also exploring whether the agent should store knowledge differently. Self-contained statements retrieve better:

# Instead of fragmented properties
xm node create -t fact -p subject="Melanie" -p action="painted" -p object="sunset"

# Store as a complete statement
xm node create -t fact -p content="Melanie painted a sunset on 8 July 2023"

The second form is less structured but more reliably retrieved. I'm not sure which approach is better yet.
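
Part of why the free-text form is forgiving: it can be matched with a substring or regex filter rather than exact property names. A sketch using standard SPARQL 1.1 string functions:

# Find any free-text fact mentioning Melanie
xm query sparql "SELECT ?fact WHERE { ?n xm:content ?fact . FILTER(CONTAINS(LCASE(?fact), 'melanie')) }"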

On the performance side, I suspect I could improve results by another 30% with tuning. Better prompts for query generation, smarter retrieval strategies, maybe a more capable model than gpt-4o-mini. That would put it on par with much more complex approaches. The architecture isn't the bottleneck right now.

Try it

xm is a research prototype, open source on Codeberg under AGPL-3.0.

git clone https://codeberg.org/anuna/xm.git
cd xm
cargo build --release
export PATH="$PWD/bin:$PATH"
xm --help

I'm trying to answer a question: what would it look like for agents to have memory that persists, connects, and shares? Not a context window. Not a vector store. Something closer to how people accumulate and exchange knowledge. Linked. Partial. Mediated by trust.

I haven't figured it out yet.

--

This post was written with LLM assistance.
