WHY YOUR AI CODING AGENT KEEPS FORGETTING YOUR PROJECT CONVENTIONS
And how to give Claude Code, Cursor, and Aider real memory.
Five sessions into a project, Claude Code will not remember a rule I wrote on day one. Not “sometimes forgets.” Never remembers. Because it can't.
I keep a file at the root of every project called CLAUDE.md. It has the conventions, the naming rules, the “we decided not to do X because Y” notes, the gotchas. I write to it carefully, and for the first week it works. By week three, Claude Code is suggesting changes that violate three rules I wrote in month one and forgot to keep at the top of the file.
The built-in “memory” isn't broken. It's just a file the agent prepends to the system prompt. That is not the same thing as memory.
The goldfish loop
Here's what actually happens with CLAUDE.md, .cursorrules, and Aider's conventions file. On every new session, the agent reads the file and stuffs it into the system prompt. Your 40-line file costs you maybe 1,200 tokens and works fine. You add more rules. The file hits 200 lines. Then 400. Then 800.
Three things break, in order.
First, the rule you care about is now buried halfway through a wall of text. Transformers do not pay attention evenly across a long context. The rule is technically “in memory” but the model is not reading it when it matters. Anthropic's own research on needle-in-a-haystack benchmarks shows the sag.
Second, you hit hard limits. Claude Code's MEMORY.md truncates around line 200; anything past that is silently dropped. I learned this the hard way when a rule I wrote in January stopped firing in March and I thought I was losing my mind.
Third, the file is static. It learns nothing from what you did yesterday. You corrected Claude three times about the same convention. Claude has no idea. Tomorrow it will suggest the same wrong thing, and you'll correct it a fourth time.
[Chart: per-session prompt tokens, static conventions file vs. widemem retrieval.] The top five bars grow linearly with your conventions file, paid again on every session, every turn, every file Claude touches. The bottom bar stays flat because widemem only returns the rules relevant to the current query. That is the shape of the difference.
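To make the growth concrete, here's the back-of-the-envelope arithmetic. The per-rule token count and turns-per-day numbers are illustrative guesses, not measurements:

```python
# Prepend-the-whole-file vs. retrieve-top-k token cost.
# Both constants are illustrative, not measured.
TOKENS_PER_RULE = 30
TURNS_PER_DAY = 50

def prepend_cost(num_rules: int) -> int:
    """Every turn pays for every rule, relevant or not."""
    return num_rules * TOKENS_PER_RULE * TURNS_PER_DAY

def retrieval_cost(num_rules: int, top_k: int = 5) -> int:
    """Every turn pays only for the top-k retrieved rules."""
    return min(num_rules, top_k) * TOKENS_PER_RULE * TURNS_PER_DAY

for rules in (40, 200, 800):
    print(rules, prepend_cost(rules), retrieval_cost(rules))
```

At 40 rules the two approaches are close; at 800 rules the prepend cost is 160x the retrieval cost, every single day.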
Context is not memory
I wrote a whole post on this called Why Context Windows Aren't Memory. The short version: a bigger context window gives you a bigger desk, not a better filing cabinet. Real memory survives a cold start, ranks what matters, and gets smarter over time. A file you prepend does none of those things, no matter how big the window gets.
What real memory for a coding agent looks like
When I think about what real memory for a coding agent should do, the list is short.
- Persistent. Survives restarts, session resets, and /clear.
- Searchable. You don't re-read the whole thing every turn. You pull the relevant facts on demand.
- Decaying. “Debugging the auth module this week” is useful this week. Next month it's clutter.
- Per-project. The rules for one repo should not leak into another.
- Quality-gated. Not every “ok thanks” needs to be stored forever.
- Honest. When it doesn't have the answer, it says so instead of making one up.
None of that is exotic. Humans do it without thinking. CLAUDE.md does none of it.
The setup
I built widemem partly because I got tired of losing information between sessions. The Claude Code skill is the first-person version of it.
```
# 1. Install widemem with MCP + local embeddings
pip install widemem-ai[mcp,sentence-transformers]

# 2. Install the Claude Code skill
/plugin install widemem
```

That's it for the happy path. The defaults use Ollama for the LLM and sentence-transformers for embeddings. No API keys, no accounts, no cloud. The skill wires a local MCP server into Claude Code and exposes seven tools: add, search, pin, delete, count, export, and health. You don't call them directly. You talk to Claude normally, and the skill triggers on phrases like “remember this,” “do you remember,” and “what do you know about.”
When you want to script against the store from Python, the SDK is the same handful of lines it's always been.

```python
from widemem import WideMemory

memory = WideMemory()

# Store a rule from your coding session
memory.add(
    "In this repo, integration tests must hit a real Postgres. "
    "Never mock the database. We got burned last quarter when "
    "mocked tests passed but a broken migration shipped to prod.",
    user_id="widemem-ai",
    metadata={"type": "convention", "project": "widemem-ai"},
)

# Later, from a fresh session
results = memory.search(
    "should I mock the database in tests?",
    user_id="widemem-ai",
)
for r in results:
    print(f"[{r.confidence}] {r.memory.content}")
```

Every search result carries a confidence level, so Claude knows whether it has the answer or is guessing. When confidence is NONE, the skill tells you instead of making an answer up.
Three things it remembered for me this week
Theory is cheap. Here's what widemem actually held across sessions.
1. “Don't mock the database.”
I have an integration-test rule I wrote once, in January, in a conversation that ended with me telling Claude “save this: integration tests must hit the real DB, never mock. Reason: a mocked test passed in CI last quarter and a broken migration shipped to prod.” widemem stored it at importance 9, pinned. The quality gate classifier caught the stakes from the word “never” and the specific past incident.
Three weeks later, new session, I'm adding a test for a new endpoint. Claude starts writing a mock. I ask, “wait, should we be mocking this?” Claude searches widemem, returns the rule with HIGH confidence, and rewrites the test against a real database. The rule was not in CLAUDE.md. It was in memory.
2. “HubSpot token is read-only.”
On one of my projects, the HubSpot API token I use for research has read scopes only. Any POST, PATCH, or DELETE will 401. I've told Claude this across multiple sessions. It used to keep trying.
I pinned it: `/mem pin HubSpot token is READ-ONLY. Never POST, PATCH, PUT, or DELETE on HubSpot APIs.` widemem stores it at importance 9 as a safety constraint. Now every time Claude is about to hit HubSpot, it checks memory first. The number of times I've watched Claude start to draft a POST, then stop and say “actually, memory says this token is read-only” is now more than zero and climbing.
3. “That underscore rename is wrong in this repo.”
Smaller, and maybe the most satisfying. I was refactoring a Python module and Claude kept suggesting I rename `_unused_var` to `unused_var` “for consistency.” I kept saying no, the underscore prefix is how this repo marks intentionally unused locals, don't rename it. I corrected it three times in one session, then hit /clear.
New session, same suggestion. I said “check memory.” Claude searched, found “underscore prefix on unused vars is intentional in this repo, do not rename,” and dropped the suggestion. It didn't come up again in that session or the next. That is the thing CLAUDE.md could not do. CLAUDE.md doesn't know I corrected Claude. widemem does, because the skill's add tool fires the moment the correction lands.
What about Cursor and Aider?
The skill ships for Claude Code. For Cursor and Aider, you don't get the plugin experience, but you do get the underlying library and an MCP server.
Cursor has MCP support. Point it at widemem's MCP server the same way Claude Code does, and you get the same seven tools. The UX around triggering is rougher because Cursor doesn't have first-class skills, but the storage layer is identical.
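Concretely, Cursor reads MCP servers from .cursor/mcp.json. Something like the entry below should work, assuming widemem ships a console entry point for its server; the widemem-mcp command name and the flag are my guesses, check the project README for the real invocation:

```json
{
  "mcpServers": {
    "widemem": {
      "command": "widemem-mcp",
      "args": ["--user-id", "widemem-ai"]
    }
  }
}
```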
Aider is trickier. Aider's “memory” is conventions files passed with --read. What works is a simple pattern: at the start of a session, run a Python script that calls memory.search("current project conventions"), writes the top N results to a temp file, and passes that file to aider with --read. It's crude, but you get the same semantic retrieval benefit without waiting for Aider to grow native MCP support. I'll publish the helper script once I'm sure it's stable.
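Until that helper ships, the pattern is easy to sketch yourself. The script below is my sketch, not the author's: the widemem call is stubbed out in a docstring so the file runs standalone, and the query and user_id mirror the SDK example earlier:

```python
"""Write the top widemem hits to a file Aider can consume via --read.

Run it before a session:
    python widemem_to_aider.py
    aider --read .aider-conventions.md
"""

def fetch_rules() -> list[str]:
    """Pull the top conventions from widemem.

    Swap this stub for the real call:
        from widemem import WideMemory
        results = WideMemory().search("current project conventions",
                                      user_id="widemem-ai")
        return [r.memory.content for r in results[:10]]
    """
    return ["Integration tests must hit a real Postgres. Never mock the DB."]

def format_conventions(contents: list[str]) -> str:
    """Render retrieved rules as a markdown conventions file."""
    lines = ["# Project conventions (retrieved from widemem)", ""]
    lines += [f"- {c}" for c in contents]
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    with open(".aider-conventions.md", "w") as f:
        f.write(format_conventions(fetch_rules()))
```

The same script works as a pre-session hook for any agent that can read a conventions file from disk.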
What's still rough
I don't want to oversell this. Two things to know before you install.
The quality gates are opinionated. The skill has a SKIP tier for things like “ok,” “thanks,” and “run the tests,” and sometimes it skips things I actually wanted stored. If a fact matters, be explicit: use /mem pin or say “remember this, it matters.” The classifier is better at obvious stakes than subtle ones.
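To make the tiering concrete, here's a toy caricature of a quality gate with an explicit SKIP tier. The real classifier is presumably model-based; this keyword version only illustrates the shape:

```python
# Toy quality gate: SKIP chatter, PIN obvious stakes, STORE the rest.
SKIP_PHRASES = {"ok", "thanks", "thank you", "run the tests", "sounds good"}
STAKES_WORDS = {"never", "always", "must", "remember this"}

def gate(message: str) -> str:
    text = message.strip().lower()
    if text in SKIP_PHRASES:
        return "SKIP"                    # not worth storing forever
    if any(w in text for w in STAKES_WORDS):
        return "PIN"                     # obvious stakes: store and pin
    return "STORE"                       # ordinary fact: store, let it decay

print(gate("thanks"))                        # SKIP
print(gate("never mock the database"))       # PIN
print(gate("we use pytest for unit tests"))  # STORE
```

The failure mode from the paragraph above is visible even here: a fact with subtle stakes and no trigger word lands in STORE, not PIN, which is exactly why "remember this, it matters" is the reliable escape hatch.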
Cold-start search is not instant. The first search after a fresh session takes roughly one second while the embedding model warms up. Subsequent searches are sub-100ms. If that bothers you, keep the MCP server running in the background between sessions and it stays hot.
widemem does not replace CLAUDE.md. It replaces the part of CLAUDE.md that was trying to be memory and failing. The top five or ten rules you absolutely need the agent to see on every message still belong in CLAUDE.md. Everything else, every convention, every correction, every “we tried this and it broke,” goes in widemem.
Try it
```
pip install widemem-ai[mcp,sentence-transformers]
/plugin install widemem
```

The SDK is at github.com/remete618/widemem-ai. The Claude Code skill is at github.com/remete618/widemem-skill. Both are Apache 2.0.
Five sessions into a project, Claude Code used to forget a rule I wrote on day one. Now it doesn't.