199 Lessons the Brain Couldn't Remember

Our AI memory system had 199 lessons stored in its database. Its primary recall function couldn't surface a single one. The root cause was a missing string in a Python array.

Last Tuesday I watched our AI memory system recommend the same bad approach twice in one session. Sync persist. We'd tried it three months ago. It was slow: synchronous, CPU-bound embedding that locked the event loop. The brain had recorded the lesson: "Sync persist is too slow. Use async." It was right there in the database.

The system recommended it anyway.

The setup

We run a memory system called Aianna. It's the knowledge layer for our agent ecosystem, FORGE. Every time an agent learns something the hard way, that lesson gets embedded and stored in a vector database. The idea is simple: before any agent acts, it checks the brain. What do we already know about this? What went wrong last time?

It's a good idea. We had 199 lessons stored. Patterns about deployment failures, schema mismatches, API quirks, infrastructure traps. Real operational knowledge earned through real mistakes.

The primary recall function, query_memory, is the front door. Agents call it before every significant decision. It searches the vector store, optionally boosts results using a knowledge graph, and returns the most relevant context.

199 lessons. Zero recall. Every query came back empty on lessons.

The bug

Here's the line. store.py, line 28:

COLLECTIONS = ["conversations", "decisions", "crystals", "meetings"]

That's the list of vector collections the search function iterates over. When query_memory calls store.search() without specifying a collection, it searches those four. Conversations, decisions, crystals, meetings.

Not lessons.

The lessons collection existed. It had data. It was healthy. The system could write to it just fine. There was even a dedicated query_lessons function that searched it directly. But the primary recall path, the one every agent hits before every decision, never looked there.

Seven characters. The string "lessons" was never added to the array.
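The shape of the failure, in sketch form. Only the COLLECTIONS list and the collection names come from the real store.py; the store internals and search function here are hypothetical stand-ins:

```python
# Hypothetical reconstruction of the recall path.
COLLECTIONS = ["conversations", "decisions", "crystals", "meetings"]  # "lessons" missing

# In-memory stand-in for the vector store: collection name -> records.
STORE = {
    "conversations": [],
    "decisions": [],
    "crystals": [],
    "meetings": [],
    "lessons": [{"problem": "sync persist", "correct_approach": "use async"}],
}

def search(query, collections=None):
    """Search the given collections; default to COLLECTIONS."""
    results = []
    for name in collections or COLLECTIONS:
        results.extend(STORE[name])  # real code would rank by vector similarity
    return results

# The lessons collection is healthy, populated, and directly searchable.
# The default path just never visits it.
print(search("sync persist"))                # []
print(search("sync persist", ["lessons"]))   # the lesson surfaces
```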

There was a second bug stacked on top. The graph-boosted search path, which re-ranks results using knowledge graph connections, couldn't boost lesson results because they were never in the initial vector results to begin with. No candidates in, no boost out. The graph layer was working correctly. It just had nothing to work with.
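The second bug follows mechanically from the first: a re-ranker can only reorder candidates it is handed. A toy sketch (the boost logic here is hypothetical, not Aianna's actual graph code):

```python
def graph_boost(candidates, graph_scores):
    """Re-rank vector results by adding a knowledge-graph score.

    If the initial candidate list is empty, this is a no-op:
    no candidates in, no boost out.
    """
    return sorted(
        candidates,
        key=lambda r: r["similarity"] + graph_scores.get(r["id"], 0.0),
        reverse=True,
    )

# The graph layer worked correctly -- it just never saw lesson candidates.
print(graph_boost([], {"lesson-42": 0.9}))  # []
```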

Why this matters

The conversation around AI failure modes is dominated by the exotic stuff. Hallucination. Model collapse. Reward hacking. Alignment drift. Those are real concerns. But when I look at what actually breaks in production AI systems, it's almost never that.

It's a missing string in an array. It's a schema mismatch where lessons store problem and correct_approach but the search function expects a content field. It's a config file that didn't get updated when a new collection was added six weeks ago.

These are the same bugs we've been writing since the first database connection string got hardcoded. Off-by-one errors. Incomplete migrations. Code paths that diverge because two functions were written three weeks apart and nobody connected them.

The AI part of Aianna worked. Embeddings were clean. Vector similarity was accurate. Graph relationships were correct. The retrieval pipeline was sound. What failed was a Python list that should have had five items and had four.

The fix and the lesson

The fix itself is trivial. Add the string. Handle the schema difference so lessons surface alongside conversations and decisions. Test that query_memory returns results from all five collections.
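In sketch form, assuming the field names problem and correct_approach from the lesson schema described earlier (the normalization helper itself is hypothetical):

```python
# After the fix: five collections, not four.
COLLECTIONS = ["conversations", "decisions", "crystals", "meetings", "lessons"]

def normalize(record, collection):
    """Map each collection's payload to the common shape query_memory expects.

    Lessons store `problem` / `correct_approach` rather than `content`,
    so they need translating before they can surface alongside the rest.
    """
    if collection == "lessons":
        return {"content": f"{record['problem']} -> {record['correct_approach']}"}
    return {"content": record["content"]}
```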

The lesson is less trivial.

When you build AI systems, most of your debugging time will not be spent on AI problems. It will be spent on plumbing. Connection pools. Collection names. Payload schemas. The unglamorous connective tissue between the smart parts.

Our brain had the knowledge. It had the lesson about sync persist being too slow. It had 198 other lessons, each one earned through some failure or discovery. The intelligence was there. The recall mechanism had a hole in it.

If you're building AI memory, or evaluating someone else's, ask the boring questions first. Does every collection get searched? Do the schemas match? Is there a test that writes a record and reads it back through the primary path? You'll find more real bugs in the plumbing than you will in the model.
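That last question deserves to be an actual test. A round-trip sketch with a fake in-memory store (all names here are hypothetical stand-ins, not Aianna's API):

```python
# Write one marker record per collection, then read each back through
# the primary recall path -- the test the original bug would have failed.
ALL_COLLECTIONS = ["conversations", "decisions", "crystals", "meetings", "lessons"]
store = {name: [] for name in ALL_COLLECTIONS}

def write(collection, record):
    store[collection].append(record)

def query_memory(query):
    """Primary recall path: must iterate every collection, not a hardcoded subset."""
    hits = []
    for name in ALL_COLLECTIONS:
        hits.extend((name, r) for r in store[name] if query in r["content"])
    return hits

for name in ALL_COLLECTIONS:
    write(name, {"content": f"marker-{name}"})

for name in ALL_COLLECTIONS:
    assert query_memory(f"marker-{name}"), f"{name} invisible to primary recall"
```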

199 lessons, stored correctly, indexed correctly, invisible to the system that needed them. Not because the AI failed. Because a developer, building fast, forgot to update an array.

The oldest bug in software, wearing a new hat.