YOUR AI FORGOT SOMEONE'S MEDICATION. NOW WHAT?

I was running a test suite against widemem when I noticed something that made me uncomfortable. A user had mentioned their penicillin allergy in an early conversation. Sixty days later, after dozens of other interactions, the allergy fact had decayed below the retrieval threshold. The system would no longer surface it when asked "what should I know about this user?"

The math was correct. The exponential decay function did exactly what it was supposed to do. But the outcome was wrong in a way that math cannot fix. Forgetting someone's lunch preference is fine. Forgetting their drug allergy is not fine. That is not a "minor regression." That is a phone call from a lawyer.

That test failure led me to build YMYL handling into widemem. This post is about what YMYL means, why it matters more for AI memory than for search engines, and the edge cases that still keep me up at night.


WHAT IS YMYL AND WHY SHOULD YOU CARE

YMYL stands for "Your Money or Your Life." Google coined the term in their Search Quality Evaluator Guidelines to describe content that could directly affect someone's health, financial stability, safety, or legal standing. The idea is simple: not all information carries the same risk. Getting a recipe wrong is annoying. Getting a drug dosage wrong can kill someone.

Google uses YMYL to hold health, financial, and legal content to higher accuracy standards in search rankings. The E-E-A-T framework (Experience, Expertise, Authoritativeness, Trustworthiness) applies extra scrutiny to YMYL pages. A blog post about hiking trails gets evaluated differently than a blog post about managing diabetes.

For search engines, YMYL is a ranking signal. For AI memory systems, it is something more personal. A search engine shows you ten results and you pick one. An AI agent that forgot your medication allergy does not give you that choice. It just acts on incomplete information. Silently. With full confidence. Because that is what LLMs do best.


THE PROBLEM WITH TREATING ALL FACTS EQUALLY

Most memory systems store facts with a timestamp and a vector embedding. Retrieval combines similarity and recency. This means a fact from six months ago scores lower than a fact from last week, regardless of what that fact is about.

For most facts, this is fine. Nobody needs to remember what someone had for lunch six months ago. But some facts do not lose relevance with time: drug allergies, chronic conditions, ongoing medications, legal obligations, financial commitments, safety constraints.

These facts are not just important. They are the kind of facts where forgetting them has real-world consequences. An AI health assistant that forgets a drug allergy is not just unhelpful. It is dangerous.


HOW WIDEMEM HANDLES THIS

The solution has two parts: classification and protection. Classification figures out whether a fact is YMYL. Protection ensures YMYL facts do not decay or get quietly overwritten.
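As a rough sketch of how protection can interact with a similarity-plus-recency retrieval score (the function name, signature, and half-life are illustrative, not widemem's actual API):

```python
import math

def retrieval_score(similarity: float, age_days: float,
                    decay_immune: bool, half_life_days: float = 30.0) -> float:
    """Combine embedding similarity with exponential time decay.

    A decay-immune (strong YMYL) fact keeps its full similarity score
    no matter how old it is; everything else halves every half_life_days.
    """
    if decay_immune:
        return similarity
    decay = math.exp(-math.log(2) * age_days / half_life_days)
    return similarity * decay

# A 60-day-old allergy fact stays fully retrievable...
print(retrieval_score(0.9, 60, decay_immune=True))    # 0.9
# ...while an ordinary 60-day-old fact has decayed to a quarter strength.
print(retrieval_score(0.9, 60, decay_immune=False))   # ~0.225
```

With a 30-day half-life, a fact mentioned once is at a quarter of its original score after 60 days, which is exactly the failure mode from the penicillin test: immunity short-circuits the decay entirely rather than tuning the curve.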

Two-tier classification

Not every mention of a medical term means someone is sharing critical health information. "Watching Doctor Who" is not a medical emergency. "Walked past the bank" is not a financial disclosure. The classifier needs to tell the difference.

widemem uses a two-tier confidence system. Strong matches are multi-word phrases that are unambiguous: "blood pressure," "bank account," "drug interaction," "child custody." If these phrases appear in a fact, it is almost certainly YMYL content. Weak matches are single words that could go either way: "doctor," "bank," "prescription."

    Incoming fact (from extraction)
      -> Strong match? ("blood pressure", "bank account", "drug interaction")
           yes: importance >= 8, DECAY IMMUNE
      -> else Weak match? ("doctor", "bank", "prescription")
           yes: importance = 6
      -> else Normal path: standard decay

Every fact passes through the YMYL classifier before storage. No LLM call needed. Pure regex.

The two tiers exist because overreacting to ambiguous terms would flood the system with false positives, making the protection meaningless. A single mention of "bank" gets a gentle importance nudge to 6. Two or more weak keywords in the same fact get promoted to strong confidence. "Went to the bank" stays at 6. "Opened a savings account at the bank" goes to 8.
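A minimal sketch of that two-tier logic, with a handful of placeholder patterns standing in for widemem's real per-category library (the function name and importance defaults are illustrative):

```python
import re

# Hypothetical pattern sets -- the real library is larger and
# organized per category (health, financial, legal, ...).
STRONG_PATTERNS = [r"\bblood pressure\b", r"\bbank account\b",
                   r"\bdrug interaction\b", r"\bchild custody\b"]
WEAK_PATTERNS = [r"\bdoctor\b", r"\bbank\b", r"\bprescription\b",
                 r"\bsavings account\b"]

def classify_ymyl(fact: str, base_importance: int = 4):
    """Return (tier, importance) for an incoming fact.

    'strong' -> decay-immune, importance raised to at least 8
    'weak'   -> importance nudged to 6, normal decay
    'none'   -> untouched
    """
    text = fact.lower()
    if any(re.search(p, text) for p in STRONG_PATTERNS):
        return "strong", max(base_importance, 8)
    weak_hits = sum(1 for p in WEAK_PATTERNS if re.search(p, text))
    if weak_hits >= 2:  # two weak keywords promote to strong
        return "strong", max(base_importance, 8)
    if weak_hits == 1:
        return "weak", max(base_importance, 6)
    return "none", base_importance

print(classify_ymyl("Went to the bank"))                      # ('weak', 6)
print(classify_ymyl("Opened a savings account at the bank"))  # ('strong', 8)
```

The promotion rule falls out naturally: "savings account" and "bank" are each ambiguous alone, but two weak hits in one fact is strong evidence of real financial content.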

What strong YMYL protection means

When a fact is classified as strong YMYL, three things happen: its importance is raised to at least 8, it becomes immune to time-based decay, and any later change to it goes through the history system rather than happening silently.

The categories

    Category     Strong   Weak
    Health          7      12
    Medical         6      13
    Financial      12      16
    Legal           7      13
    Safety          7       3
    Pharma          5       5

    strong = high confidence, decay-immune
    weak   = ambiguous, normal decay

Number of regex patterns per YMYL category. Strong patterns trigger full protection. Weak patterns get a gentle nudge.

Six categories, each with strong and weak regex patterns. The pattern counts are not huge. Health has about 7 strong patterns and 12 weak ones. Financial has the most at 12 strong and 16 weak. The total footprint is small enough that classification adds zero measurable latency. No LLM call, no embedding lookup. Just regex.


THE EDGE CASES THAT WORRY ME

Indirect references

The regex approach catches "diabetes diagnosis" and "insulin dosage" reliably. It does not catch "my sugar levels have been all over the place" or "that pill my doctor switched me to is making me dizzy." These are clearly health-relevant statements. The system treats them as normal facts with normal decay.

A semantic classifier would catch these. But semantic classification means running each incoming fact through an LLM or a fine-tuned model, which adds latency and cost to every single write operation. The regex classifier runs in microseconds. An LLM call takes hundreds of milliseconds at minimum.

One possible middle ground: use regex as a fast first pass, and only call the LLM when the regex is uncertain (one weak hit, no strong hits). This would catch the obvious cases at zero cost and escalate the ambiguous ones. I have not built this yet, but the architecture supports it.
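That escalation path is easy to sketch. Here `regex_classify` and `llm_classify` are stand-ins for whatever the real implementations would be; only the routing logic is the point:

```python
def classify_hybrid(fact, regex_classify, llm_classify):
    """Regex first; escalate to the LLM only on an ambiguous result.

    regex_classify(fact) -> (tier, importance), tier in {'strong', 'weak', 'none'}
    llm_classify(fact)   -> (tier, importance)  # slow and costly, so rare
    """
    tier, importance = regex_classify(fact)
    if tier == "weak":               # one weak hit, no strong hits
        return llm_classify(fact)    # hundreds of ms, but only when needed
    return tier, importance
```

Clear strong matches and clear non-matches never pay the LLM cost; only the genuinely ambiguous middle band does, which is what keeps the average write latency close to the pure-regex path.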

Cultural and language differences

The current patterns are English-centric and US-biased. "401k" and "W-2" are strong financial signals in the US but meaningless elsewhere. Medical terminology varies across countries. "Chemist" means pharmacist in the UK and a science professional everywhere else. "GP" is a doctor in the UK and a Grand Prix in most other contexts.

For widemem to work globally, the YMYL patterns need to be locale-aware. This is straightforward to implement (load patterns based on a locale config) but the pattern libraries for non-English languages do not exist yet. Medical terminology databases exist (SNOMED CT, ICD-10) but converting them to conversational regex patterns is a project in itself.
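A locale-aware loader could be as simple as a fallback chain over per-locale pattern files. The file layout and function below are hypothetical, not widemem's actual on-disk format:

```python
import json
from pathlib import Path

def load_ymyl_patterns(locale: str, pattern_dir: Path) -> dict:
    """Load YMYL patterns for a locale, falling back to the bare
    language code, then to English.

    Expects files like en.json or de.json shaped as
    {"category": {"strong": [...], "weak": [...]}}.
    """
    for candidate in (locale, locale.split("_")[0], "en"):
        path = pattern_dir / f"{candidate}.json"
        if path.exists():
            return json.loads(path.read_text(encoding="utf-8"))
    raise FileNotFoundError(f"no YMYL pattern set for locale {locale!r}")
```

A request for "en_GB" would pick up en_GB.json if someone has contributed UK-specific patterns ("chemist", "GP", "NHS number") and quietly fall back to the US-biased defaults if not.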

The false positive problem

Making the YMYL filter too aggressive creates its own issues. If every mention of "doctor," "money," or "court" triggers full YMYL protection, you end up with a memory system where half the facts are decay-immune. The whole point of decay is to let irrelevant facts fade. If nothing fades, you are back to the original problem of an ever-growing memory store where retrieval quality degrades over time.

The two-tier system is my current answer to this. Weak matches get a nudge, not a lockdown. Strong matches get the full treatment. The asymmetry is intentional: I would rather have a false positive (a non-medical mention of "doctor" gets importance 6 instead of 4) than a false negative (an actual medication allergy decays out of the retrieval window). I will take that deal every time. Sorry, Doctor Who fans.

When YMYL facts contradict each other

"I am allergic to penicillin" followed months later by "my doctor says I am not actually allergic to penicillin, the test was a false positive." Both are YMYL. Both trigger strong protection. The conflict resolver needs to handle this correctly, and the stakes are high in both directions: keeping the outdated allergy is overly cautious, removing it based on a misunderstood conversation could be dangerous.

widemem's conflict resolver sends both facts to the LLM with full context. The LLM decides whether to UPDATE or keep both. But I am not fully comfortable relying on LLM judgment for medical fact resolution. This feels like a case where the system should surface the conflict to the application layer and let a human decide, rather than resolving it automatically.
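One way to sketch that surfacing behavior. This is the proposed design, not widemem's current resolver, and the names are illustrative:

```python
from dataclasses import dataclass

@dataclass
class MemoryConflict:
    """A contradiction handed to the application layer for a human call."""
    old_fact: str
    new_fact: str
    suggested_action: str  # the LLM's suggestion, advisory only

def resolve(old_fact, new_fact, is_strong_ymyl, llm_suggest):
    """Auto-resolve ordinary contradictions; surface YMYL ones.

    llm_suggest(old, new) -> 'UPDATE' | 'KEEP_BOTH'
    """
    suggestion = llm_suggest(old_fact, new_fact)
    if is_strong_ymyl(old_fact) or is_strong_ymyl(new_fact):
        # High stakes: hand the decision up instead of acting on it.
        return MemoryConflict(old_fact, new_fact, suggestion)
    return suggestion  # safe to act on automatically
```

The LLM still runs, but for strong YMYL facts its answer is demoted to a suggestion attached to the surfaced conflict; the memory layer never deletes the allergy on its own.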


WHAT THE REGULATORS ARE SAYING

This is not just a technical problem. The EU AI Act classifies AI systems used in healthcare, financial services, and legal contexts as high-risk. High-risk systems face mandatory requirements around transparency, human oversight, accuracy, and reliability. An AI agent that forgets critical health information because of a decay function would have a hard time meeting those requirements.

In the US, the FDA has been issuing guidance on AI in healthcare since 2021. Their focus has been on diagnostic AI and clinical decision support, but the principles apply to any system that handles patient information. If your AI agent is used in a telehealth context and it forgets that a patient is on blood thinners, the regulatory exposure is real.

The liability question is still open. If a memory system loses critical health information and a downstream AI agent makes a harmful recommendation, who is responsible? The memory system developer? The agent developer? The platform? The user who entered the information six months ago and assumed the system would remember? There is no clear legal precedent yet, but the question is coming. And lawyers tend to ask these questions loudly.


WHERE I THINK THIS IS GOING

YMYL handling in AI memory is in its early stages. Regex-based classification is a starting point, not an endpoint. Here is what I think needs to happen:

Hybrid classification. Fast regex as a first pass, with LLM escalation for ambiguous cases. This gets you the speed of pattern matching for obvious cases and the nuance of language understanding for edge cases, at a manageable cost.

Application-level conflict surfacing. When two YMYL facts contradict each other, the memory system should not auto-resolve. It should flag the conflict and let the application (or a human) decide. The memory layer should be opinionated about what it protects, but cautious about what it overwrites.

Locale-aware pattern libraries. The community could build and maintain YMYL pattern sets for different languages and legal systems. Medical terminology databases like SNOMED CT could be a starting point for health-related patterns.

Audit requirements. Every YMYL fact modification should be logged with full context: what changed, why, what the old value was, and what triggered the change. widemem already does this through its history system, but it should be a baseline requirement for any memory system that handles safety-critical information.
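An audit record for a YMYL modification might carry fields like these. The field names are illustrative, not widemem's history schema:

```python
import time

def audit_entry(fact_id, old_value, new_value, reason, trigger):
    """Build one append-only audit record for a YMYL fact modification:
    what changed, why, the old value, and what triggered the change."""
    return {
        "fact_id": fact_id,
        "timestamp": time.time(),
        "old_value": old_value,
        "new_value": new_value,
        "reason": reason,    # e.g. "conflict resolution: UPDATE"
        "trigger": trigger,  # e.g. the conversation turn that caused it
    }
```

The key property is that the record is written even when the change looks routine; in the penicillin-retest scenario, the old allergy value survives in the log after the fact itself is updated.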

The tools are getting better. Mem0 and others are building increasingly sophisticated memory systems. But the YMYL problem is not just about better technology. It is about acknowledging that some information carries more weight than other information, and building systems that act accordingly.

Your AI does not need to remember everything. But it absolutely needs to remember the things that matter.


widemem is open source (Apache 2.0) on GitHub and PyPI. YMYL handling is built in and configurable per category.