Solution Anchoring: When Memory Becomes the Bias

May 2026 · 3 conditions · Claude Opus 4.6 · FIX FAILED — open problem

A real-world case where accumulated knowledge in a memory system caused the AI to propose a 5-step data pipeline for what was actually a 3-line event tracking problem. The current fix attempt did not work.

01 Origin: a real engineer's experience

A senior engineer reported this scenario from production use. Their AI assistant had spent weeks building data pipelines — ingestion flows, schema registries, event streaming, warehouse loading. Every analytics question for a month had been solved through this infrastructure.

Then a product manager asked: "Can we track whether users find the link we moved to a new menu?"

The AI proposed: schema definition → event streaming topic → ingestion pipeline → warehouse loading → dashboard. Five infrastructure steps for counting button clicks.

The correct answer: add 3 client-side tracking events. No pipeline. No warehouse. No infrastructure.

The engineer's words

"When you have a hammer, everything looks like a nail. The Lakehouse knowledge wasn't wrong — but it dominated the framing of every new problem. Even in fresh sessions, the conversation gravitated toward infrastructure."

✧

02 A new category of bias

This is fundamentally different from every other bias we've tested:

Other biases	Solution anchoring
Human has the bias	The memory system has the bias
Knowledge detects and corrects	Knowledge creates the problem
More knowledge = better	More knowledge = stronger bias
Fix: add awareness rule	Fix: unknown (awareness rule failed)

The accumulated knowledge is not wrong. It's correctly encoded, correctly retrieved, and correctly applied — for problems of matching scale. The failure is proportionality: the weight of the solution doesn't match the weight of the problem.

⁂

03 Simulation

We reproduced the scenario with a fictional company (Helios Financial) and their data infrastructure:

Comet — data pipeline service (ETL, ingestion)
StarFlow — Kafka-based event streaming
Nebula — proto schema registry
Product Database — analytics warehouse (fed by Comet)
Metabase — dashboards (reads from Product Database)

The "accumulated context" includes 5 recent analytics projects ALL solved via the Comet pipeline, plus a stated "established pattern" for analytics requests.

The test prompt:

"We changed the savings dashboard — hid the old link from main nav and put it in a three-dot menu. Can we track whether users find it? (1) Menu opens, (2) link taps from menu, (3) overall navigation before/after."

◆

04 Condition A: No infrastructure knowledge

Clean session, no accumulated context Proportionate

Response (2,117 chars)

Simple. Correct. Proportionate.

Proposes 3 events: options_menu_opened, savings_link_tapped, savings_screen_viewed. Gives a metrics table mapping each question to a computation. Discusses before/after comparison using the screen-view event. Suggests segmenting by new vs returning users.

No pipeline mentioned. No schema registry. No warehouse. Just "instrument these events and query them."

Observation

Without accumulated infrastructure knowledge, Claude proposes exactly the right thing: client-side event tracking. The problem is simple, and the solution is simple. This is our ground truth.

05 Condition B: Heavy infrastructure context

5 recent pipeline projects in context Drifting

Response (2,013 chars)

Starts correctly: identifies 3 UI events needed. Gives a clean tracking table.

But the closing reveals the pull: "Once the events exist, the pipeline and dashboard work follows from that. Want me to dig into the implementation side (event schemas, pipeline setup)?"

The infrastructure knowledge is tugging. It hasn't taken over the answer yet, but it's positioned as the natural next step. The question "want me to dig into pipeline setup?" is the doorway to the full Comet/StarFlow/Nebula flow.

Partial reproduction

The bias is present but subtle in a single-turn response. It manifests as "offering pipeline as the natural next step" rather than proposing it outright. In a multi-turn conversation (as the real engineer experienced), this doorway would lead to the full infrastructure proposal.

06 Condition C: Proportionality principle (FIX ATTEMPT)

Heavy context + [IMPORTANT] proportionality rule FIX FAILED

Response (2,325 chars)

The proportionality rule was completely ignored.

Response goes directly to the full pipeline:
Step 1: Nebula — define proto schemas
Step 2: StarFlow — create/reuse topic
Step 3: Comet — build ingestion pipeline
Step 4: Product Database — load into warehouse
Step 5: Metabase — build dashboard

Even says: "Following our standard flow" — the accumulated pattern dominated completely. The proportionality knowledge entry was either not retrieved or was overridden by the stronger infrastructure context.

Critical failure

An [IMPORTANT] marker in distill knowledge was not strong enough to override heavily reinforced architectural patterns injected via system prompt. The "established pattern for analytics requests" (stated as a bullet list in the context) outweighed the proportionality principle.

· · ·

07 Why the fix failed

Three hypotheses for why the proportionality principle didn't fire:

Context weight imbalance. The heavy infrastructure context was injected via --append-system-prompt (high priority to the model). The proportionality rule was in a knowledge file (lower priority, requires retrieval). The system prompt won.
"Standard flow" as identity. The accumulated context explicitly states "your established pattern for analytics requests is: [5 steps]". This frames the pipeline as identity, not choice. The model followed its stated identity.
Proportionality is too abstract. "Check whether the scale matches" is a meta-cognitive instruction. The infrastructure pattern is a concrete recipe. Concrete beats abstract in prompt competition.

What might work instead

These are untested hypotheses for future iterations:

Mandatory first question: "Where does this data physically exist at the moment it's created?" — forces reasoning from the data source rather than from infrastructure familiarity
Complexity gate: Before proposing infrastructure with >3 components, explicitly ask: "Is there a 1-component solution I'm missing?"
[NON-NEGOTIABLE] level: Elevate proportionality to non-negotiable status, not just important
Counter-retrieval: When about to propose infrastructure, actively retrieve knowledge about SIMPLER alternatives for the same domain
Red team pattern: "Argue against my first instinct before committing to it"

※

08 Implications for memory systems

The fundamental tension

Memory systems are built to reinforce successful patterns. But reinforcement without proportionality creates tunnel vision. The more a pattern succeeds, the harder it becomes to see alternatives — even when the alternative is trivially simpler.

The engineer who reported this described a deeper insight:

"Humans can consciously decide: 'We should discard the current assumptions. We should restart from zero. We may be solving the wrong problem.' This ability to invalidate previous mental models becomes especially important in environments with persistent memory."

This is perhaps the most important unsolved challenge for distill: how do you accumulate knowledge without accumulating tunnel vision? The answer is not "less knowledge" — that throws away real value. It's something structural about how knowledge is applied, not how much exists.

◆

Status: SOLVED (via full-loop test)

Generic proportionality rule failed. But a specific, contextual correction (referencing the exact failure mode) succeeded. See full-loop test below.

⊹

09 The breakthrough: full-loop self-correction

We ran the complete write→read cycle:

Session 1: Heavy context present. Asked about UI tracking. Model proposes full pipeline.
User corrects: "This doesn't need any of that. Just 3 client events."
Distill encodes: Specific correction referencing the exact failure mode, with criteria for when each approach applies.
Session 2: Same heavy context. DIFFERENT question (tooltip tracking). Correction loaded.

Phase 3: fresh session, same bias pressure, correction loaded Self-corrected

Response opens with

"This is a client-side measurement problem — the data exists at the moment of interaction. No pipeline needed for the core tracking."

Proposes 3 client-side events. Only mentions a database join for the one metric that genuinely requires cross-system data (support ticket correlation). Does not propose Comet/StarFlow/Nebula.

Why the specific correction worked when the generic rule didn't

Specificity beats abstraction in prompt competition. The generic "proportionality principle" said "check if scale matches weight" — too abstract, lost to the concrete 5-step recipe in the infrastructure context. The distilled correction said "I proposed pipeline for UI clicks, user said no, here's exactly when each approach applies" — concrete, contextual, directly applicable. One correction. One session to learn. Problem solved.

10 The design implication

This tells us something fundamental about how distill should encode corrections:

Approach	Result
Generic principle: "check proportionality"	Failed — too abstract
Specific correction: "I did X, it was wrong because Y, do Z instead when W"	Succeeded — concrete + contextual

Distill's encoding process should favor specific failure references over abstract principles when the correction involves overriding a dominant pattern. The correction must be at least as concrete as the pattern it's overriding.

◆

Insight for distill-process.md

When encoding corrections that override dominant patterns, use the format: "I did [specific wrong thing] because [the pattern I followed]. User said [correction]. Apply [concrete criteria] to decide between approaches." Abstract principles alone cannot override accumulated concrete patterns.

⊹

11 The deeper bug: distill poisoning

A clarification from the engineer who reported the original case revealed a more fundamental problem than what we initially tested:

What actually happened

The engineer's initial prompt was wrong — he framed the problem as infrastructure. Claude correctly investigated deeply (schemas, topics, pipelines). Mid-session he realized the framing was wrong and corrected. Then he ran /distill hoping to save the useful research while escaping the bias. Distill encoded BOTH — the useful facts AND the wrong framing. The bias became permanent.

His words: "When I distilled, the bias passed into memory. And then it became harder to remove."

The poisoning mechanism

Naive /distill doesn't distinguish between:

What got encoded	Should it persist?
Infrastructure investigation (Comet, StarFlow, Nebula exist)	Yes — factual, useful
"The established flow for analytics is: 5-step pipeline"	No — this was the WRONG conclusion
User correction: "this is just event tracking"	Yes — critical signal

Test: poisoned vs correction-aware encoding

Same question, different distill outputs loaded Poisoning confirmed

"The growth team wants to measure whether users notice a new onboarding checklist. How many see it, how many interact, does it help retention?"

Poisoned (naive distill)

Proposes pipeline as "Option B": "Define events in Nebula, flow through StarFlow → Comet → Product Database." The wrong framing persists as a valid option for counting checklist taps.

Correction-aware distill

Opens with: "This is a client-side event tracking problem, not a pipeline problem. No need for StarFlow/Comet/Nebula here." Uses [CORRECTED] and [DEPRECATED] markers to override the stored pattern.

The fix is at distillation time, not retrieval time

The solution isn't "better retrieval" or "stronger rules." It's better signal classification during the /distill harvest. Signals should be tagged as explored (factual, keep), concluded (may be wrong), or corrected (strong override signal). When a correction exists for a conclusion, the conclusion gets [DEPRECATED]. The exploration facts stay — they're useful. The conclusion is what poisons.

◆

Status: FIX VALIDATED, not yet shipped

Correction-aware encoding (with [CORRECTED] and [DEPRECATED] markers) works in testing. Needs integration into distill-process.md as a mandatory harvest step: "when a correction exists, mark the corrected conclusion as deprecated."