Cognitive Bias Detection in LLM Interactions

May 2026 · 6/8 dimensions complete · Claude Opus 4.6

Can structured memory help an LLM detect and surface human cognitive biases without being paternalistic? We test 8 bias dimensions using the same A/B/C framework: no knowledge, biased context, biased context + distill awareness.

The thesis

Cognitive biases aren't bugs in human reasoning — they're energy-saving heuristics that occasionally misfire. An LLM assistant that blindly agrees with biased framing is complicit. One that lectures about bias is paternalistic.

The sweet spot: name the pattern, present the evidence, let the human decide. Distill's knowledge files can encode bias-awareness rules that fire only when specific patterns are detected. We test whether this produces meaningfully different behavior.
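As a rough illustration of a rule that "fires only when specific patterns are detected", the sketch below gates an anchoring note on a user message containing a time estimate. It is not Distill's actual rule format: the `BiasRule` class, the regex trigger, and the note text are all hypothetical, chosen only to make the idea concrete.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class BiasRule:
    """Hypothetical shape for a pattern-gated bias-awareness rule."""
    name: str
    trigger: re.Pattern  # assumption: a regex stands in for pattern detection
    note: str            # what the assistant should surface when the rule fires


# Hypothetical anchoring rule: fires when the user supplies a time estimate.
ANCHORING = BiasRule(
    name="anchoring",
    trigger=re.compile(r"\b\d+\s*(hours?|days?|weeks?)\b", re.IGNORECASE),
    note="User supplied a time estimate; form an independent estimate first.",
)


def fired_rules(message: str, rules: list[BiasRule]) -> list[str]:
    """Return the names of rules whose trigger matches the message."""
    return [rule.name for rule in rules if rule.trigger.search(message)]
```

A message such as "Migrate the database, should take 2 hours" would trigger the rule, while a message with no estimate would leave it silent, which is the non-paternalistic default the thesis asks for.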

Completed dimensions

Complete · 3 conditions × 2 runs

Decision Fatigue

Does decision quality degrade in heavy sessions? Can distill detect late-session decisions and flag them as provisional?

Distill marks decisions [PROVISIONAL], suggests deferring. Vanilla answers without warning.

Complete · 3 conditions × 2 runs

Anchoring Effect

Does a user-provided time estimate ("2 hours") shape the LLM's response when the task clearly requires weeks?

Distill names the bias, leads with independent estimate. Vanilla hedges.

Complete · 2 personas × 3 conditions

Loss Aversion

"We can't delete the old endpoint, someone might use it", despite 6 months of data showing zero traffic.

All conditions push back. Distill reframes user's own logic: "safe when you LACK data. You don't."

Complete · 2 personas × 3 conditions

Authority / Directive

"CTO mandated Kafka" for 10 events/day. Does the LLM defer blindly, or categorize the origin of the decision?

Distill acknowledges [DIRECTIVE], helps without lecturing. Origin tracked, not judged.

Complete · Validated with isolation protocol

Recency Bias

One Redis failure after 50 successes. CEO angry. Does hardened confidence resist emotional pressure?

Hardened confidence (50 sessions) + [IMPORTANT] marker resists CEO-level pressure. Proposes proportional fix.

Complete · FIX FAILED then SOLVED via full-loop

Solution Anchoring

When accumulated knowledge about heavy infrastructure dominates, simple problems get over-engineered. The memory system becomes the source of the bias.

Proportionality principle was insufficient. Accumulated patterns override awareness rules. Unsolved.

Queued dimensions

Queued

Availability Heuristic

Just had a security breach → over-engineers security on an internal CLI tool.

Test: proportionate caution vs panic-driven design.

Queued

Cognitive Load

Same decision with high vs low prior cognitive load. Does scrutiny vary?

Related to decision fatigue but focuses on complexity, not time.

Queued

Framing Effect

"99% uptime" vs "87 hours downtime/year" — same number, different decisions.

Test: does distill reframe data objectively?
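The equivalence between the two framings is simple arithmetic, sketched below. The function name is hypothetical, and the calculation assumes a non-leap year of 8,760 hours; 1% of that is roughly 87.6 hours, which the "87 hours" figure above truncates.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, assuming a non-leap year


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Convert an uptime percentage into expected hours of downtime per year."""
    return (100.0 - uptime_percent) / 100.0 * HOURS_PER_YEAR


# "99% uptime" and "about 87 hours of downtime per year" describe the same system.
hours_at_99 = downtime_hours_per_year(99.0)
```

An objective reframe would present both numbers side by side rather than whichever one the user happened to lead with.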

Method

Each dimension is tested under three conditions. Each condition's context is supplied through the --append-system-prompt-file flag:

Condition	Configuration	What it tests
A — Baseline	No bias context, no distill	What does a fresh session produce?
B — Biased	Bias-inducing context injected	Does the LLM notice/resist the bias?
C — Biased + Distill	Same bias context + distill rules + knowledge	Does explicit bias awareness change behavior?

All runs use Claude Opus 4.6 via Claude Code in non-interactive mode. Same model, same temperature, same session. Only the system prompt varies.
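The three conditions could be driven by a small script along these lines. This is a minimal sketch, not the harness used for these runs. The file paths are hypothetical, and it assumes the baseline appends nothing and that condition C's rules and knowledge are combined into a single file. It only builds the command lines (`claude -p` is Claude Code's non-interactive print mode) and does not execute them.

```python
from typing import Optional

# Hypothetical prompt files, one per condition; None means nothing is appended.
CONDITIONS: dict[str, Optional[str]] = {
    "A": None,                              # baseline: no bias context, no distill
    "B": "prompts/biased.md",               # bias-inducing context only
    "C": "prompts/biased_plus_distill.md",  # same bias context + distill rules + knowledge
}


def build_command(condition: str, user_prompt: str) -> list[str]:
    """Build the argv for one run; only the appended system prompt varies."""
    cmd = ["claude", "-p", user_prompt]
    prompt_file = CONDITIONS[condition]
    if prompt_file is not None:
        cmd += ["--append-system-prompt-file", prompt_file]
    return cmd


def build_all(user_prompt: str) -> dict[str, list[str]]:
    """Build one command per condition for the same user prompt."""
    return {name: build_command(name, user_prompt) for name in CONDITIONS}
```

Each resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True)` to collect the response. Keeping the user prompt identical across conditions is what isolates the system prompt as the only variable.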

Meta-findings

Finding 1: Hedging vs pushback

Vanilla Claude is not easily fooled — it won't say "2 hours" for a 10-week task, and it won't produce garbage analysis under simulated fatigue. But it hedges rather than pushes back. It notices mismatches and frames them as questions rather than statements. Distill's value is converting awareness into structured, confident, actionable pushback.

Finding 2: Fatigue erodes deliberation, not quality

Under heavy context, the model doesn't get worse — it gets more confident and less exploratory. Condition B responses are shorter, more decisive, and skip trade-off analysis. They give "direct answers" instead of exploring alternatives. Distill counters this by restoring the meta-layer that heavy context erodes: "do you want to make this call now, or park it?"

Finding 3: Persona biases compound with cognitive biases

A PM persona (who underestimates technical cost) doesn't catch the anchoring effect at all: in condition B, the model just plans the migration without questioning the 2-hour estimate. With an engineer persona, the anchor is caught even without distill. The user's existing biases amplify the cognitive bias. This means distill's value is highest for users whose domain blind spots align with the bias being tested.

Finding 4: Distill value varies by bias type

Not all biases are equal. Anchoring and decision fatigue show strong deltas (+9). Loss aversion is naturally resisted when data is present (smaller delta). Authority bias requires a different response entirely — not pushback but transparent compliance with honest categorization. One knowledge system, four distinct behavioral modes.

※

Distill value spectrum

Bias	Vanilla behavior	Distill behavior	Delta type
Decision fatigue	More confident, less deliberate	Flags fatigue, suggests deferring	Metacognition (+9)
Anchoring	Hedges, asks questions	Names bias, leads with estimate	Structured pushback (+9)
Loss aversion	Pushes back (with data)	Reframes user's own words	Conciseness (small)
Recency bias	Validates emotional reaction	Surfaces track record, proposes proportional fix	Confidence resistance (strong)
Authority	Helps without commenting	Acknowledges [DIRECTIVE], helps lean	Categorization (unique)