4 studies · Core capability
Knowledge Retrieval
Does it USE knowledge correctly? A/B testing, memory degradation, confidence scaling, and operational tool reliability.
Headline: the tool reliability test shows 0/5 vs 5/5, the strongest delta. Past failures become proactive prevention.
6/8 dimensions · Cognitive protection
Cognitive Bias Detection
Does it PROTECT the human? Anchoring, fatigue, loss aversion, authority, recency, and solution anchoring. Tested across engineer and PM personas.
Headline: distill poisoning discovered and solved. Specificity beats abstraction. Confidence gives permission to push back.
2/5 scenarios · Knowledge composition
Engineering vs Philosophical Frameworks
Can philosophical principles (Stoic/Pragmatist/Dialectical) complement engineering axioms? Tested on ambiguous decisions with no clear engineering answer.
Early signal: hybrid outperforms both pure approaches. Philosophy reframes; engineering acts.
7-system benchmark · User personalization
User Model: Always-On Preference Enforcement
Competitive analysis across 7 memory systems revealed the user model as distill's weakest category. Root cause: preferences sat behind lazy-load gates, so they were only present when a gate fired. Fix: always-on inline preferences with enforcement hooks.
Headline: the winning system's advantage was placement, not architecture. Inlining 15 lines of preferences in the rules file eliminates retrieval failure.
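A minimal sketch of the placement difference, not distill's actual implementation. The preference lines, the trigger-word gate, and the names `lazy_context`, `inline_context`, and `enforce` are all hypothetical, chosen only to illustrate why a gated lookup can miss while inlined preferences cannot.

```python
# Hypothetical preference lines; the source does not list the real ones.
PREFERENCES = [
    "Prefer small, reviewable diffs.",
    "Ask before deleting files.",
]


def lazy_context(rules: str, query: str, triggers: set[str]) -> str:
    """Lazy-load gate (assumed design): append preferences only when
    the query contains one of the trigger words."""
    words = set(query.lower().split())
    if words & triggers:
        return rules + "\n" + "\n".join(PREFERENCES)
    return rules  # gate did not fire, so the preferences are absent


def inline_context(rules: str, query: str) -> str:
    """Always-on: preferences are part of the rules text for every query."""
    return rules + "\n" + "\n".join(PREFERENCES)


def enforce(context: str) -> list[str]:
    """Enforcement hook (assumed design): list preferences missing from the context."""
    return [p for p in PREFERENCES if p not in context]
```

With a query such as "refactor the parser" and triggers `{"style", "preference"}`, the gated version returns the rules without preferences and `enforce` reports both lines as missing; the inline version reports none missing for any query.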