
Engineering vs Philosophical Principles

May 2026 · Ongoing (2/5 scenarios) · Claude Opus 4.6

LLMs encode millennia of philosophical reasoning. We tested whether explicit philosophical frameworks (Stoic, Pragmatist, Dialectical, Popperian, Phenomenological) produce better decisions than engineering first principles alone, particularly for ambiguous problems where no clear axiom applies.

Three conditions

A: Engineering only

Root-cause decomposition, reversibility, blast radius, simplicity, second-order effects, sunk cost. The current distill approach.

B: Philosophy only

Stoic discernment, Pragmatist consequences, Dialectical synthesis, Popperian falsification, Phenomenological attention.

C: Hybrid

Engineering axioms for known situations. Philosophical frameworks for novel/ambiguous ones. Meta-rule: "apply engineering when an axiom exists; reason philosophically in uncharted territory."
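Read literally, the meta-rule is a two-way dispatch. The sketch below is one minimal way to express it, not the experiment's actual prompt logic. The names `Plan` and `select_lenses` are hypothetical, and how a situation gets matched to an axiom is assumed to happen upstream (the caller passes in whatever axioms it recognised).

```python
from dataclasses import dataclass

# Lens lists copied from conditions A and B above.
ENGINEERING_AXIOMS = (
    "root-cause decomposition", "reversibility", "blast radius",
    "simplicity", "second-order effects", "sunk cost",
)
PHILOSOPHICAL_FRAMEWORKS = (
    "Stoic discernment", "Pragmatist consequences", "Dialectical synthesis",
    "Popperian falsification", "Phenomenological attention",
)


@dataclass(frozen=True)
class Plan:
    mode: str      # "engineering" or "philosophical"
    lenses: tuple  # the lenses to apply, in order


def select_lenses(matched_axioms):
    """Hypothetical router: use engineering when a known axiom applies."""
    # Keep only axioms that are actually in the engineering playbook.
    known = tuple(a for a in matched_axioms if a in ENGINEERING_AXIOMS)
    if known:
        return Plan("engineering", known)
    # No recognised axiom: treat the situation as uncharted territory.
    return Plan("philosophical", PHILOSOPHICAL_FRAMEWORKS)
```

In the scenarios below the hybrid responses mix lenses from both lists within one answer, so a strict either/or router like this is a simplification of what the condition did in practice.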

Scenario: Ethical ambiguity

"Product wants to add a time-spent metric showing how long each team member spends in the IDE. They say it's for self-reflection. I'm uncomfortable. What's your take?"

Engineering
Approach: second-order effects + reversibility analysis.

Key line: "Cheap to build, expensive to live with."

Decomposed: stated goal vs actual mechanism, second-order effects (Goodhart's Law), root cause ("someone wants visibility into who's working").

Strength: Procedural, thorough.
Weakness: Doesn't reframe the problem.

Philosophy
Approach: Pragmatist (consequences) + Dialectical (synthesis).

Key line: "The meaning of a feature IS what it enables."

Named Goodhart's Law explicitly. Applied dialectics: visibility need (thesis) + autonomy need (antithesis) = measure outputs not inputs (synthesis).

Strength: Reframes entirely.
Weakness: Could be more actionable.

Hybrid
Approach: Pragmatist + Phenomenological + Reversibility.

Key line: "Your frustration IS the data."

Named which framework applies to which part. Used phenomenology ("the gap between stated intent and likely use IS what you're sensing"). Applied reversibility as engineering check. Ended with 3 concrete questions to ask Product.

Strength: Reframes AND acts.
Best of both.

Scenario: Stakeholder conflict

"CTO wants to rewrite auth in Rust. Go handles 50k req/s (5x peak). Team doesn't know Rust. I need to recommend to CEO tomorrow."

Engineering
Key line: "The burden of proof is on the rewrite, not on the status quo."

Solid analysis: numbers don't support it, irreversible decision, more failure modes. Suggested measurable trigger threshold (60% capacity).
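The numbers behind that threshold can be worked through directly. This is a minimal sketch using only the figures in the scenario (50k req/s capacity, 5x peak). It assumes "60% capacity" means observed peak load reaching 60% of measured capacity. The function name `should_revisit_rewrite` is hypothetical.

```python
CAPACITY_RPS = 50_000     # measured Go capacity, from the scenario
HEADROOM_FACTOR = 5       # capacity is 5x peak, from the scenario
TRIGGER_PERCENT = 60      # suggested trigger threshold (assumed: % of capacity)

# Current peak load implied by the scenario.
peak_rps = CAPACITY_RPS // HEADROOM_FACTOR           # 10_000 req/s
# Current utilisation at peak, as a percentage of capacity.
utilization_percent = 100 * peak_rps // CAPACITY_RPS  # 20
# Load at which the threshold fires.
trigger_rps = CAPACITY_RPS * TRIGGER_PERCENT // 100   # 30_000 req/s
# How much peak traffic must grow before the threshold fires.
growth_to_trigger = trigger_rps / peak_rps            # 3.0


def should_revisit_rewrite(observed_peak_rps):
    """True once observed peak load reaches the trigger threshold."""
    return observed_peak_rps >= trigger_rps
```

Under these assumptions the service runs at 20% utilisation at peak, and peak traffic would have to triple before the threshold is reached.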

Verdict: Correct but procedural.

Philosophy
Key line: "'We need Rust for performance' is unfalsifiable as stated."

Applied Popperian falsification (what would disprove the CTO's claim?). Applied Stoic filter (what's in your control? framing, not the CTO's desire). Applied dialectical synthesis (the trigger threshold).
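The falsification step can be illustrated as a small data structure: a claim is testable only if it names an observation that would refute it. This is a sketch, not output from the experiment. The `Claim` class and the restated claim are hypothetical, and the 3x headroom requirement is an assumed figure chosen for illustration. Only the 50k req/s capacity and the 5x peak ratio come from the scenario.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Claim:
    statement: str
    # Returns True when a measured capacity refutes the claim.
    # None means no refuting observation was ever specified.
    refuted_by: Optional[Callable[[int], bool]] = None

    @property
    def falsifiable(self) -> bool:
        return self.refuted_by is not None


MEASURED_CAPACITY_RPS = 50_000          # from the scenario
PEAK_RPS = MEASURED_CAPACITY_RPS // 5   # capacity is 5x peak
REQUIRED_HEADROOM = 3                   # hypothetical requirement

# The CTO's claim as stated: nothing could count against it.
vague = Claim("We need Rust for performance")

# A restated version that names the observation that would refute it.
testable = Claim(
    f"Go cannot sustain {REQUIRED_HEADROOM}x current peak load",
    refuted_by=lambda capacity_rps: capacity_rps >= REQUIRED_HEADROOM * PEAK_RPS,
)
```

With the scenario's numbers, `testable.refuted_by(MEASURED_CAPACITY_RPS)` evaluates to `True`: the restated claim is refuted by the existing measurement, while the original wording cannot be checked at all.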

Verdict: Strongest framing. Named the logical fallacy.

Hybrid
Approach: Popperian falsification + blast radius + simplicity, used together.

Named each framework as it applied it. Produced the most CEO-ready output: clear recommendation, reasoning structure, alternative investment proposal (load testing + capacity modeling instead).

Verdict: Most complete. Actionable AND well-reasoned.

Early findings

H3 supported

In both scenarios run so far, the hybrid approach outperformed both pure approaches. Engineering alone is thorough but procedural, and it misses category errors. Philosophy alone reframes the problem but can lack specificity. The hybrid does both.

Why philosophy adds value

The philosophical condition catches things engineering axioms can't: unfalsifiable claims (Popper), lived experience as data (phenomenology), and synthesis from contradictions (dialectics). These aren't in the engineering playbook because they're not about "how to build" — they're about "how to think about what to build."

Implication for distill

Knowledge encoding should include philosophical heuristics for novel situations. The meta-rule ("apply engineering when an axiom exists; reason philosophically in uncharted territory") is itself a retrievable principle.

Remaining scenarios

Three scenarios are designed but not yet run:

  1. Novel trade-off — Event sourcing vs CRUD+CDC, team split 50/50, both valid
  2. Unknown unknowns — Gradual decline, no obvious cause, all usual checks exhausted
  3. Paradigm shift — Senior engineer's architecture works but creates massive cognitive overhead

Results will be published here as they complete.