Essays

Three Types and a Funeral for Your Inference Library

What would it take to build an agent whose behaviour is derived from a few fundamentals the way physics is derived from conservation laws? Three types, four axioms, and a refusal to add anything else.

julia bayesian machine-learning ai essays

Keeping the Coding Agent on the Straight and Narrow

A companion to the PKM Phase 1 post. The foundation was built by two AIs — Claude.ai for design, Claude Code for implementation — with a spec as the contract between them. Ten SPEC revisions in four days, and what the rules caught that 'pragmatic' would have missed.

essays ai python

A Content-Addressed Foundation for Personal Knowledge

Eleven million words of personal documents, four canonical questions that none of Khoj, Paperless-ngx, Obsidian, or Karpathy's LLM Wiki can answer, and a content-addressed extraction foundation that takes content-addressing seriously. Phase 1 of a multi-phase build.

essays data ai python

Ninety-Six Percent Cheaper and Slightly Better

Credence-proxy sits between an agent and its LLM providers, learns which model is good for which category, and routes accordingly. On an OpenClaw benchmark it cut cost by 96% and latency by 52% while raising quality by 1.24 points. The mechanism is one equation.

essays bayesian machine-learning ai

The Prompting Gradient

Each prompting technique helps. Reasoning traces, strategy guidance, cross-question history --- each one improves accuracy and score. None of them closes the gap with a Bayesian agent that does not use language at all. The ceiling exists because descriptions of calculations are not calculations.

essays bayesian machine-learning ai

The Agent That Invents Its Own Rules

Most agents are given a fixed set of decision rules. Credence's second tier generates candidate rules from sensor features, scores them by complexity, and lets the posterior decide which structures are worth keeping. This is program synthesis as Bayesian inference.

julia essays bayesian machine-learning ai

Sixty-Two Percent Correct and Winning by a Hundred and Twenty Points

A Bayesian decision-theoretic agent scores lower on accuracy than every LLM variant it competes against --- and beats the best of them by 120 points. The explanation requires thinking about something that LLM benchmarks typically refuse to think about.

essays bayesian machine-learning ai

The Bitter Lesson Has No Utility Function

I wrote about decision theory fading from AI. Hacker News said I was annoyed at Rich Sutton's Bitter Lesson. I wasn't. But the misreading proves the point.

essays bayesian machine-learning ai

Why We Stopped Using the Mathematics That Works

Someone asked why decision theory stopped being widely used in AI. The answer involves ImageNet, academic departments, and the seductive power of not having to specify your objectives.

essays bayesian machine-learning ai

Agentic AI Is Neither Intelligent Nor an Agent

I built a Bayesian agent and set it against LangChain on a tool-use benchmark. LangChain got more answers right and still lost — by 120 points.

python bayesian machine-learning ai essays

On Owning Your Data

Why I reverse-engineered a cheap Bluetooth scale to keep my health data out of the cloud

essays privacy data