
I’m a Bayesian statistician working on principled AI agent architecture. My PhD work on Chain Event Graphs (Warwick, 2010, supervised by Jim Smith) established foundational methods for Bayesian structure learning that have been cited over 1,750 times. My current research — Credence — applies Value of Information theory to tool-using LLM agents, replacing prompt engineering heuristics with decision-theoretic optimality. The core question is the same one that matters for alignment: how do you build agents that know what they don’t know, and act accordingly?
I also build production AI systems. Currently designing agentic flows at Booking.com. Previously: document AI, startup detection models, data governance systems across fintech, insurance, and VC.
Oxford mathematics. Warwick PhD. Based in Tel Aviv.
Open to applied research and senior ML roles, particularly in agent architecture, decision-theoretic AI, and production ML systems. Also available for contract and consulting work.
If you're working on agent reliability, tool use, or evaluation — let's talk.
What I Do
Bayesian Agent Research
Value of Information, conjugate Bayesian models, principled uncertainty quantification. Building agents that maintain beliefs, update from evidence, and decide rather than guess. See Credence.
Production AI Systems
From a 27-pipeline document AI system that processes 500+ files in 20 minutes to agentic flow design at scale. I build things that run in production, not just on benchmarks.
LLM Integration & Evaluation
Early adopter (GPT-3 era). Production deployments across multiple industries. Particular focus on evaluating agent reliability — including why accuracy metrics alone mislead.
Research & Writing
1,750+ citations in Bayesian graphical models. Active blog on decision theory and agent architecture. Investigative journalism on startup governance. I write for technical audiences who care about getting things right.
Projects
Credence
A Bayesian agent framework that treats tool selection as a decision problem rather than a prompt-engineering one. It keeps a Beta posterior over each tool's reliability and, before every call, computes the expected value of the information that call would buy — acting only when the expected utility justifies it. On a tool-use benchmark it scored +112 against LangChain ReAct's -8 while answering fewer questions correctly: the accuracy paradox, where knowing more answers is not the same as knowing which answers are worth finding. Python, NumPy, and SciPy.
View on GitHub →
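A minimal sketch of the idea behind Credence, not the project's actual code. It assumes a deliberately simple setting: the agent holds a probability `q` that its current best answer is right, a tool call returns a binary confirm/deny signal that matches the truth with probability equal to the tool's reliability, and the agent may abstain for a payoff of zero. The names `ToolBelief`, `act_value`, and `net_voi`, and the reward, penalty, and cost values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ToolBelief:
    """Beta(alpha, beta) posterior over a tool's reliability (hypothetical model)."""
    alpha: float = 1.0
    beta: float = 1.0

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)

    def update(self, tool_was_right: bool) -> None:
        # Conjugate update: one more success or one more failure.
        if tool_was_right:
            self.alpha += 1.0
        else:
            self.beta += 1.0


def act_value(q: float, reward: float, penalty: float) -> float:
    """Best expected payoff at confidence q: answer, or abstain for 0."""
    return max(q * reward - (1.0 - q) * penalty, 0.0)


def net_voi(q: float, tool: ToolBelief, cost: float,
            reward: float = 1.0, penalty: float = 1.0) -> float:
    """Expected gain from one tool call, minus its cost (one-step lookahead)."""
    r = tool.mean  # for a single call, the posterior mean is all that matters
    p_confirm = q * r + (1.0 - q) * (1.0 - r)
    p_deny = 1.0 - p_confirm
    # Bayes' rule for the agent's confidence after each possible signal.
    q_confirm = q * r / p_confirm if p_confirm > 0 else q
    q_deny = q * (1.0 - r) / p_deny if p_deny > 0 else q
    value_after = (p_confirm * act_value(q_confirm, reward, penalty)
                   + p_deny * act_value(q_deny, reward, penalty))
    return value_after - act_value(q, reward, penalty) - cost


if __name__ == "__main__":
    for name, tool in [("coin-flip tool", ToolBelief(1, 1)),
                       ("reliable tool", ToolBelief(9, 1))]:
        gain = net_voi(q=0.6, tool=tool, cost=0.05)
        print(f"{name}: net VOI = {gain:+.3f} ->", "call" if gain > 0 else "skip")
```

Under these assumptions a tool with a Beta(1, 1) posterior is a coin flip: it cannot move the agent's confidence, so calling it only loses the cost (net -0.05). A tool with a Beta(9, 1) posterior nets about +0.25 at the same confidence, so the call is worth making.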
Kana Derivation Chart
An interactive map of how Japan's two phonetic alphabets grew out of kanji. All 46 basic syllables sit in a grid by consonant and vowel, colour-coded by script; click any cell to trace both descents from the shared source character — the cursive simplification that produced hiragana and the component extraction that produced katakana — with notes on archaic kana and the voiced dakuten variants. React and Vite, hosted on Cloudflare Pages.
life-agent
The composition root for a personal life-management agent — an assistant that remembers your whole digital life, reasons under uncertainty, and acts across your tools. It reinvents nothing: every capability is wired in as an MCP server, so memory (pkm), the Bayesian governor (credence-pi), the agent runtime (pi-mono), and GTD tasks (jarvis-lite) compose into a single application, leaving the interface a swappable detail. Ask a question and it answers from your own corpus, with citations — and nothing personal ever enters the repo. TypeScript over a polyglot TS/Python/Julia stack.
credence-pi
A Bayesian governance layer that decides which of a coding agent's tool calls to allow, block, or pause for human approval — and learns where the line sits from the agent's own behaviour. The design splits into a body and a brain: a TypeScript body hooks the agent's tool_call event and reports a handful of declared features, while a Julia brain holds a Beta posterior over P(approve) and chooses an action by expected value of information, so asking the human wins exactly when the uncertainty is worth the interruption. The wire between them carries only observations and named actions, leaving the brain free to grow more sophisticated without the body ever needing to change. Julia (BDSL) and TypeScript.
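The allow/block/ask decision described above can be sketched as follows. This is an illustration in Python, not the Julia brain itself: the class `ApprovalBelief`, the function `choose_action`, and the `benefit`, `harm`, and `ask_cost` values are all assumptions made for the example. It models asking as paying an interruption cost to learn the human's verdict and then acting on it.

```python
from dataclasses import dataclass


@dataclass
class ApprovalBelief:
    """Beta(alpha, beta) posterior over P(approve) for one kind of tool call."""
    alpha: float = 1.0
    beta: float = 1.0

    def p_approve(self) -> float:
        return self.alpha / (self.alpha + self.beta)

    def observe(self, approved: bool) -> None:
        # Conjugate update from one human verdict.
        if approved:
            self.alpha += 1.0
        else:
            self.beta += 1.0


def choose_action(belief: ApprovalBelief, benefit: float = 1.0,
                  harm: float = 5.0, ask_cost: float = 0.2) -> str:
    """Pick the action with the highest expected utility (hypothetical payoffs)."""
    p = belief.p_approve()
    utilities = {
        # Allow blindly: gain the benefit if the call was fine, pay harm if not.
        "allow": p * benefit - (1.0 - p) * harm,
        # Block blindly: nothing gained, nothing risked.
        "block": 0.0,
        # Ask: pay the interruption, then allow only the approved calls.
        "ask": p * benefit - ask_cost,
    }
    return max(utilities, key=utilities.get)


if __name__ == "__main__":
    belief = ApprovalBelief()
    print("no history:", choose_action(belief))
    for _ in range(50):
        belief.observe(approved=True)
    print("after 50 approvals:", choose_action(belief))
```

With these payoffs, asking beats allowing when the expected harm avoided, `(1 - p) * harm`, exceeds `ask_cost`, and beats blocking when the expected benefit recovered, `p * benefit`, exceeds it. As approvals accumulate the posterior sharpens and the interruption stops paying for itself.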
pkm
A content-addressed extraction cache and catalogue for personal knowledge management. It ingests documents, routes each through a format-appropriate extractor — pandoc, docling, unstructured — and writes the results to an append-only content-addressed store indexed by DuckDB. Cache keys are derived deterministically from the input hash and the producer's name, version, and config, so re-running is a no-op and every artifact stays reproducible and reconstructible. Python; the retrieval substrate beneath life-agent.
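One way to derive such a key, as a sketch only: pkm's actual key format is not shown here, so the SHA-256 choice, the JSON canonicalisation, and the names `content_hash` and `cache_key` are assumptions for illustration.

```python
import hashlib
import json
from typing import Any


def content_hash(data: bytes) -> str:
    """Hash of the raw input document."""
    return hashlib.sha256(data).hexdigest()


def cache_key(input_hash: str, producer: str, version: str,
              config: dict[str, Any]) -> str:
    """Deterministic key over the input and everything about the producer.

    Sorting keys and fixing separators makes the JSON canonical, so two
    configs that differ only in key order map to the same key.
    """
    payload = json.dumps(
        {"input": input_hash, "producer": producer,
         "version": version, "config": config},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    doc = content_hash(b"example document bytes")
    key = cache_key(doc, "pandoc", "3.1", {"to": "markdown", "wrap": "none"})
    print(key)
```

Because the key covers the producer's name, version, and config as well as the input, changing any of them yields a new key, while an identical re-run hits the existing entry.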

docaviv+
A faster, phone-friendly way to browse the Tel Aviv International Documentary Film Festival than its official site. Bilingual Hebrew/English with full RTL, fuzzy search, a day-by-day schedule grid, and direct per-screening ticket links. Built as a static Astro site fed by the festival's WordPress API, with a small cheerio scraper that recovers the synopses, runtimes, trailers, and ticket URLs that exist only in the rendered HTML. Hosted on Cloudflare Pages.

accessinfo.hk
Hong Kong's Freedom of Information platform: citizens file requests to government bodies and track the replies in the open. Every exchange stays on the public record, accreting a searchable archive of what officials will and won't disclose. Built on Alaveteli.

Webb Data
An open archive of 35 years of Hong Kong corporate data — directors, board memberships, CCASS shareholdings, and company-registry filings — cross-linked so you can follow who sits on which boards and who owns what. Corporate transparency as a queryable dataset.