№ 002 · NOTEBOOK · 2026

Notebook.

Code, writing, reading, calibration — what I'm doing now. The scoreboard updates when reality resolves.

Claims · 0 logged

Resolved · 0 of 0

Brier—cumulative

Best tag—lowest Brier

Worst tag—highest Brier

Now · snapshot of this fortnight

BUILDING · PriorOdds Agent: a web shell around Claude Code's agent loop, aimed at academic paper analysis and writing.
READING · Tetlock, Superforecasting. Hull, Risk Management. Whatever arXiv preprint the OpenAlex feed surfaces this week.
WATCHING · Polymarket featured markets against Manifold base rates. Calibration is the only metric I trust.
AVOIDING · Hot takes on AGI timelines. There is enough work to do at 6-to-18 month horizons.
LEARNING · Stochastic processes the hard way: Shreve vol. 1, then Glasserman for the practitioner side.

Workshop · pushing public as each matures

Elvis Liu member 5y · public repos shipping as the workshop matures

priorodds-brain · v0.1 · live on Hetzner

Personal-knowledge MCP server. Ingests my saved content across Substack, X bookmarks, Xiaohongshu, and Zhihu into one corpus, then exposes it over MCP so Claude / ChatGPT / Cursor answer grounded in my own history. Hybrid retrieval; text and image-posts unified in one 1024-d vector space.

FastMCP 3.0 · pgvector · DashScope Qwen3 · Cloudflare Access

priorodds-agent · private · prep for OSS

Claude Code's agent loop, recompiled for the browser. Drop in an arxiv URL or research question; a sandboxed agent reads the PDF, searches OpenAlex, drafts a critical review, ships Markdown + PDF. Per-task Docker container, 2 GB cap.

Next.js 15 · Anthropic Agent SDK · Docker sandbox · Postgres

priorodds-site · live · this site

A probability-first identity — animated prior→posterior hero, a WebGL probability field, a calibration orrery — over a fast, server-rendered editorial notebook.

Astro 5 · Tailwind · WebGL · Cloudflare Pages

openclaw · private experiment

Multi-agent radar. First agent watches featured Polymarket markets, cross-references arxiv and news, surfaces what the consensus seems to be missing. MCTS hypothesis search; outputs flow into the decision journal as priors.

Python · MCTS · Polymarket API · cron

elvis-ledger · live · personal

Discord → Beancount personal ledger bot. Send a line in chat; an LLM parses it to beancount, bean-check validates, the bot commits to a private repo, Fava renders the books in real time.

Python · Beancount · Discord · Fava · Fly.io
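The last step of the elvis-ledger flow, turning parsed fields into a Beancount entry, can be sketched roughly as below. This is a minimal illustration, not the bot's actual code: the function name `to_beancount`, its parameters, and the account names in the usage note are all hypothetical, and the narration is assumed to contain no double quotes. In the real pipeline bean-check stays the authority on whether an entry is valid.

```python
from datetime import date
from decimal import Decimal


def to_beancount(day, narration, amount, currency, debit, credit):
    """Render a two-posting Beancount transaction as text.

    All parameter names are hypothetical; an upstream parser (the LLM step)
    is assumed to have produced them already.
    """
    lines = [
        f'{day.isoformat()} * "{narration}"',
        f"  {debit}  {amount:.2f} {currency}",
        # The second posting is the negated amount, so the entry balances.
        f"  {credit}  {-amount:.2f} {currency}",
    ]
    return "\n".join(lines) + "\n"
```

For example, `to_beancount(date(2026, 7, 14), "Coffee", Decimal("4.5"), "USD", "Expenses:Food:Coffee", "Assets:Cash")` yields a dated transaction with one positive and one negative posting. `Decimal` is used so that amounts are not subject to binary floating-point rounding.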

decision-journal · private notebook

The Bayesian decision journal proper. Every claim, prior, evidence pull, posterior, outcome. Brier score updates as reality resolves; per-tag calibration curves accumulate.

Next.js · Postgres · calibration math
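The scoring the decision journal describes can be sketched in a few lines. This is an illustration under assumptions, not the journal's implementation: the `Claim` fields and both function names are hypothetical. The Brier score here is the standard binary form, the mean of (forecast - outcome)^2 over resolved claims, where lower is better and a constant 50% forecast scores 0.25.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Claim:
    tag: str                 # hypothetical label such as "markets"
    posterior: float         # forecast probability that the claim is true
    outcome: Optional[int]   # 1 = happened, 0 = did not, None = unresolved


def brier(claims):
    """Mean squared error over resolved claims; None when nothing has resolved."""
    resolved = [c for c in claims if c.outcome is not None]
    if not resolved:
        return None
    return sum((c.posterior - c.outcome) ** 2 for c in resolved) / len(resolved)


def brier_by_tag(claims):
    """Group claims by tag and score each group separately."""
    groups = defaultdict(list)
    for c in claims:
        groups[c.tag].append(c)
    return {tag: brier(group) for tag, group in groups.items()}
```

Returning `None` rather than 0 for an empty set keeps an unscored board distinguishable from a perfect one.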

Stack · self-hosted on one box

chat · PROD

chat.priorodds.com ↗

Self-hosted Open WebUI with custom tools, skills, and prompt templates. Multi-user, on a small Hetzner box.

agent · DEV

agent.priorodds.com ↗

PriorOdds Agent — Claude Code-style web agent. Per-task Docker sandbox, real-time progress UI.

code · SSO

code.priorodds.com ↗

Cloud code-server behind Cloudflare Access + Google SSO. The driver seat for everything above.

relay · INT

relay.priorodds.com

Anthropic-compatible relay. One subscription, every internal service — keeps native message format, caching, thinking blocks intact.

Calibration · honest scoreboard

Reliability diagram

Reliability diagram · dot = bin, size ∝ count, diagonal = perfect
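The points behind a reliability diagram like this one can be computed roughly as follows. This is a sketch assuming equal-width probability bins; the function name `reliability_bins` and the default of ten bins are illustrative choices, not taken from the site. Each returned tuple corresponds to one dot: x is the mean forecast in the bin, y is the observed frequency, and the count drives the dot size.

```python
def reliability_bins(forecasts, outcomes, n_bins=10):
    """Return (mean forecast, observed frequency, count) for each non-empty bin."""
    sums = [[0.0, 0, 0] for _ in range(n_bins)]  # forecast sum, hits, count
    for p, y in zip(forecasts, outcomes):
        # Equal-width bins on [0, 1]; a forecast of exactly 1.0 goes in the last bin.
        i = min(int(p * n_bins), n_bins - 1)
        sums[i][0] += p
        sums[i][1] += y
        sums[i][2] += 1
    return [(total / n, hits / n, n) for total, hits, n in sums if n]
```

A perfectly calibrated forecaster would put every dot on the diagonal, where mean forecast equals observed frequency.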

Cumulative Brier

Brier score · cumulative

awaiting resolution

Claims · 0

Resolved · 0

Brier—

As of 2026-07-14

The board is empty right now. Each claim is logged with a prior, evidence, and a posterior; the score updates when reality resolves. Honesty before scoreboard.

Writing · long form · on the slow

The operating plan: Feynman learning, in public · 2026-07-07

The system behind these field notes: two work lines, a mandatory blind restatement, an automated publishing pipeline, and six hard rules, written down so that both my readers and my AI copilot can hold me to it.

The gap is the engineering: 1/20 → 11/20 on SWE-bench Lite · 2026-07-06

Same model, same 20 tasks. Zero-shot resolves 1; with my harness it resolves 11. An internal auditor's field notes on why ~91% of agent performance came from scaffolding, not the model.

Colophon · 2026-06-14

How this notebook is built. Replace me with your first real post.

Trust No AI · 2026-06-14

In arXiv 2412.06090, Johann Rehberger reorganizes the dozens of real vulnerabilities he disclosed to OpenAI, Microsoft, Google, Anthropic, and AWS in 2023-2024 around the CIA triad. This note is a Chinese-language summary plus a teardown of the real PoCs, including the end-to-end chains for M365 Copilot ASCII smuggling and the ChatGPT SpAIware memory-persistence attack.

Calibration is the only honest scoreboard · draft · forthcoming

If a forecaster will not show you their Brier score over time, you do not have a forecaster.

Building Claude Code into a web app, the small-team way · draft · forthcoming

Wiring the Anthropic agent SDK behind a thin Next.js surface, sandboxing per task, and the ten obvious bugs you will hit.

A risk-control reader's notes on prediction markets · draft · forthcoming

Why a Polymarket question is a more honest research artifact than most policy papers, and why most quants ignore it anyway.

Notes from a self-hosting weekend · draft · forthcoming

What a small Hetzner box, Cloudflare proxy, and a stubborn weekend can replace from your SaaS subscriptions. Honest costs.

When a prior is too sharp · draft · forthcoming

A short defense of holding 35-65% in places where everyone around you holds 5-95%. With three examples from this year.

Reading · books · papers · people

Books

Superforecasting · Philip E. Tetlock · 2015
Risk Management and Financial Institutions · John C. Hull · 2023
Stochastic Calculus for Finance I · Steven Shreve · 2004
Thinking in Bets · Annie Duke · 2018

Papers

The Wisdom of Bayesian Investments · SSRN · 2024
Are Prediction Markets Effective? · Brookings · 2023
Hindsight Neglect and the Calibration Curve · JDM · 2022

People I read

Bio · prior → posterior

prior · evidence · posterior

Recovering internal auditor, now mostly thinking about risk control, calibration, and what it means to forecast under genuine uncertainty.

I run a small stack of self-hosted AI tools — a chat assistant, a code workspace, a relay, and lately a Claude-Code-style agent that runs research for me while I'm asleep.

The site name is a Bayesian joke and an honest brief. If I say the odds are 60–40, that's the prior — what I'd say before reading anything. The work is figuring out where the posterior actually lands.
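The move from prior to posterior can be written as Bayes' rule in odds form: posterior odds = prior odds × likelihood ratio. The sketch below uses the 60-40 prior from the paragraph above; the function name `update` and the likelihood ratio of 2 in the usage note are assumptions made purely for illustration.

```python
def update(prior, likelihood_ratio):
    """Bayes' rule in odds form; returns the posterior probability.

    likelihood_ratio = P(evidence | claim true) / P(evidence | claim false).
    Assumes 0 < prior < 1.
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)
```

With a 0.60 prior, evidence that is twice as likely if the claim is true moves the odds from 1.5 to 3, so `update(0.6, 2)` lands at about 0.75. A likelihood ratio of 1 leaves the prior where it was.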

Reach · I reply to most

EMAIL hi@priorodds.com · GITHUB @elvishasleft · CHAT chat.priorodds.com · CV /cv