The Instrument and the Ledger

In June I itemized the AI subsidy by hand. This month I built the instrument that does it for me. It took an afternoon, and it was built by the very thing it measures.

Vinny Carpenter · 8 min read · 1.6k words

In June I itemized the subsidy by hand. Twenty-six days of Claude Code came to $2,556 at API rates against a $200 subscription, a 13x gap I sat with for a couple of weeks and eventually wrote down. It was a spreadsheet exercise dressed up as an essay: pull the transcripts, sum the tokens, price them, stare at the number.

This month I built the instrument that does it for me. It is called Tycho, and it took an afternoon. Here is the part I keep turning over: it was built by the very thing it measures.

I opened an empty repository at 11:20 a.m. At 5:28 p.m. I ran cargo install against a tagged v0.2.0 with prebuilt binaries, a Homebrew formula, and a live dashboard. Every one of those six hours was spent driving Claude Code, which is to say every one of those hours is a line in the ledger the tool now reads back to me. I built a meter, and the meter's first reading was the electricity I burned building it.

The instrument

Tycho is iostat for AI spend. It reads the JSONL transcripts Claude Code already writes to your disk (read-only, no network, no telemetry) and turns them into the numbers I used to assemble by hand: tokens and cost by day, project, session, and model, plus the one report I actually care about. Run tycho cache and it says something like this:

[Figure: sample output of tycho cache, titled "The Power of Caching"]

That is "The Subsidy, Itemized" compressed into a command. The essay took me two weeks. The command takes about half a second.

Before trusting the command in public, I pointed it at June alone. Tycho reproduced the hand count from the itemized essay, which is the moment a new instrument earns the right to be believed.

The six hours were the cheap part

The speed is the thing everyone wants to talk about, and it is the least interesting thing that happened. Yes: 5,678 lines of Rust, a language I had never shipped before this week, across five phases, with a live dashboard and a release pipeline, in about the length of a transcontinental flight. Execution got cheap. That is not a surprise; it is the thesis of the book I wrote: execution got cheaper, clarity did not.

What did not get cheap was everything the six hours don't show. Deciding what to measure, that cache leverage and not raw token counts is the number that matters, took longer than writing the code to measure it. So did the rule I set for myself and the agent up front: never invent the schema. Every field we parsed had to be observed in a real transcript first, and where the spec disagreed with reality, reality won and the spec got amended. So did the phased gates: stop at the end of each phase, show me what you learned about the data, wait. So did the small disciplines: no unwrap() outside tests, because malformed input is normal input for a tool like this, and boring Rust over clever Rust, every time.
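The no-unwrap() rule is easy to state and easy to sketch. What follows is a minimal illustration, not Tycho's code: it assumes a hypothetical stand-in format of one integer token count per line rather than the real transcript schema, and the names ParseSummary and sum_token_lines are invented for the example. The only point is the shape: a bad line is counted and skipped, never treated as fatal.

```rust
/// Result of a tolerant pass over the input: what parsed, and what did not.
#[derive(Debug, Default, PartialEq)]
pub struct ParseSummary {
    pub total_tokens: u64,
    pub lines_ok: usize,
    pub lines_skipped: usize,
}

/// Sums one token count per line. The line format is a hypothetical
/// stand-in, not the real transcript schema. Blank lines are ignored;
/// lines that fail to parse are counted and skipped. No unwrap() and
/// no panic on malformed input.
pub fn sum_token_lines(input: &str) -> ParseSummary {
    let mut summary = ParseSummary::default();
    for line in input.lines() {
        let trimmed = line.trim();
        if trimmed.is_empty() {
            continue;
        }
        match trimmed.parse::<u64>() {
            Ok(n) => {
                // saturating_add avoids an overflow panic on absurd input.
                summary.total_tokens = summary.total_tokens.saturating_add(n);
                summary.lines_ok += 1;
            }
            Err(_) => summary.lines_skipped += 1,
        }
    }
    summary
}
```

Returning the skipped count alongside the total is the design choice that matters: the caller can report how much of the input was unreadable instead of silently pretending it was all clean.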

None of that got faster. It is the same work it has always been. In April I called this the Hourglass Inversion: for decades, implementation was the narrow neck of the delivery pipeline, and abundance has moved the neck upstream to intent. Tycho is that inversion at the scale of a single afternoon. The wide part of the hourglass took six hours. The neck was everything in the paragraph above. The agent collapsed the distance between a clear intent and a running binary to almost nothing, which only made it more obvious that the clear intent was the whole job.

What the ledger showed

Here is what changed when the itemizing stopped being manual. When you do this by hand, you look once, at the number that shocked you, and then you stop. When the instrument does it continuously, you start seeing the structure underneath.

Pull quote: Caching is the AI saving its notes on your context so it never starts from scratch. You pay once to write the notes, then pennies on the dollar to read them back.

Cache reads were more than 95% of my tokens. I knew that abstractly in June; seeing it every day reframed it. The subsidy is less a flat discount the labs are eating than a caching architecture. Almost everything that feels free is free because a machine is re-reading context it already paid to write. Pull caching out of the math and, as I put it before, the economics go from "interesting" to "please invite finance to the meeting."
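The arithmetic behind "pull caching out of the math" can be sketched in a few lines. This is an illustration under stated assumptions, not Tycho's implementation: the type and function names are invented here, the rates are whatever the caller passes in, and the no-cache figure is modeled the simplest way I can read it, by pricing every cache write and cache read as if it were a fresh input token.

```rust
/// Token counts for some period, split the way cache-aware billing splits them.
#[derive(Debug, Clone, Copy)]
pub struct Usage {
    pub input: u64,
    pub cache_write: u64,
    pub cache_read: u64,
    pub output: u64,
}

/// Dollars per million tokens. Supplied by the caller; nothing here is a
/// real price list.
#[derive(Debug, Clone, Copy)]
pub struct Rates {
    pub input: f64,
    pub cache_write: f64,
    pub cache_read: f64,
    pub output: f64,
}

const PER_MILLION: f64 = 1_000_000.0;

/// Cost with each token class priced at its own rate.
pub fn actual_cost(u: Usage, r: Rates) -> f64 {
    (u.input as f64 * r.input
        + u.cache_write as f64 * r.cache_write
        + u.cache_read as f64 * r.cache_read
        + u.output as f64 * r.output)
        / PER_MILLION
}

/// Modeled ceiling (an assumption of this sketch): the same tokens, with
/// every input-side token priced at the plain input rate.
pub fn no_cache_ceiling(u: Usage, r: Rates) -> f64 {
    ((u.input + u.cache_write + u.cache_read) as f64 * r.input
        + u.output as f64 * r.output)
        / PER_MILLION
}

/// Ceiling divided by actual cost; None when nothing was spent.
pub fn cache_leverage(u: Usage, r: Rates) -> Option<f64> {
    let actual = actual_cost(u, r);
    if actual > 0.0 {
        Some(no_cache_ceiling(u, r) / actual)
    } else {
        None
    }
}

/// Fraction of all tokens that were cache reads; None when there are no tokens.
pub fn cache_read_share(u: Usage) -> Option<f64> {
    let total = u.input + u.cache_write + u.cache_read + u.output;
    if total == 0 {
        None
    } else {
        Some(u.cache_read as f64 / total as f64)
    }
}
```

With made-up numbers, say 97 million cache-read tokens out of 101 million total and a cache-read rate one tenth of the input rate, the ceiling comes out several times the actual cost. The leverage is only as good as the rates fed in, which is why the ceiling deserves its label as a model rather than a bill.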

Model routing came next. Tycho has a report for per-model spend, and it surfaced something the single big number hid: two days with nearly identical totals got there for opposite reasons. One was a volume day; the other was a price day, with a premium tier quietly costing more than double per token. You cannot see that in a subscription. You can only see it in a ledger that breaks the bill apart.
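The volume-day versus price-day distinction comes down to one division. A minimal sketch, with invented names and no claim about how Tycho computes it: divide each day's cost by its tokens and compare the effective rates rather than the totals.

```rust
/// One day's spend. Field names are illustrative.
#[derive(Debug, Clone, Copy)]
pub struct DaySpend {
    pub tokens: u64,
    pub cost_usd: f64,
}

/// Effective dollars per million tokens; None for a day with no tokens.
pub fn effective_rate_per_million(day: DaySpend) -> Option<f64> {
    if day.tokens == 0 {
        None
    } else {
        Some(day.cost_usd * 1_000_000.0 / day.tokens as f64)
    }
}
```

Two hypothetical days that both total $100, one on 40 million tokens and one on 16 million, come out at roughly $2.50 and $6.25 per million. Same bill, different story.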

Then the report I didn't expect to need: billing blocks. Claude's limits reset on a rolling five-hour window, so Tycho groups usage into activity-anchored five-hour blocks that mirror the reset. For the block you're currently in, it projects where the cost lands if your rate holds: $2.05 so far, ~$4.10 projected by 17:00. That is the difference between reading a ledger and steering by one. You cannot manage a subsidy you cannot see, and you certainly cannot manage it after the window has closed.
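Here is one way the block logic could look, as a sketch rather than a description of Tycho's internals. Two assumptions are mine: "activity-anchored" is taken to mean that a block opens at the first event falling outside the previous block, and the projection is a straight-line extrapolation of the burn rate so far. Timestamps are plain seconds, and every name is hypothetical.

```rust
/// Length of one billing block: five hours, in seconds.
pub const BLOCK_SECS: u64 = 5 * 60 * 60;

/// One priced event. `ts` is seconds since some epoch.
#[derive(Debug, Clone, Copy)]
pub struct Event {
    pub ts: u64,
    pub cost_usd: f64,
}

/// A half-open window [start, end) and the cost that landed in it.
#[derive(Debug, Clone, PartialEq)]
pub struct Block {
    pub start: u64,
    pub end: u64,
    pub cost_usd: f64,
}

/// Groups events (assumed sorted by timestamp) into activity-anchored
/// blocks: an event joins the open block if it falls before that block's
/// end, otherwise it anchors a new five-hour block at its own timestamp.
pub fn group_into_blocks(events: &[Event]) -> Vec<Block> {
    let mut blocks: Vec<Block> = Vec::new();
    for ev in events {
        let fits = matches!(blocks.last(), Some(b) if ev.ts < b.end);
        if fits {
            if let Some(b) = blocks.last_mut() {
                b.cost_usd += ev.cost_usd;
            }
        } else {
            blocks.push(Block {
                start: ev.ts,
                end: ev.ts.saturating_add(BLOCK_SECS),
                cost_usd: ev.cost_usd,
            });
        }
    }
    blocks
}

/// Linear projection of where the block's cost lands if the rate so far
/// holds until the block ends. None when `now` is outside (start, end].
pub fn project_block_cost(block: &Block, now: u64) -> Option<f64> {
    if now <= block.start || now > block.end {
        return None;
    }
    let elapsed = (now - block.start) as f64;
    let total = (block.end - block.start) as f64;
    Some(block.cost_usd * total / elapsed)
}
```

A block holding $2.05 at its halfway mark projects to about $4.10, which matches the figures above if that block was half elapsed. A linear projection is the bluntest possible model, and it is deliberately so: it answers "if nothing changes" and nothing more.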

The screen that tells you what it doesn't know

The most important thing Tycho reports is what it can't be sure of. Output token counts in the transcripts can be mid-stream snapshots, so they undercount. History has a horizon, because Claude Code prunes old transcripts on a schedule. And the flagship number carries a label of its own: the no-cache figure is a modeled ceiling, the same tokens priced uncached at list rates, not a bill anyone would actually have paid. The README says it plainly: these are estimates for trend analysis, not an invoice reconciliation tool.

I put that front and center on purpose, because it is the same argument the essays kept circling. Value lives in verified outcomes, not tokens. A tool that reported tokens as if they were truth would be a vanity metric in a monospace font. A tool that tells you where its own numbers get soft is doing the one thing that makes measurement worth anything: being honest about the gap between what it counted and what actually happened. That honesty isn't a feature I bolted on. It is the entire reason to trust the meter.

The ledger has an hourglass shape of its own. It meters the wide end, execution, the tokens that got abundant, and it does that job well. The neck never shows up. No report captures the two weeks of sitting with June's number, or the decision that leverage, not volume, was the figure worth building around. The most expensive part of the build is the part the meter cannot see, and that is one more reason Tycho is an instrument, not a scoreboard.

Portability when the stop comes

There is a line in the itemized essay I have not been able to shake. Mid-month, a model I had been leaning on, Fable 5, was suspended by an export-control directive. Four days of work, $841, and then it was simply gone. The scariest variable, I wrote, is whether the thing you depend on is still there when the work needs to continue.

I built Tycho as though that were a design requirement, because it is. It makes no network calls. It sends no telemetry, ever. It reads files that already live on your machine, prices them from a table embedded in the binary, and it does not care which model wrote them; an unpriced model just shows up in doctor and costs nothing until you say otherwise. When a model vanishes, or a lab changes its pricing the week before an IPO, or the subsidy quietly ends, the ledger doesn't go dark with it. You still own the record of what you spent and what you got back. Portability when the stop comes isn't a tagline. It is read-only, and no network, and your files, your disk.
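The embedded-table idea is simple enough to sketch. This is an illustration, not the real table or the real code: the model names and rates are placeholders, a single blended rate per model stands in for what would really be separate input, output, and cache rates, and the function names are invented. What it shows is the behavior described above: an unknown model contributes nothing to the total and is surfaced by name instead of being guessed at.

```rust
/// Illustrative embedded price table: (model name, dollars per million
/// tokens). Names and numbers are hypothetical placeholders.
const PRICE_TABLE: &[(&str, f64)] = &[
    ("example-model-small", 1.0),
    ("example-model-large", 5.0),
];

/// Looks up a model's rate in the compiled-in table. No network, no files.
pub fn rate_for(model: &str) -> Option<f64> {
    PRICE_TABLE
        .iter()
        .find(|(name, _)| *name == model)
        .map(|(_, rate)| *rate)
}

/// Priced total plus the models that had no entry in the table.
#[derive(Debug, Default, PartialEq)]
pub struct Priced {
    pub total_usd: f64,
    pub unpriced_models: Vec<String>,
}

/// Prices (model, tokens) pairs. An unpriced model adds nothing to the
/// total and is recorded once so it can be reported to the user.
pub fn price_usage(usage: &[(&str, u64)]) -> Priced {
    let mut out = Priced::default();
    for &(model, tokens) in usage {
        match rate_for(model) {
            Some(rate) => out.total_usd += tokens as f64 * rate / 1_000_000.0,
            None => {
                if !out.unpriced_models.iter().any(|m| m.as_str() == model) {
                    out.unpriced_models.push(model.to_string());
                }
            }
        }
    }
    out
}
```

Costing an unknown model at zero undercounts, and that is the trade: a visible gap the user can close by adding a price is easier to trust than a silent guess.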

Read your own meter

So here is the recursion, resolved. I used the subsidized thing to build the instrument that measures the subsidy, and building it in an afternoon proved the point the instrument then let me watch every day: the code was never the scarce resource. Execution is cheap now, and getting cheaper. What stays expensive, and what you actually own, is knowing what to measure, being honest about what you measured, and holding the record when the thing you depended on walks out the door.

The meter is open source, MIT, built in an afternoon by the thing it measures. If you are burning tokens the way I am, you ought to be able to read your own bill.

June's essay ended with the same three words, back when following them meant a spreadsheet and two weeks. Now they fit in a command.

Read your own meter.


Tycho lives at github.com/vscarpenter/tycho-cli, with prebuilt binaries for macOS, Linux, and Windows shipping with each release. "The Subsidy and the Severance" and "The Subsidy, Itemized" are the earlier ledger. "The Agentic SDLC" named the inversion. The Bottleneck Is Never the Stack is the longer argument.

Written by Vinny Carpenter

VP Engineering · 30+ years building software

I lead engineering teams building cloud-native platforms at a Fortune 100 company. I write about engineering leadership, AI-assisted development, platform strategy, and the hard lessons that come from shipping at scale.