Track Claude Code & Codex Spend
@marginfront/code-cost-clarity is a small command-line tool. One command wires your coding-agent usage telemetry — from Claude Code, Codex, or both — through a local collector and into MarginFront, priced per engineer, per model, with accurate prompt-cache pricing by default.
This is internal cost visibility, not billing and not a spending cap. Your company watches its own AI coding spend (per engineer, per model). It does not charge engineers and it does not cut anyone off.
What this does (one sentence)
Every time an engineer runs Claude Code or Codex, this tool captures the token usage and sends it to MarginFront as a usage event, so you can see who used what, on which model, and how much it cost.
The mental model (read this first)
Think of it like a cash-register receipt system:
- Claude Code (and Codex) is the register. As it works, it broadcasts receipts: how many tokens, which model, which engineer.
- The collector (`otelcol-contrib`, open source) is the catcher. It runs in the background, catches the receipts, and writes them to a file.
- The forwarder (the glue inside this tool) reads those receipts and sends each one to MarginFront.
- MarginFront records the event, prices it, and shows it under the engineer’s email.
Claude Code reports usage on a timer (`OTEL_METRIC_EXPORT_INTERVAL`) plus a final flush when the session ends. Set the interval high (for example, 10 minutes) and you effectively get one event per turn, recorded when the turn finishes. Set it low (for example, 5 seconds) and you get a live drip as you work. Either way the collector keeps repeated reports from double-counting.
Quick start
Press Ctrl-C in the `run` terminal to stop; it shuts the collector down too.
Also use Codex? See Also capture Codex — it’s one config block and the same run.
The commands
| Command | What it does |
|---|---|
| `init` | Creates `~/.marginfront-ccc/`, writes your settings and the collector config, and downloads the collector binary. On a normal terminal it also asks you to paste your secret key (`mf_sk_...`) and saves it for you, so there is nothing to hand-edit. Also prints an optional block you can paste into Codex to capture it too. Safe to re-run: it never re-asks once a key is saved, and never overwrites it. Add `--no-prompt` on a server or in a script (writes the blank settings file and asks nothing). |
| `preview <capture.json>` | Prints the exact record it would send for one captured snapshot (Claude Code or Codex). Needs no API key, which makes it great for a dry run. |
| `run` | Starts the collector in the background and streams your live spend to MarginFront from Claude Code, Codex, or both, with no extra flag. Ctrl-C stops both. Add `--fold-cache` only for an unpriced model (see below). |
| `stop` | Stops a collector that was left running in the background. |
| `uninstall` | Stops everything and deletes the collector binary and runtime files (reclaims the ~360 MB). Keeps your settings. Add `--purge` to also delete your settings and API key. |
| `help`, `version` | The usual. |
Per-engineer attribution is automatic
What makes “who spent what” work is the engineer’s email, and Claude Code puts it in the telemetry on its own — it’s the engineer’s logged-in Claude account email. There is no manual email setup. Each engineer just loads the telemetry settings and runs Claude Code normally. Works under an API key or an interactive login. On org-managed Claude seats the email is stamped for free. If an engineer’s login mode doesn’t surface an email, the tool doesn’t drop the usage — it attributes it to a clearly labeled placeholder customer (claude-code-no-identity) and prints how to fix it (sign in with an org-managed seat, or attach a customer mapping). You’ll see the placeholder in preview or run output if it ever kicks in.
Codex attributes the same way, with one wrinkle: whether your email shows up depends on how you sign in (an org or API-key sign-in stamps it; some interactive ChatGPT logins may not). See Also capture Codex.
Your MarginFront API key is separate from the engineer’s identity: it’s the tool’s own credential for posting to MarginFront. `run` needs it; `preview` does not.
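The fallback rule above can be sketched in a few lines. This is an illustration only: `resolveCustomerId` is a hypothetical helper name, and the real tool may normalize emails differently. The placeholder value is the one named in the text.

```typescript
// Placeholder customer named in the text for sessions that carry no email.
const NO_IDENTITY_CUSTOMER = "claude-code-no-identity";

// Hypothetical helper: choose the customer a usage event is attributed to.
function resolveCustomerId(email: string | undefined): string {
  const trimmed = (email ?? "").trim();
  // Usage is never dropped: a missing or blank email falls back to the placeholder.
  return trimmed.length > 0 ? trimmed : NO_IDENTITY_CUSTOMER;
}
```

With this shape, a session with no email still produces a record, and the placeholder makes the gap easy to spot and fix later.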
The one honest caveat: cache pricing
Claude Code reports four kinds of tokens (fresh input, output, cache-read, and cache-creation), and Anthropic prices the three input-side kinds at three different rates. MarginFront prices all of them, so by default this tool sends the cache-read and cache-creation tokens in their own fields and the dollar figure is accurate.
- Default (recommended): accurate. Cache tokens go in the `cacheReadTokens` and `cacheWriteTokens` fields; MarginFront prices each at its real cache rate.
- `--fold-cache` (emergency round-up only): for a model MarginFront can’t price yet, this rolls the cache tokens into billed input at the fresh-input rate, pushing the number up toward reality so it’s never silently low. Treat it as a conservative ceiling. Use it only when a model has no cache price; otherwise the default is more accurate.
Claude Code’s own cache-accurate dollar cost is also stored in `metadata.claudeCodeCostUsd` as ground truth to reconcile against.
Heads up: a long-context model id like `claude-opus-4-8[1m]` is normalized to `claude-opus-4.8` to match MarginFront’s pricing table. The raw id is kept in `metadata.rawModel`.
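As a rough illustration of that normalization, the sketch below handles only the one shape shown above (a bracketed suffix plus a dash-separated version). The two regular expressions and the `normalizeModelId` name are assumptions; the tool’s actual rules may cover more cases.

```typescript
interface NormalizedModel {
  model: string;    // id used for the pricing lookup
  rawModel: string; // original id, as kept in metadata.rawModel
}

// Hypothetical helper covering only the documented example shape.
function normalizeModelId(rawModel: string): NormalizedModel {
  // Drop a trailing bracketed suffix such as "[1m]".
  const withoutSuffix = rawModel.replace(/\[[^\]]*\]$/, "");
  // Turn a trailing "-4-8" style version into "-4.8" (assumed rule).
  const model = withoutSuffix.replace(/-(\d{1,2})-(\d{1,2})$/, "-$1.$2");
  return { model, rawModel };
}
```

Ids that do not end in that pattern pass through unchanged, so an unexpected id surfaces as a pricing miss rather than being silently rewritten.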
Also capture Codex
The same tool captures Codex too, and there’s nothing extra to install, because Codex reports to the same local collector. You get three setups for free: Claude Code only, Codex only, or both (the same engineer’s email shows up across all of it).
Turn it on. Add an `[otel]` block to your user-level `~/.codex/config.toml` (Codex ignores project-level config for this), pointing at the same collector:
`run` already watches both sources from the one collector file, so there’s no extra flag. `init` prints this same block at the end, so you have it handy.
Why Codex’s numbers are counted differently (the important part)
Anthropic and OpenAI count tokens differently, and getting this wrong silently mis-charges you. For Codex, two of the token counts are already inside the others:
- The cached tokens are already part of the input count.
- The reasoning tokens are already part of the output count.
| MarginFront field | From Codex | Why |
|---|---|---|
| `inputTokens` | input count − cached count | the fresh, non-cached input only |
| `cacheReadTokens` | cached count | priced at the cheaper cache-read rate |
| `outputTokens` | output count | reasoning is already inside this, billed at the output rate as OpenAI bills it |
| (no cacheWrite) | — | OpenAI has no cache-creation tokens |
The raw cached and reasoning counts are still kept in `metadata` for your own audit; they’re just never billed twice. (If you ever hit a Codex model MarginFront can’t price yet, `--fold-cache` works the same conservative way: it bills the whole input at the input rate and drops the cache-read field, so the number reads high, never low.)
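The table above translates into a small amount of arithmetic. The following is a minimal sketch, not the tool’s actual source: the type and function names are hypothetical, and the clamp that keeps fresh input from going negative is an added assumption.

```typescript
interface CodexTurnCounts {
  inputTokenCount: number;     // already includes the cached tokens
  outputTokenCount: number;    // already includes the reasoning tokens
  cachedTokenCount: number;
  reasoningTokenCount: number; // kept for audit only, never added to output
}

interface CodexBilledTokens {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens?: number;
}

// Hypothetical helper mirroring the mapping table.
function mapCodexTokens(c: CodexTurnCounts, foldCache = false): CodexBilledTokens {
  if (foldCache) {
    // Conservative mode: bill the whole input at the input rate, no cache-read field.
    return { inputTokens: c.inputTokenCount, outputTokens: c.outputTokenCount };
  }
  // Assumption: clamp so a malformed record can never yield negative fresh input.
  const cached = Math.min(c.cachedTokenCount, c.inputTokenCount);
  return {
    inputTokens: c.inputTokenCount - cached, // fresh, non-cached input only
    cacheReadTokens: cached,
    outputTokens: c.outputTokenCount,        // reasoning is already inside this
  };
}
```

For a turn with 1,000 input tokens (600 of them cached) and 300 output tokens, the default mapping bills 400 fresh input, 600 cache-read, and 300 output. Adding the reasoning count on top of the output would double-bill it.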
Whether your email shows up depends on how you sign in
Per-engineer attribution rides on the engineer’s email from Codex’s telemetry, the same idea as Claude Code. Whether that email surfaces depends on the sign-in: an org or API-key sign-in stamps it; some interactive ChatGPT logins may not. If it doesn’t surface, the usage is not dropped. It lands under the `claude-code-no-identity` placeholder with a fix hint, exactly like the Claude Code path.
Good to know
- A `gpt-5-codex` pricing row must exist first. A full-rate row (including its cache rate) has to be in MarginFront’s pricing catalog. Until it is, a Codex event lands as `NEEDS_COST_BACKFILL`: a visible gap, never a silent `$0`. Adding that row is a one-time catalog step.
- `codex-auto-review` is skipped. It’s an internal pseudo-model, not a real billable model, so the tool drops it rather than mis-price it.
- Some Codex subcommands have known telemetry bugs. On certain versions, commands like `codex exec` or `codex mcp-server` don’t emit usage. If a session’s spend doesn’t show up, check your Codex version.
- The exact `[otel]` key names can vary by Codex version. The block above matches the documented shape; if your Codex version differs, adjust the key names and keep the endpoint pointed at the local collector.
One honest caveat: confirm the exact Codex dollar figure against one real captured session — both that Codex reports per-turn (not running-total) token counts, and that your engineer email shows up under your sign-in mode. The mapping above is the verified-safe default; the confirmation is a quick one-session check, not a reason to wait before installing.
Confirm it landed (independent read-back)
Pull the most recent events for one engineer straight from the API:
Maintain / shut off
- Bump the collector version: the install pins a known-good collector release. To use a different one for a run, set `CCC_OTELCOL_VERSION` before `init`.
- Stop temporarily: Ctrl-C the `run` terminal, or `stop` if it’s in the background. Re-run `run` to resume.
- Remove it: `uninstall` frees the ~360 MB and stops everything but keeps your key. `uninstall --purge` deletes your settings too. If you added the telemetry exports to your `~/.zshrc`, remove them there by hand.
Troubleshooting
- “No MARGINFRONT_API_KEY found”: paste your key into `~/.marginfront-ccc/.env`, or `export MARGINFRONT_API_KEY=...` in the `run` shell. `preview` works without one.
- HTTP 401 / 403: wrong or expired key, or you pasted a publishable key (`mf_pk_*`) where a secret key (`mf_sk_*`) is required. The forwarder sends your usage, and publishable keys are rejected on writes. Pull a fresh secret key from your MarginFront dashboard under Developer Zone → API Keys.
- HTTP 422 / validation error: body-shape mismatch. Run `preview <capture.json>` and compare the record.
- Collector file stays empty: you’re probably on gRPC/4317. The settings default to http/protobuf/4318, which is the transport that actually works with Claude Code; make sure you `source`d the env file.
- `usageCost` is null: the normalized model id didn’t match the pricing table. Tokens are still recorded; the cache-accurate cost is in `metadata.claudeCodeCostUsd`.
- Numbers ballooning: the collector’s delta conversion isn’t running. Re-run `init` to rewrite the collector config.
- Seeing `claude-code-no-identity`? Your login didn’t surface an email. Use an org-managed Claude seat, or attach a customer mapping. (Same placeholder applies to Codex when a ChatGPT login doesn’t surface an email.)
- Codex spend not showing up? Confirm the `[otel]` block is in your user-level `~/.codex/config.toml` (not a project config), the endpoint is `http://127.0.0.1:4318`, and `run` is going. Some Codex versions/subcommands have telemetry bugs, so check your version.
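For the 401 / 403 case, a quick prefix check can tell you which kind of key you are holding before you send anything. This is a hypothetical helper, not part of the tool; it relies only on the `mf_sk_` and `mf_pk_` prefixes mentioned above.

```typescript
type MarginFrontKeyKind = "secret" | "publishable" | "unknown";

// Hypothetical helper: classify a key by the prefixes named in this document.
function classifyApiKey(key: string): MarginFrontKeyKind {
  if (key.startsWith("mf_sk_")) return "secret";      // required for writes
  if (key.startsWith("mf_pk_")) return "publishable"; // rejected on writes
  return "unknown";
}
```

A `"publishable"` or `"unknown"` result is a strong hint to pull a fresh secret key from the dashboard before retrying.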
Security
- Your API key never lives in the package. It’s saved only in `~/.marginfront-ccc/.env` (mode 600) on your machine. The forwarder reads it from the environment only: never hardcoded, never logged.
- The published package contains only the built code and its README. Captured telemetry, the collector binary, and your `.env` all stay on the machine that runs this tool and out of the package.
For engineers (technical appendix)
Input shape: OTLP/JSON, `resourceMetrics[].scopeMetrics[].metrics[]`. Two metrics matter: `claude_code.token.usage` (one datapoint per `type` in `input`/`output`/`cacheRead`/`cacheCreation`) and `claude_code.cost.usage` (USD).
Grouping key: (user.email, model, session.id) → one MarginFront record per group.
Token mapping: input→inputTokens, output→outputTokens, cacheRead→cacheReadTokens, cacheCreation→cacheWriteTokens (Anthropic’s cache_creation_input_tokens). With --fold-cache, cache tokens are added into inputTokens instead and the typed fields are omitted (no double count).
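The grouping and token mapping can be sketched together. The example assumes the OTLP datapoints have already been flattened into simple objects and converted to deltas; the `TokenPoint` shape, the helper name, and the JSON-array group key are illustrative choices, not the tool’s actual internals.

```typescript
type ClaudeTokenType = "input" | "output" | "cacheRead" | "cacheCreation";

// Assumed flattened form of one claude_code.token.usage datapoint (already a delta).
interface TokenPoint {
  email: string;
  model: string;
  sessionId: string;
  type: ClaudeTokenType;
  value: number;
}

interface ClaudeRecordTokens {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens?: number;
  cacheWriteTokens?: number;
}

// Hypothetical helper: one set of token fields per (user.email, model, session.id) group.
function groupClaudeTokens(points: TokenPoint[], foldCache = false): Map<string, ClaudeRecordTokens> {
  const totals = new Map<string, Record<ClaudeTokenType, number>>();
  for (const p of points) {
    const key = JSON.stringify([p.email, p.model, p.sessionId]);
    const t: Record<ClaudeTokenType, number> =
      totals.get(key) ?? { input: 0, output: 0, cacheRead: 0, cacheCreation: 0 };
    t[p.type] += p.value;
    totals.set(key, t);
  }
  const records = new Map<string, ClaudeRecordTokens>();
  totals.forEach((t, key) => {
    records.set(
      key,
      foldCache
        ? // Fold: cache tokens join inputTokens and the typed fields are omitted.
          { inputTokens: t.input + t.cacheRead + t.cacheCreation, outputTokens: t.output }
        : {
            inputTokens: t.input,
            outputTokens: t.output,
            cacheReadTokens: t.cacheRead,
            cacheWriteTokens: t.cacheCreation,
          }
    );
  });
  return records;
}
```

Because the folded branch omits the typed cache fields entirely, the same tokens can never be counted both as input and as cache.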
Temporality: Claude Code emits cumulative counters. The collector converts them to deltas; the forwarder trusts each line is already an increment.
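To make the cumulative-versus-delta distinction concrete, here is a small illustration of the idea. In this setup the collector performs the conversion, not the forwarder, so the class below is purely explanatory. Its handling of the first report and of counter resets is an assumption and may differ from the collector’s configured behavior.

```typescript
// Illustrative only: turns a running total per series into per-report increments.
class DeltaTracker {
  private last = new Map<string, number>();

  // Returns the increment since the previous report for this series.
  next(seriesKey: string, cumulative: number): number {
    const prev = this.last.get(seriesKey);
    this.last.set(seriesKey, cumulative);
    if (prev === undefined) return cumulative; // assumption: first report counts from zero
    if (cumulative < prev) return cumulative;  // assumption: a drop means the counter restarted
    return cumulative - prev;
  }
}
```

If this step is skipped and the forwarder treats each cumulative value as an increment, every report re-adds the whole running total, which is the “numbers ballooning” symptom described under Troubleshooting.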
Ingest: POST https://api.marginfront.com/v1/sdk/usage/record, headers Content-Type: application/json and x-api-key: <key> (not Bearer). Body envelope { records: [...] }. The endpoint auto-creates the customer (by customerExternalId) and agent (by agentCode) on first POST, and resolves your org from the API key (the body can’t override it).
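A minimal sketch of assembling that request follows. It uses only the URL, headers, and envelope described above. The `buildIngestRequest` name and the `IngestRequest` shape are hypothetical, and the record contents are left to the caller because the full record schema is not spelled out here.

```typescript
interface IngestRequest {
  url: string;
  method: "POST";
  headers: { "Content-Type": string; "x-api-key": string };
  body: string;
}

// Hypothetical helper: builds the request without sending it.
function buildIngestRequest(apiKey: string, records: object[]): IngestRequest {
  return {
    url: "https://api.marginfront.com/v1/sdk/usage/record",
    method: "POST",
    // The key travels in x-api-key, not in an Authorization: Bearer header.
    headers: { "Content-Type": "application/json", "x-api-key": apiKey },
    body: JSON.stringify({ records }),
  };
}
```

Keeping construction separate from sending makes the payload easy to inspect in a dry run. To send it, pass `url` plus the `method`, `headers`, and `body` fields to `fetch` or any other HTTP client.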
Codex source (the second input shape)
Input shape: OTLP/JSON logs, `resourceLogs[].scopeLogs[].logRecords[]` (a different tree than Claude Code’s `resourceMetrics`). The usage event is named `codex.sse_event`; it’s detected by that name (on the `event.name` attribute or the record body) or, failing that, by the presence of any token-count attribute.
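The detection rule reads naturally as a predicate over one log record. The sketch below assumes the standard OTLP/JSON attribute layout (`attributes` as a list of `{ key, value }` pairs) and takes the token-count attribute names from the “Token fields” paragraph; the type and function names are hypothetical.

```typescript
interface OtlpLogValue { stringValue?: string; intValue?: string | number; doubleValue?: number; }
interface OtlpLogAttribute { key: string; value: OtlpLogValue; }
interface OtlpLogRecord { body?: OtlpLogValue; attributes?: OtlpLogAttribute[]; }

const CODEX_USAGE_EVENT = "codex.sse_event";
const CODEX_TOKEN_KEYS = [
  "input_token_count",
  "output_token_count",
  "cached_token_count",
  "reasoning_token_count",
  "tool_token_count",
];

// Hypothetical predicate: is this log record a Codex usage event?
function isCodexUsageRecord(rec: OtlpLogRecord): boolean {
  const attrs = rec.attributes ?? [];
  // First choice: the event name, on the event.name attribute or in the body.
  const namedOnAttr = attrs.some(
    (a) => a.key === "event.name" && a.value.stringValue === CODEX_USAGE_EVENT
  );
  const namedInBody = rec.body?.stringValue === CODEX_USAGE_EVENT;
  if (namedOnAttr || namedInBody) return true;
  // Fallback: any token-count attribute at all.
  return attrs.some((a) => CODEX_TOKEN_KEYS.indexOf(a.key) !== -1);
}
```

The fallback means a record still counts if a Codex version renames or omits the event name, as long as it carries token counts.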
Value encoding: log attribute values arrive as intValue (a JSON string, per the protobuf int64 rule) or doubleValue (a JSON number) for token counts, and stringValue for identity fields. All three are handled.
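Decoding those three encodings might look like the following. The helper names are hypothetical, and treating a missing or unparseable count as `0` is an assumption made for the sketch.

```typescript
interface CodexAttrValue {
  intValue?: string | number; // int64 arrives as a JSON string in OTLP/JSON
  doubleValue?: number;
  stringValue?: string;
}

// Hypothetical helper: read a token count from either numeric encoding.
function readTokenCount(v: CodexAttrValue | undefined): number {
  if (v === undefined) return 0; // assumption: absent count is treated as zero
  if (v.intValue !== undefined) {
    const n = Number(v.intValue);
    return isFinite(n) ? n : 0;  // assumption: garbage is treated as zero
  }
  if (typeof v.doubleValue === "number") return v.doubleValue;
  return 0;
}

// Hypothetical helper: read an identity field such as user.email.
function readIdentity(v: CodexAttrValue | undefined): string | undefined {
  return v?.stringValue;
}
```

Reading `intValue` through `Number(...)` matters because a string like `"1200"` would otherwise concatenate instead of add when totals are summed.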
Token fields: input_token_count, output_token_count, cached_token_count, reasoning_token_count, tool_token_count, plus model, user.email, and a session id (conversation.id / session.id). The accurate nested mapping is in the Codex section above — cached is subtracted from input, reasoning is never added to output.
Granularity: one MarginFront record per codex.sse_event (one per turn). There’s no cumulative-to-delta step on the logs pipeline — that’s a metrics-only processor; each Codex log record is already one turn’s usage.
Identity: agentCode: "codex", signalName: "codex-turn", modelProvider: "openai", environment: "development". Records carry metadata.source: "codex".
Routing: both sources write one JSON document per line to the same collector file. The forwarder routes each line by which tree it has (resourceMetrics → Claude Code, resourceLogs → Codex), so one watcher handles either or both.
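The routing rule amounts to checking which top-level key a line carries. This is a minimal sketch with hypothetical names; returning `"unknown"` for blank or malformed lines is an assumption about how such lines should be handled.

```typescript
type TelemetrySource = "claude-code" | "codex" | "unknown";

// Hypothetical helper: decide which parser should handle one collector-file line.
function routeLine(line: string): TelemetrySource {
  let doc: unknown;
  try {
    doc = JSON.parse(line);
  } catch {
    return "unknown"; // assumption: skip lines that are not valid JSON
  }
  if (typeof doc !== "object" || doc === null) return "unknown";
  if ("resourceMetrics" in doc) return "claude-code"; // metrics tree
  if ("resourceLogs" in doc) return "codex";          // logs tree
  return "unknown";
}
```

Because the decision is made per line, a single watcher can read a file where Claude Code and Codex documents are interleaved.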
