Skip to main content

Track Claude Code & Codex Spend

@marginfront/code-cost-clarity is a small command-line tool. One command wires your coding-agent usage telemetry — from Claude Code, Codex, or both — through a local collector and into MarginFront, priced per engineer, per model, with accurate prompt-cache pricing by default. This is internal cost visibility, not billing and not a spending cap. Your company watches its own AI coding spend (per engineer, per model). It does not charge engineers and it does not cut anyone off.

What this does (one sentence)

Every time an engineer runs Claude Code or Codex, this tool captures the token usage and sends it to MarginFront as a usage event — so you can see who used what, on which model, and how much it cost.

The mental model (read this first)

Think of it like a cash-register receipt system:
  1. Claude Code (and Codex) is the register. As it works, it broadcasts receipts: how many tokens, which model, which engineer.
  2. The collector (otelcol-contrib, open source) is the catcher. It runs in the background, catches the receipts, and writes them to a file.
  3. The forwarder (the glue inside this tool) reads those receipts and sends each one to MarginFront.
  4. MarginFront records the event, prices it, and shows it under the engineer’s email.
Claude Code (your task)
    |  broadcasts usage on a timer (you choose the interval) + once at session end
    v
Collector (otelcol-contrib, local)  --- converts cumulative->delta so nothing double-counts
    |  writes usage lines to a local file
    v
Forwarder (this tool, `run`)
    |  POST /v1/sdk/usage/record
    v
MarginFront (cost dashboard)
Codex feeds the same catcher, so the collector and forwarder don’t change — turning Codex on is one config block. See Also capture Codex below. How often does it report? It is timer-based, not per-message. Claude Code flushes usage on the export interval (OTEL_METRIC_EXPORT_INTERVAL) plus a final flush when the session ends. Set the interval high (for example, 10 minutes) and you effectively get one event per turn, recorded when the turn finishes. Set it low (for example, 5 seconds) and you get a live drip as you work. Either way the collector keeps repeated reports from double-counting.

Quick start

# 1. Install + fetch the collector (~360 MB, one time). init asks you to paste
#    your MarginFront secret key (mf_sk_...) and saves it for you — no file to
#    hand-edit. Get the key in the dashboard under Developer Zone -> API Keys.
#    Use the secret key (mf_sk_...); a publishable key (mf_pk_...) is rejected.
#    Add --no-prompt to skip the question on a server or in a script.
npx @marginfront/code-cost-clarity init

# 2. In the terminal where you'll code, turn telemetry on, then run Claude Code:
source ~/.marginfront-ccc/.env
claude

# 3. In another terminal, start streaming your spend to MarginFront:
npx @marginfront/code-cost-clarity run
That’s it. You’ll see a line per turn like:
[14:22:07] recorded [email protected] · in=120 out=48 · server=$0.30 · cc=$0.30 event=evt_...
Press Ctrl-C in the run terminal to stop; it shuts the collector down too. Also use Codex? See Also capture Codex — it’s one config block and the same run.

The commands

CommandWhat it does
initCreates ~/.marginfront-ccc/, writes your settings and the collector config, and downloads the collector binary. On a normal terminal it also asks you to paste your secret key (mf_sk_...) and saves it for you — nothing to hand-edit. Also prints an optional block you can paste into Codex to capture it too. Safe to re-run — it never re-asks once a key is saved, and never overwrites it. Add --no-prompt on a server or in a script (writes the blank settings file and asks nothing).
preview <capture.json>Prints the exact record it would send for one captured snapshot (Claude Code or Codex). Needs no API key — great for a dry run.
runStarts the collector in the background and streams your live spend to MarginFront — from Claude Code, Codex, or both, no extra flag. Ctrl-C stops both. Add --fold-cache only for an unpriced model (see below).
stopStops a collector that was left running in the background.
uninstallStops everything and deletes the collector binary and runtime files (reclaims the ~360 MB). Keeps your settings. Add --purge to also delete your settings and API key.
help, versionThe usual.

Per-engineer attribution is automatic

What makes “who spent what” work is the engineer’s email, and Claude Code puts it in the telemetry on its own — it’s the engineer’s logged-in Claude account email. There is no manual email setup. Each engineer just loads the telemetry settings and runs Claude Code normally. Works under an API key or an interactive login. On org-managed Claude seats the email is stamped for free. If an engineer’s login mode doesn’t surface an email, the tool doesn’t drop the usage — it attributes it to a clearly labeled placeholder customer (claude-code-no-identity) and prints how to fix it (sign in with an org-managed seat, or attach a customer mapping). You’ll see the placeholder in preview or run output if it ever kicks in. Codex attributes the same way, with one wrinkle: whether your email shows up depends on how you sign in (an org or API-key sign-in stamps it; some interactive ChatGPT logins may not). See Also capture Codex.
Your MarginFront API key is separate from the engineer’s identity: it’s the tool’s own credential for posting to MarginFront. run needs it; preview does not.

The one honest caveat: cache pricing

Claude Code reports four kinds of tokens — fresh input, output, cache-read, and cache-creation — and Anthropic charges three different prices for them. MarginFront prices all of them, so by default this tool sends the cache-read and cache-creation tokens in their own fields and the dollar figure is accurate.
  • Default (recommended): accurate. Cache tokens go in the cacheReadTokens and cacheWriteTokens fields; MarginFront prices each at its real cache rate.
  • --fold-cache (emergency round-up only): for a model MarginFront can’t price yet, this rolls the cache tokens into billed input at the fresh-input rate, pushing the number up toward reality so it’s never silently low — a conservative ceiling. Use it only when a model has no cache price; otherwise the default is more accurate.
Either way the raw cache numbers are recorded in metadata, and Claude Code’s own cache-accurate dollar cost is stored in metadata.claudeCodeCostUsd as ground truth to reconcile against.
Heads up: a long-context model id like claude-opus-4-8[1m] is normalized to claude-opus-4.8 to match MarginFront’s pricing table. The raw id is kept in metadata.rawModel.

Also capture Codex

The same tool captures Codex too, and there’s nothing extra to install — Codex reports to the same local collector. You get three setups for free: Claude Code only, Codex only, or both (the same engineer’s email shows up across all of it). Turn it on. Add an [otel] block to your user-level ~/.codex/config.toml (Codex ignores project-level config for this), pointing at the same collector:
[otel]
exporter = "otlp-http"
endpoint = "http://127.0.0.1:4318"
protocol = "binary"
Then run Codex normally. run already watches both sources from the one collector file, so there’s no extra flag. init prints this same block at the end, so you have it handy.

Why Codex’s numbers are counted differently (the important part)

Anthropic and OpenAI count tokens differently, and getting this wrong silently mis-charges you. For Codex, two of the token counts are already inside the others:
  • The cached tokens are already part of the input count.
  • The reasoning tokens are already part of the output count.
So this tool does not add reasoning on top of output, and does not bill input and cached as if they were separate. The accurate split it sends is:
MarginFront fieldFrom CodexWhy
inputTokensinput count − cached countthe fresh, non-cached input only
cacheReadTokenscached countpriced at the cheaper cache-read rate
outputTokensoutput countreasoning is already inside this, billed at the output rate as OpenAI bills it
(no cacheWrite)OpenAI has no cache-creation tokens
The raw reasoning, tool, and cached counts are still recorded in metadata for your own audit — they’re just never billed twice. (If you ever hit a Codex model MarginFront can’t price yet, --fold-cache works the same conservative way: it bills the whole input at the input rate and drops the cache-read field, so the number reads high, never low.)

Whether your email shows up depends on how you sign in

Per-engineer attribution rides on the engineer’s email from Codex’s telemetry, the same idea as Claude Code. Whether that email surfaces depends on the sign-in: an org or API-key sign-in stamps it; some interactive ChatGPT logins may not. If it doesn’t surface, the usage is not dropped — it lands under the claude-code-no-identity placeholder with a fix hint, exactly like the Claude Code path.

Good to know

  • A gpt-5-codex pricing row must exist first. A full-rate row (including its cache rate) has to be in MarginFront’s pricing catalog. Until it is, a Codex event lands as NEEDS_COST_BACKFILL — a visible gap, never a silent $0. Adding that row is a one-time catalog step.
  • codex-auto-review is skipped. It’s an internal pseudo-model, not a real billable model, so the tool drops it rather than mis-price it.
  • Some Codex subcommands have known telemetry bugs. On certain versions, commands like codex exec or codex mcp-server don’t emit usage. If a session’s spend doesn’t show up, check your Codex version.
  • The exact [otel] key names can vary by Codex version. The block above matches the documented shape; if your Codex version differs, adjust the key names and keep the endpoint pointed at the local collector.
One honest caveat: confirm the exact Codex dollar figure against one real captured session — both that Codex reports per-turn (not running-total) token counts, and that your engineer email shows up under your sign-in mode. The mapping above is the verified-safe default; the confirmation is a quick one-session check, not a reason to wait before installing.

Confirm it landed (independent read-back)

Pull the most recent events for one engineer straight from the API:
KEY="<your MarginFront API key>"
curl -s -H "x-api-key: $KEY" \
  "https://api.marginfront.com/v1/[email protected]&limit=5&sortBy=usageDate&sortOrder=desc"

Maintain / shut off

  • Bump the collector version: the install pins a known-good collector release. To use a different one for a run, set CCC_OTELCOL_VERSION before init.
  • Stop temporarily: Ctrl-C the run terminal, or stop if it’s in the background. Re-run run to resume.
  • Remove it: uninstall frees the ~360 MB and stops everything but keeps your key. uninstall --purge deletes your settings too. If you added the telemetry exports to your ~/.zshrc, remove them there by hand.

Troubleshooting

  • “No MARGINFRONT_API_KEY found” — paste your key into ~/.marginfront-ccc/.env, or export MARGINFRONT_API_KEY=... in the run shell. preview works without one.
  • HTTP 401 / 403 — wrong or expired key, or you pasted a publishable key (mf_pk_*) where a secret key (mf_sk_*) is required. The forwarder sends your usage, and publishable keys are rejected on writes. Pull a fresh secret key from your MarginFront dashboard under Developer Zone → API Keys.
  • HTTP 422 / validation error — body-shape mismatch. Run preview <capture.json> and compare the record.
  • Collector file stays empty — you’re probably on gRPC/4317. The settings default to http/protobuf/4318, which is the transport that actually works with Claude Code; make sure you sourced the env file.
  • usageCost is null — the normalized model id didn’t match the pricing table. Tokens are still recorded; the cache-accurate cost is in metadata.claudeCodeCostUsd.
  • Numbers ballooning — the collector’s delta conversion isn’t running. Re-run init to rewrite the collector config.
  • Seeing claude-code-no-identity? — your login didn’t surface an email. Use an org-managed Claude seat, or attach a customer mapping. (Same placeholder applies to Codex when a ChatGPT login doesn’t surface an email.)
  • Codex spend not showing up? — confirm the [otel] block is in your user-level ~/.codex/config.toml (not a project config), the endpoint is http://127.0.0.1:4318, and run is going. Some Codex versions/subcommands have telemetry bugs — check your version.

Security

  • Your API key never lives in the package. It’s saved only in ~/.marginfront-ccc/.env (mode 600) on your machine. The forwarder reads it from the environment only — never hardcoded, never logged.
  • The published package contains only the built code and its README. Captured telemetry, the collector binary, and your .env are all kept off the machine that runs this tool and out of the package.

For engineers (technical appendix)

Input shape: OTLP/JSON — resourceMetrics[].scopeMetrics[].metrics[]. Two metrics matter: claude_code.token.usage (one datapoint per type in input/output/cacheRead/cacheCreation) and claude_code.cost.usage (USD). Grouping key: (user.email, model, session.id) → one MarginFront record per group. Token mapping: inputinputTokens, outputoutputTokens, cacheReadcacheReadTokens, cacheCreationcacheWriteTokens (Anthropic’s cache_creation_input_tokens). With --fold-cache, cache tokens are added into inputTokens instead and the typed fields are omitted (no double count). Temporality: Claude Code emits cumulative counters. The collector converts them to deltas; the forwarder trusts each line is already an increment. Ingest: POST https://api.marginfront.com/v1/sdk/usage/record, headers Content-Type: application/json and x-api-key: <key> (not Bearer). Body envelope { records: [...] }. The endpoint auto-creates the customer (by customerExternalId) and agent (by agentCode) on first POST, and resolves your org from the API key (the body can’t override it).

Codex source (the second input shape)

Input shape: OTLP/JSON logsresourceLogs[].scopeLogs[].logRecords[] (a different tree than Claude Code’s resourceMetrics). The usage event is named codex.sse_event; it’s detected by that name (on the event.name attribute or the record body) or, failing that, by the presence of any token-count attribute. Value encoding: log attribute values arrive as intValue (a JSON string, per the protobuf int64 rule) or doubleValue (a JSON number) for token counts, and stringValue for identity fields. All three are handled. Token fields: input_token_count, output_token_count, cached_token_count, reasoning_token_count, tool_token_count, plus model, user.email, and a session id (conversation.id / session.id). The accurate nested mapping is in the Codex section above — cached is subtracted from input, reasoning is never added to output. Granularity: one MarginFront record per codex.sse_event (one per turn). There’s no cumulative-to-delta step on the logs pipeline — that’s a metrics-only processor; each Codex log record is already one turn’s usage. Identity: agentCode: "codex", signalName: "codex-turn", modelProvider: "openai", environment: "development". Records carry metadata.source: "codex". Routing: both sources write one JSON document per line to the same collector file. The forwarder routes each line by which tree it has (resourceMetrics → Claude Code, resourceLogs → Codex), so one watcher handles either or both.