
Integrate MarginFront with Non-LLM Tools

Not everything your agent does is an LLM call. Your agent might send SMS messages via Twilio, scrape web pages, generate PDFs, send emails, process images, or transcribe audio. MarginFront tracks all of it through the same event endpoint.

What this does for you: Every tool your agent uses, LLM or not, shows up in one dashboard. You see the full cost picture, not just the AI part.

How non-LLM events differ from LLM events

Same endpoint (POST /v1/usage/record), same SDK method (mf.usage.record()). The difference is which fields carry the “how much” information:
| Event type | "How much" field | Example |
| --- | --- | --- |
| LLM | inputTokens + outputTokens | 523 input tokens, 117 output tokens |
| Non-LLM | quantity | 1 SMS sent, 12 pages scraped |
For non-LLM events, model and modelProvider are descriptive labels you choose. Use something readable — they’ll show up in your dashboard and analytics. inputTokens and outputTokens are always optional; leave them out for non-LLM work.
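The difference can be made concrete with a small local type. This is only a sketch: `UsageEvent` and `howMuch` are hypothetical names invented for illustration, not SDK exports, and the field names are simply the ones used on this page.

```typescript
// Hypothetical local type; field names follow this page, not the SDK's own typings.
interface UsageEvent {
  customerExternalId: string;
  agentCode: string;
  signalName: string;
  model: string;
  modelProvider: string;
  quantity?: number;
  inputTokens?: number;
  outputTokens?: number;
}

// Lists whichever "how much" fields an event actually carries.
function howMuch(event: UsageEvent): string {
  const parts: string[] = [];
  if (event.quantity !== undefined) parts.push(`quantity=${event.quantity}`);
  if (event.inputTokens !== undefined) parts.push(`inputTokens=${event.inputTokens}`);
  if (event.outputTokens !== undefined) parts.push(`outputTokens=${event.outputTokens}`);
  return parts.join(", ");
}

// Non-LLM event: quantity carries the volume.
const smsEvent: UsageEvent = {
  customerExternalId: "acme-001",
  agentCode: "notification-bot",
  signalName: "sms-sent",
  model: "sms-send",
  modelProvider: "twilio",
  quantity: 1,
};

// LLM event: token counts carry the volume.
const chatEvent: UsageEvent = {
  customerExternalId: "acme-001",
  agentCode: "cs-bot-v2",
  signalName: "messages",
  model: "gpt-4o",
  modelProvider: "openai",
  inputTokens: 523,
  outputTokens: 117,
};

console.log(howMuch(smsEvent));
console.log(howMuch(chatEvent));
```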

Prerequisites

npm install @marginfront/sdk

import { MarginFrontClient } from "@marginfront/sdk";

// One client, reused for all events (LLM and non-LLM)
const mf = new MarginFrontClient(process.env.MF_API_SECRET_KEY!);
You don’t need to create the agent or signal first. When you fire your first event with a new agentCode or signalName, MarginFront creates them automatically. The same goes for customerExternalId. You can rename and enrich any of them in the dashboard later.

Example 1: Twilio SMS

Your agent sends a text message to a customer’s phone number. You want to track each SMS as one unit of usage.
import twilio from "twilio";

const twilioClient = twilio(
  process.env.TWILIO_ACCOUNT_SID,
  process.env.TWILIO_AUTH_TOKEN,
);

// Send the SMS first
const smsResult = await twilioClient.messages.create({
  body: "Your order has shipped! Tracking: ABC123",
  from: process.env.TWILIO_PHONE_NUMBER,
  to: customerPhone,
});

// Then tell MarginFront about it
await mf.usage.record({
  customerExternalId: customerId,
  agentCode: "notification-bot",
  signalName: "sms-sent",
  model: "sms-send", // descriptive label -- you pick this
  modelProvider: "twilio", // who provided the service
  quantity: 1, // one SMS was sent
});
Why model: 'sms-send'? For non-LLM services, model is just a label that helps you identify what happened when you look at the dashboard later. Pick something readable. Other good options: 'outbound-sms', 'sms-notification', 'transactional-sms'.

Example 2: Web scraping (variable quantity)

Your research agent scrapes a website and returns multiple pages of content. The number of pages varies per job, so quantity changes each time.
// Scrape the target URL
const pages = await scraper.scrape(targetUrl);

// Process the scraped content (whatever your agent does with it)
const summary = await processPages(pages);

// Track the usage -- quantity is how many pages were scraped
await mf.usage.record({
  customerExternalId: customerId,
  agentCode: "research-bot",
  signalName: "pages-scraped",
  model: "web-scraper", // descriptive label for the tool
  modelProvider: "internal", // "internal" works for tools you built yourself
  quantity: pages.length, // could be 3, could be 50 -- depends on the site
});
Why modelProvider: 'internal'? If the tool is something you built (not a third-party service), use 'internal' as the provider. MarginFront won’t find it in any pricing table, so the cost will be null — which is fine. You’re tracking the signal quantity for billing purposes, not calculating LLM costs.
Fire ONE event when the whole scrape job finishes — not one event per page. quantity: pages.length is how you tell MarginFront this one job handled N pages. Looping to fire one event per page would multiply your invoice, flood your analytics, and burn API calls for no reason. Same rule for minutes of audio, images in a batch, messages in a thread — one event per outcome, quantity does the counting. See Choosing your signal name and quantity for the full rule.
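The one-event-per-job rule can be sketched as a helper that turns a finished batch into a single payload. `JobEvent` and `buildJobEvent` are hypothetical names used only for illustration; the field names are the ones shown on this page.

```typescript
// Hypothetical payload type for a single non-LLM job.
interface JobEvent {
  customerExternalId: string;
  agentCode: string;
  signalName: string;
  model: string;
  modelProvider: string;
  quantity: number;
}

// Builds ONE event for the whole batch; quantity does the counting.
function buildJobEvent<T>(
  base: Omit<JobEvent, "quantity">,
  items: readonly T[],
): JobEvent {
  return { ...base, quantity: items.length };
}

const scrapedPages = ["page-1", "page-2", "page-3"];

const jobEvent = buildJobEvent(
  {
    customerExternalId: "acme-001",
    agentCode: "research-bot",
    signalName: "pages-scraped",
    model: "web-scraper",
    modelProvider: "internal",
  },
  scrapedPages,
);

console.log(jobEvent.quantity);
```

You would pass `jobEvent` to `mf.usage.record()` once, after the job finishes, instead of calling it inside a loop over the pages.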

Example 3: PDF generation

Your reporting agent generates a multi-page PDF for a customer. You bill by the number of pages in the report.
// Generate the PDF
const pdf = await generateReport(customerData);

// Send it to the customer
res.setHeader("Content-Type", "application/pdf");
res.send(pdf.buffer);

// Track the usage
await mf.usage.record({
  customerExternalId: customerId,
  agentCode: "report-bot",
  signalName: "report-pages",
  model: "pdf-generator", // descriptive label
  modelProvider: "internal", // you built this tool
  quantity: pdf.pageCount, // 1-page report costs less than a 20-page report
});

Example 4: Email sending (via Resend, SendGrid, etc.)

import { Resend } from "resend";

const resend = new Resend(process.env.RESEND_API_KEY);

await resend.emails.send({
  from: "reports@example.com", // placeholder sender; use an address on your verified domain
  to: recipientEmail,
  subject: "Your weekly report",
  html: reportHtml,
});

await mf.usage.record({
  customerExternalId: customerId,
  agentCode: "notification-bot",
  signalName: "emails-sent",
  model: "transactional-email",
  modelProvider: "resend",
  quantity: 1,
});

Example 5: Mixing LLM and non-LLM in one workflow

Real agents often use multiple services for a single business outcome. A place-report agent generates a report about a location by searching Google for context, asking Gemini to analyze the results and write the report, then calling the Places API to attach map metadata. From the customer’s perspective that’s ONE report about ONE place. From your cost perspective, three underlying services contributed: a search API, an LLM, and a non-LLM map lookup.

Track it as ONE event with a services array. Each entry becomes a per-service cost line under one parent event. The dashboard shows one event with a rolled-up total; the Cost-by-service chart splits it across the three services.
import { GoogleGenAI } from "@google/genai";
import { Client as MapsClient } from "@googlemaps/google-maps-services-js";
import { MarginFrontClient } from "@marginfront/sdk";

const gemini = new GoogleGenAI({});
const maps = new MapsClient({});
const mf = new MarginFrontClient(process.env.MF_API_SECRET_KEY!);

async function generatePlaceReport(customerId: string, placeName: string) {
  // Step 1: Google Search API for background context on the place.
  // googleSearch stands in for your own search client; it is not defined in this snippet.
  const searchResults = await googleSearch.query({ q: placeName, num: 5 });

  // Step 2: Gemini analyzes the search results and writes the report
  const geminiResponse = await gemini.models.generateContent({
    model: "gemini-2.5-pro",
    contents: `Write a market report about ${placeName}. Sources: ${JSON.stringify(searchResults)}`,
  });
  const reportText = geminiResponse.text ?? "";

  // Step 3: Places API attaches map metadata (address, hours, rating).
  // Text Search accepts a place name; placesNearby expects lat/lng coordinates.
  const placesData = await maps.textSearch({
    params: {
      query: placeName,
      key: process.env.GOOGLE_MAPS_KEY!,
    },
  });

  // Step 4: Tell MarginFront about the WHOLE place report (one event, three services)
  await mf.usage.record({
    customerExternalId: customerId,
    agentCode: "place-report-bot",
    signalName: "place-reports",
    // Top-level quantity stays signal-level: ONE place report.
    // Per-service volume lives inside each services[] entry.
    quantity: 1,
    services: [
      {
        // Google Search API call(s) for background context
        model: "google-search",
        modelProvider: "google",
        quantity: 1, // 1 search query
      },
      {
        // Gemini analyzed the search results and wrote the report.
        // Track tokens AND the call count: useful for analytics + future
        // per-call pricing once the catalog supports it.
        model: geminiResponse.modelVersion ?? "gemini-2.5-pro",
        modelProvider: "google",
        inputTokens: geminiResponse.usageMetadata?.promptTokenCount ?? 0,
        outputTokens: geminiResponse.usageMetadata?.candidatesTokenCount ?? 0,
        quantity: 1, // 1 LLM call
      },
      {
        // Places API queries for map metadata
        model: "google-maps-places",
        modelProvider: "google",
        quantity: placesData.data.results.length, // e.g., 3 places resolved
      },
    ],
  });

  return reportText;
}
ONE event shows up in your dashboard. Cost rolls up across all three services: Google Search (priced from your service_pricing mapping if you’ve added one, null until then), Gemini tokens (calculated from the catalog), and Google Maps Places (handled the same way as Search). The “Cost by service” chart on the Cost tab shows how the total splits.
One outcome, one event. The most common mistake here is firing three mf.usage.record calls (one per service). That used to be the workaround before multi-service shipped: it triplicated the report on your dashboard, made margin math harder, and could have multi-counted the customer’s invoice when all three events shared a signal. With services[], you fire one event per business outcome regardless of how many underlying services contributed.
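One way to keep a multi-step workflow at one event is to collect service entries as each step finishes and build the payload at the end. The sketch below assumes nothing about the SDK beyond the field names on this page; `ServiceEntry`, `OutcomeEvent`, and `OutcomeTracker` are hypothetical names.

```typescript
// Hypothetical types mirroring the services[] shape shown above.
interface ServiceEntry {
  model: string;
  modelProvider: string;
  quantity?: number;
  inputTokens?: number;
  outputTokens?: number;
}

interface OutcomeEvent {
  customerExternalId: string;
  agentCode: string;
  signalName: string;
  quantity: number;
  services: ServiceEntry[];
}

// Accumulates per-service usage across steps, then emits one payload.
class OutcomeTracker {
  private readonly base: Omit<OutcomeEvent, "services">;
  private readonly services: ServiceEntry[] = [];

  constructor(base: Omit<OutcomeEvent, "services">) {
    this.base = base;
  }

  add(entry: ServiceEntry): this {
    this.services.push(entry);
    return this;
  }

  build(): OutcomeEvent {
    return { ...this.base, services: this.services.slice() };
  }
}

const tracker = new OutcomeTracker({
  customerExternalId: "acme-001",
  agentCode: "place-report-bot",
  signalName: "place-reports",
  quantity: 1, // one place report, however many services ran
});

tracker
  .add({ model: "google-search", modelProvider: "google", quantity: 1 })
  .add({ model: "gemini-2.5-pro", modelProvider: "google", inputTokens: 4500, outputTokens: 800, quantity: 1 })
  .add({ model: "google-maps-places", modelProvider: "google", quantity: 3 });

const outcomeEvent = tracker.build();
console.log(outcomeEvent.services.length);
```

The result of `tracker.build()` is what you would hand to a single `mf.usage.record()` call at the end of the workflow.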

When to use services[] vs single-service shape

Use the single-service shape (top-level model + modelProvider) when one event uses one underlying service. This is the 90% case for chatbots: an agent answers one question with GPT-4o, sends one SMS, transcribes one audio file.
// Single-service: one event, one service
await mf.usage.record({
  customerExternalId: customerId,
  agentCode: "cs-bot-v2",
  signalName: "messages",
  model: "gpt-4o",
  modelProvider: "openai",
  inputTokens: 523,
  outputTokens: 117,
});
Use services[] when one event is backed by multiple underlying services contributing to the same business outcome (the place-report example above is three services; cold-outreach pipelines are often four or more). The two shapes are mutually exclusive: send model + modelProvider OR send services[], never both, never neither. If you mix shapes, MarginFront rejects the request with a clear error pointing at the fix.
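If you would rather catch a mixed shape before the request leaves your process, a local check along these lines is one option. It is a sketch only: `detectShape` is a hypothetical helper, the error messages are invented, and treating an empty services array as "not sent" is an assumption, not documented MarginFront behavior.

```typescript
// Only the fields that decide the shape.
interface ShapeCandidate {
  model?: string;
  modelProvider?: string;
  services?: unknown[];
}

type Shape = "single-service" | "multi-service";

// Throws when the payload sends both shapes, neither shape, or half of the single-service pair.
function detectShape(event: ShapeCandidate): Shape {
  const hasTopLevel = event.model !== undefined || event.modelProvider !== undefined;
  // Assumption: an empty array counts as "no services sent".
  const hasServices = Array.isArray(event.services) && event.services.length > 0;

  if (hasTopLevel && hasServices) {
    throw new Error("Send model + modelProvider OR services[], not both");
  }
  if (!hasTopLevel && !hasServices) {
    throw new Error("Send either model + modelProvider or services[]");
  }
  if (hasTopLevel && (event.model === undefined || event.modelProvider === undefined)) {
    throw new Error("model and modelProvider must be sent together");
  }
  return hasServices ? "multi-service" : "single-service";
}

console.log(detectShape({ model: "gpt-4o", modelProvider: "openai" }));
console.log(detectShape({ services: [{ model: "google-search", modelProvider: "google" }] }));
```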

LLM services can carry tokens AND quantity together

The Gemini entry in the example above sets inputTokens, outputTokens, AND quantity: 1 simultaneously. That’s intentional. LLM services[] entries can carry all three, which lets you:
  • Track the call count alongside token totals (e.g., when retries or chain-of-thought intermediate prompts cause one outcome to trigger multiple LLM calls).
  • Layer per-call pricing on top of per-token pricing once your service_pricing catalog supports it (audit in progress).
  • Compare “calls per outcome” against “tokens per outcome” in your own reporting.
If you don’t care about call counts, omit quantity on LLM entries. Cost calculation is tokens-only for LLM services today; per-call pricing is on the roadmap.
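The "calls per outcome" versus "tokens per outcome" comparison could look like this in your own reporting code. `LlmEntry` and `llmStats` are hypothetical names, and counting an entry with no quantity as one call is an assumption made for this sketch.

```typescript
// Hypothetical shape for the LLM entries of one outcome.
interface LlmEntry {
  inputTokens: number;
  outputTokens: number;
  quantity?: number; // call count; assumed to be 1 when omitted
}

interface OutcomeStats {
  outcomes: number;
  callsPerOutcome: number;
  tokensPerOutcome: number;
}

// Each inner array holds the LLM entries recorded for one business outcome.
function llmStats(outcomes: LlmEntry[][]): OutcomeStats {
  let calls = 0;
  let tokens = 0;
  for (const entries of outcomes) {
    for (const entry of entries) {
      calls += entry.quantity ?? 1;
      tokens += entry.inputTokens + entry.outputTokens;
    }
  }
  const count = outcomes.length;
  return {
    outcomes: count,
    callsPerOutcome: count === 0 ? 0 : calls / count,
    tokensPerOutcome: count === 0 ? 0 : tokens / count,
  };
}

const stats = llmStats([
  // A report that needed a draft pass and a refinement pass
  [
    { inputTokens: 4500, outputTokens: 800, quantity: 1 },
    { inputTokens: 1200, outputTokens: 400, quantity: 1 },
  ],
  // A single-call outcome
  [{ inputTokens: 523, outputTokens: 117, quantity: 1 }],
]);

console.log(stats);
```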

Multiple calls of the same model in one event

If your agent makes two Gemini calls for one report (a draft pass + a refinement pass that both contribute to the same outcome), you have two equally valid ways to record it:

Option A: list each call as its own services[] entry. Each becomes a distinct cost line. Use this when you want per-call resolution.
services: [
  {
    model: "gemini-2.5-pro",
    modelProvider: "google",
    inputTokens: 4500,
    outputTokens: 800,
    quantity: 1,
  },
  {
    model: "gemini-2.5-pro",
    modelProvider: "google",
    inputTokens: 1200,
    outputTokens: 400,
    quantity: 1,
  },
];
Option B: aggregate into one entry with summed tokens and quantity = call count. Cleaner if you don’t need per-call resolution.
services: [
  {
    model: "gemini-2.5-pro",
    modelProvider: "google",
    inputTokens: 5700, // 4500 + 1200
    outputTokens: 1200, // 800 + 400
    quantity: 2, // 2 LLM calls aggregated
  },
];
Both produce the same rolled-up parent cost.
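If you collect per-call entries (Option A) but want to send the aggregated form (Option B), the merge is a few lines. This is a sketch with a hypothetical helper name, `aggregateByModel`; it groups on provider plus model and sums tokens and quantity.

```typescript
// Hypothetical shape for an LLM services[] entry with all three volume fields set.
interface LlmServiceEntry {
  model: string;
  modelProvider: string;
  inputTokens: number;
  outputTokens: number;
  quantity: number;
}

// Merges entries that share the same provider and model into one entry.
function aggregateByModel(entries: LlmServiceEntry[]): LlmServiceEntry[] {
  const merged = new Map<string, LlmServiceEntry>();
  for (const entry of entries) {
    const key = `${entry.modelProvider}/${entry.model}`;
    const existing = merged.get(key);
    if (existing === undefined) {
      merged.set(key, { ...entry }); // copy so the input is not mutated
    } else {
      existing.inputTokens += entry.inputTokens;
      existing.outputTokens += entry.outputTokens;
      existing.quantity += entry.quantity;
    }
  }
  return Array.from(merged.values());
}

const perCall: LlmServiceEntry[] = [
  { model: "gemini-2.5-pro", modelProvider: "google", inputTokens: 4500, outputTokens: 800, quantity: 1 },
  { model: "gemini-2.5-pro", modelProvider: "google", inputTokens: 1200, outputTokens: 400, quantity: 1 },
];

const aggregated = aggregateByModel(perCall);
console.log(aggregated);
```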

What you pick vs what MarginFront calculates

For non-LLM events, here’s what’s yours to define and what MarginFront handles:
| Field | You set it | MarginFront calculates it |
| --- | --- | --- |
| model | Yes: pick a descriptive label | No |
| modelProvider | Yes: the service name or 'internal' | No |
| quantity | Yes: how many units of work | No |
| inputTokens | Optional (leave out for non-LLM) | No |
| outputTokens | Optional (leave out for non-LLM) | No |
| Service cost | No | Yes, if model+provider is in the pricing table. Otherwise null. |
| Revenue | No | Yes, from pricing plan: quantity x price_per_unit |

Including tokens AND quantity

Some events straddle both worlds. For example, an image generation call uses tokens (for the prompt) but also produces a quantity (number of images):
await mf.usage.record({
  customerExternalId: customerId,
  agentCode: "creative-bot",
  signalName: "images-generated",
  model: "dall-e-3",
  modelProvider: "openai",
  quantity: 4, // four images generated
  inputTokens: 85, // tokens used in the prompt (optional)
  // no outputTokens for image generation
});
All three fields (quantity, inputTokens, outputTokens) are always optional. Use whichever ones are relevant to the work that happened.

Cost tracking for non-LLM tools

MarginFront’s built-in catalog covers 300+ LLM models plus a curated set of non-LLM services (Twilio SMS and voice, Google Maps, SendGrid, Hunter, Exa, and more). If your non-LLM tool isn’t in the catalog yet, the model+provider won’t match anything. That’s fine:
  • The event saves with cost = null (never dropped, never zero).
  • Revenue is still calculated from your pricing plan (quantity x price_per_unit).
  • If you want MarginFront to calculate the actual cost of a non-LLM tool (e.g., the SMS provider you use), open the Needs Attention flow on the dashboard and map your model name to its catalog entry. Cost-side mapping is optional. Many users only care about the revenue side for non-LLM tools.
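The distinction between a null cost and a zero cost matters when you total things up yourself. The sketch below uses a hypothetical record shape (`RecordedEvent`) and made-up prices in integer cents; it applies the quantity x price_per_unit rule for revenue and refuses to report a margin while any event is still uncosted.

```typescript
// Hypothetical local record; amounts are integer cents to avoid float rounding.
interface RecordedEvent {
  quantity: number;
  pricePerUnitCents: number; // from your pricing plan (made-up values below)
  costCents: number | null; // null when model+provider is not in the catalog
}

interface Summary {
  revenueCents: number;
  knownCostCents: number;
  uncostedEvents: number;
  marginCents: number | null;
}

function summarize(events: RecordedEvent[]): Summary {
  let revenueCents = 0;
  let knownCostCents = 0;
  let uncostedEvents = 0;
  for (const event of events) {
    revenueCents += event.quantity * event.pricePerUnitCents;
    if (event.costCents === null) {
      uncostedEvents += 1; // unknown cost is counted, never treated as zero
    } else {
      knownCostCents += event.costCents;
    }
  }
  // A margin is only reported when every event has a known cost.
  const marginCents = uncostedEvents === 0 ? revenueCents - knownCostCents : null;
  return { revenueCents, knownCostCents, uncostedEvents, marginCents };
}

const summary = summarize([
  { quantity: 12, pricePerUnitCents: 5, costCents: null }, // internal scraper, no catalog match
  { quantity: 1, pricePerUnitCents: 2, costCents: 1 }, // SMS with a mapped cost
]);

console.log(summary);
```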

Fire-and-forget still applies

Just like with LLM events, the SDK’s default fire-and-forget mode means:
  • If MarginFront is down, events retry automatically from a local buffer.
  • If there’s a validation error, the SDK logs a warning and moves on.
  • Your agent never stalls waiting for MarginFront.
This is especially important for non-LLM tools where the work is already done (SMS already sent, PDF already generated) — there’s no point blocking on the tracking call.
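To illustrate the idea (not the SDK's actual implementation, which this page does not show), here is a minimal fire-and-forget buffer. `BufferedRecorder` and `Sender` are hypothetical names. The sketch keeps undelivered events in memory and retries only on the next `record()` or explicit `flush()`; it has no timers and does not distinguish validation errors from outages.

```typescript
type Sender<E> = (event: E) => Promise<void>;

class BufferedRecorder<E> {
  private readonly buffer: E[] = [];
  private readonly send: Sender<E>;
  private flushing = false;

  constructor(send: Sender<E>) {
    this.send = send;
  }

  // Enqueue and return immediately; the caller never awaits the network.
  record(event: E): void {
    this.buffer.push(event);
    void this.flush();
  }

  // Number of events not yet confirmed as delivered.
  get pending(): number {
    return this.buffer.length;
  }

  // Sends buffered events in order; stops at the first failure and keeps the rest.
  async flush(): Promise<void> {
    if (this.flushing) return;
    this.flushing = true;
    try {
      while (this.buffer.length > 0) {
        const next = this.buffer[0] as E;
        try {
          await this.send(next);
          this.buffer.shift(); // drop only after a successful send
        } catch (err) {
          console.warn("usage event not delivered; keeping it buffered", err);
          break;
        }
      }
    } finally {
      this.flushing = false;
    }
  }
}

let sendCalls = 0;
const recorder = new BufferedRecorder<{ signalName: string; quantity: number }>(
  async () => {
    sendCalls += 1; // stand-in for the real network call
  },
);

recorder.record({ signalName: "sms-sent", quantity: 1 });
// record() has already returned; delivery completes in the background.
const pendingRightAfterRecord = recorder.pending;
```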

Reading back what you earned

You sent events. Now you want numbers: revenue, cost, margin, MRR. There are three canonical paths, and all of them read the same data.

Via SDK (inside your app)

import { MarginFrontClient } from "@marginfront/sdk";
const mf = new MarginFrontClient(process.env.MF_API_SECRET_KEY!);

// Last 30 days of revenue + cost + margin for one customer
const end = new Date();
const start = new Date(end.getTime() - 30 * 24 * 60 * 60 * 1000);

const revenue = await mf.analytics.revenue({
  startDate: start.toISOString(),
  endDate: end.toISOString(),
  customerId: "cus_123",
});

const cost = await mf.analytics.costBreakdown({
  startDate: start.toISOString(),
  endDate: end.toISOString(),
  customerId: "cus_123",
});

const mrr = await mf.analytics.mrr({ customerId: "cus_123" });

Via REST (from any language or curl)

curl "https://api.marginfront.com/v1/analytics/revenue?customerId=cus_123&startDate=2026-03-25&endDate=2026-04-24" \
  -H "Authorization: Bearer $MF_API_SECRET_KEY"

curl "https://api.marginfront.com/v1/analytics/cost?customerId=cus_123&startDate=2026-03-25&endDate=2026-04-24" \
  -H "Authorization: Bearer $MF_API_SECRET_KEY"

curl "https://api.marginfront.com/v1/analytics/mrr?customerId=cus_123" \
  -H "Authorization: Bearer $MF_API_SECRET_KEY"
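If you call the REST endpoints from code, building the query string programmatically avoids hand-typed dates. The sketch below only constructs URLs; the base URL, resource names, and parameter names are the ones in the curl examples above, while `lastNDays` and `analyticsUrl` are hypothetical helper names. Sending date-only values (rather than full ISO timestamps) follows the curl examples.

```typescript
const ANALYTICS_BASE_URL = "https://api.marginfront.com/v1/analytics";

// Formats a Date as YYYY-MM-DD in UTC.
function toDateOnly(date: Date): string {
  return date.toISOString().slice(0, 10);
}

// Returns the date range covering the `days` days ending at `end`.
function lastNDays(days: number, end: Date): { startDate: string; endDate: string } {
  const start = new Date(end.getTime() - days * 24 * 60 * 60 * 1000);
  return { startDate: toDateOnly(start), endDate: toDateOnly(end) };
}

// Builds an analytics URL with properly encoded query parameters.
function analyticsUrl(resource: string, params: Record<string, string>): string {
  const url = new URL(`${ANALYTICS_BASE_URL}/${resource}`);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, value);
  }
  return url.toString();
}

const range = lastNDays(30, new Date("2026-04-24T00:00:00Z"));
const revenueUrl = analyticsUrl("revenue", { customerId: "cus_123", ...range });
const mrrUrl = analyticsUrl("mrr", { customerId: "cus_123" });

console.log(revenueUrl);
console.log(mrrUrl);
```

Pass the resulting URL to `fetch` (or any HTTP client) with the same `Authorization: Bearer` header used in the curl examples.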

Via MCP (from an AI agent like Claude, Cursor, or Copilot)

If you’ve connected MarginFront’s MCP server to your AI coding assistant, the agent can call canonical analytics tools directly:
Tool: get_customer_revenue
Args: { customerExternalId: "acme-001", startDate: "2026-03-25", endDate: "2026-04-24" }

Tool: get_cost_metrics
Args: { startDate: "2026-03-25", endDate: "2026-04-24", customerExternalId: "acme-001" }

Tool: get_mrr
Args: { customerExternalId: "acme-001" }
All three paths return the same canonical numbers. Pick whichever fits your workflow.