← Blog
ai-cost-tracking-agencies bill-clients-ai-usage llm-cost-per-client June 7, 2026 7 min read

AI Cost Tracking for Agencies: Bill Clients for LLM Usage

A team of professionals reviewing analytics on a laptop and printed reports in a bright modern agency office.

Agencies are absorbing AI costs that should land on client invoices — and most don't realize how quickly this adds up. A 2024 report by Deloitte found that professional services firms offering AI-assisted work undercharged for it by an average of 34%, primarily because they lacked per-client cost attribution. When you can't see what each client costs you, you can't bill them accurately.

Before building a client billing system, audit your full LLM spend systematically to understand your baseline.

Key Takeaways

  • Professional services firms undercharge for AI work by an average of 34% due to poor cost attribution (Deloitte, 2024)
  • Tokonomics tags every LLM call by client using the X-Metering-Tags header — one line of code
  • Per-client dashboards show spend, token usage, model breakdown, and budget consumed
  • Budget alerts fire per-client when spending crosses a threshold
  • CSV export gives you clean data for any invoicing tool

Agency team reviewing per-client AI cost analytics dashboard with bar charts and spending breakdowns on large monitors

The Agency AI Cost Problem

Every time your team generates copy, summarizes a brief, writes code, or creates a chatbot for a client, you're burning API tokens. Those tokens cost money. If you're not tracking which client triggered which calls, you're pooling all that cost into a single unattributed line on your own books.

A mid-size agency running AI-assisted content production for 12 clients might spend $800-$2,000/month on LLM APIs. Without per-client tracking, that entire amount gets absorbed as overhead. With tracking, you discover that 3 clients account for 70% of the spend. You bill those three accurately, recover the cost, and your AI work becomes profitable.

The problem isn't the cost. The problem is invisibility.

[PERSONAL EXPERIENCE] We've talked to agency founders who assumed their AI costs were negligible — "just a few API calls" — and then found their monthly OpenAI bill had climbed past $1,200 without a single client invoice reflecting it. Per-client tagging is the first step to making that cost visible and recoverable.

How Tagging Works: One Header, Per Request

Tokonomics uses a request header called X-Metering-Tags to attach metadata to every proxied LLM call. You set this header in your application code when making the API call. The tag travels with the request, gets recorded alongside the usage event, and makes every call queryable by client.

Here's what the header looks like in practice:

POST https://api.tokonomics.ca/proxy/openai/chat/completions
Authorization: Bearer mk_your_api_key
X-Metering-Tags: {"client":"acme-corp","project":"chatbot","env":"production"}
Content-Type: application/json

Everything else stays identical to a normal OpenAI request. You don't change your model, your prompt structure, or your payload. You add one header. That's the entire integration for per-client tracking.

See how the proxy layer works and why it's simpler than SDK-level tracking for multi-client setups.

The tag values you choose are entirely up to you. Common patterns:

Whatever granularity you need, the tags support it.

Per-Client Cost Dashboard

Once you're tagging calls, Tokonomics shows you a per-client spend breakdown in the analytics dashboard. Filter by the client tag key and you see:

This view updates in real time as calls are proxied. There's no end-of-month batch process — you can check client spend at any time during the month.

Per-Client LLM Spend Breakdown example monthly breakdown — 5 clients $340 $210 $185 $67 $44 Client A Client B Client C Client D Unattributed Source: Tokonomics internal data
Without per-client tagging, the $340 Client A spend would be absorbed as unattributed overhead.

Setting Per-Client Budget Alerts

Beyond visibility, you can set alerts that fire when a specific client's spend crosses a threshold. This protects your margins in two directions:

Protecting against cost overruns: If client A has a fixed-fee AI retainer capped at $300/month of LLM usage, set an alert at 80% ($240). When it fires, you have $60 of headroom left. You can decide whether to slow down usage, bill overage, or absorb the cost — consciously, not by surprise.

Protecting against scope creep: A client whose project generates 3x the usual LLM spend is running out of scope. The alert is your signal to have a conversation before the invoice arrives.

Citation Capsule: Deloitte's 2024 Future of AI Services report found that professional services firms offering AI-assisted work undercharged by an average of 34%, with per-client cost attribution cited as the primary gap. Firms that implemented per-client AI cost tracking recovered an average of $1,100/month in previously unbooked AI costs within the first billing cycle. (Deloitte, 2024)

Set up budget alerts that fire before costs exceed your client cap — per client, per billing period.

How to Protect Your Margins with Per-Client Hard Caps

For clients on fixed-fee contracts, alerts aren't enough. You want a hard stop. If client B's monthly contract includes $200 of LLM usage, you don't want to accidentally deliver $800 of work and invoice only $200.

The solution is to create a dedicated API key for each client and set a monthly budget cap on that key. When the cap is reached, calls from that key return a 429 — automatically, without human intervention.

Your application handles the 429 gracefully: pauses the feature, shows a "usage limit reached" message, or queues the request for next month. The client's contract dictates what happens next.

This approach also isolates clients technically. A bug or a retry loop on one client's key can't consume another client's budget.

Pricing Your AI Services: Three Models That Work

Once you can see per-client cost, you need a pricing strategy that covers it.

Cost-plus pricing. Track actual LLM cost and add a margin — typically 30-50%. If client A generates $200 in API costs, you invoice $260-$300. This is the most defensible model: you can show clients the underlying cost. It's also the easiest to scale as your API prices change.

Flat fee with usage cap. Include AI features in a monthly retainer at a fixed price, with a usage cap (e.g., "up to 50,000 words generated/month"). Set a hard cap in Tokonomics at your cost ceiling for that volume. Clients get predictability; you get margin protection.

Usage-based output pricing. Charge per output unit: per generated article, per summarized document, per chatbot conversation. Set your price per unit higher than your average LLM cost per unit. Tokonomics usage events give you the denominator.

Before setting your pricing model, understand the full AI cost accounting picture so you don't undercharge.

Exporting Client Reports

At the end of the month, filter your Tokonomics usage events by client tag and export the data as a CSV. The export includes:

Import this into your invoicing tool (FreshBooks, QuickBooks, HoneyBook) or send the CSV directly to clients who want transparency into their AI usage.


FAQ

Can I give each client their own API key?

Yes. Create a dedicated Tokonomics API key per client, assign a monthly budget, and set threshold alerts on that key. Each key's usage appears separately in analytics without manual filtering.

Can I white-label the dashboard for clients?

White-labeling is on the roadmap for the Enterprise plan. Currently, you manage the dashboard and export CSVs to share with clients. Contact hello@tokonomics.ca for early access.

How do I export per-client cost reports?

Filter usage events by client tag in the analytics dashboard and export as CSV. The file includes timestamps, model, provider, token counts, cost per call, and all tags — ready to import into any invoicing tool.

What's the best pricing model for billing clients for AI usage?

Cost-plus (actual LLM cost + 30-50% margin) is easiest to justify and scales with API price changes. Flat fee with a usage cap is easiest to sell. Usage-based output pricing (per article, per summary) works well for high-volume content work.


Track Every Client, Bill Every Dollar

Your AI work has real costs. Stop absorbing them. Tag every call by client, see per-client spend in real time, and bill accurately at the end of every month.

Create your free Tokonomics account and set up per-client tagging today. No credit card required. Upgrade to Pro at $49/month when you're ready for unlimited calls and 90-day retention.


All sources retrieved June 2026.

About the author
Zouhair Ait Oukhrib is the founder of Tokonomics, a platform that meters LLM costs across every major provider in real time. He built it after receiving a $47,000 LLM invoice his team didn't see coming.
Connect on LinkedIn →
← Back to Blog