← Blog
ai-cost-dashboard llm-analytics ai-spending-dashboard June 6, 2026 9 min read

How to Build an Internal AI Cost Dashboard

Analytics dashboard with charts and metrics representing an internal AI cost monitoring dashboard

TL;DR — Build a usage_events table (tenant, model, tokens, cost, tags), write 3 SQL queries (daily trend, model breakdown, per-feature), and add a budget alert. Or skip 2–4 weeks of engineering and use Tokonomics — it gives you all seven views out of the box with zero backend code.

Every team using LLM APIs eventually needs a cost dashboard. The trigger is always the same — someone in leadership asks "how much are we spending on AI and where is it going?" and nobody has an answer.

Provider dashboards (OpenAI's usage page, Anthropic's console) show total spend. They don't show spend by feature, by team, by customer, or by environment. They don't let you set alerts. They don't correlate cost with business metrics. For anything beyond "how much did we spend total," you need your own dashboard.

This article covers two paths: building one from scratch (with the schema, queries, and architecture) and using a proxy that gives you one out of the box. The right choice depends on how much engineering time you want to invest.

What a useful AI cost dashboard shows

Before building anything, define what you need to see. Most teams need these seven views:

  1. Total spend this month — the number your CFO asks about
  2. Daily spend trend — a 30-day line chart that shows whether costs are stable or climbing
  3. Spend by model — which models consume the most budget
  4. Spend by feature/team — which product feature or team is responsible for each dollar
  5. Cost per request — average cost, trending over time
  6. Token breakdown — input vs output tokens, because they're priced differently
  7. Budget utilization — current spend as a percentage of your monthly budget

Advanced teams also want:

Path 1: Build it yourself

If you want full control and have engineering bandwidth, here's the architecture.

The data model

You need one table that logs every LLM API call:

CREATE TABLE llm_usage_events (
    id          CHAR(36) PRIMARY KEY,
    model       VARCHAR(100),       -- gpt-4o, claude-sonnet-4, etc.
    provider    VARCHAR(50),        -- openai, anthropic, deepseek
    input_tokens  INT UNSIGNED,
    output_tokens INT UNSIGNED,
    cost_usd    DECIMAL(12,8),      -- never use FLOAT for money
    latency_ms  INT UNSIGNED,
    tags        JSON,               -- {"team":"growth","feature":"chatbot"}
    api_key_id  CHAR(36),
    created_at  DATETIME
);

CREATE INDEX idx_usage_created ON llm_usage_events(created_at);
CREATE INDEX idx_usage_model ON llm_usage_events(model);

The tags column is critical. It lets you slice costs by any dimension — team, feature, customer, environment — without adding columns for each. Pass tags as a JSON header with every LLM call and store them alongside the usage data.

For a deep dive on per-feature tagging, see our guide on per-feature LLM cost tracking.

The interception layer

You need to capture usage data from every LLM call. Three approaches:

Wrapper function. Wrap your LLM client in a function that records usage before returning the response:

def tracked_completion(messages, model="gpt-4o", tags=None):
    start = time.time()
    response = openai.chat.completions.create(
        model=model,
        messages=messages
    )
    latency_ms = int((time.time() - start) * 1000)
    
    usage = response.usage
    cost = calculate_cost(model, usage.prompt_tokens, 
                          usage.completion_tokens)
    
    db.execute(
        "INSERT INTO llm_usage_events (...) VALUES (...)",
        model, "openai", usage.prompt_tokens, 
        usage.completion_tokens, cost, latency_ms,
        json.dumps(tags or {})
    )
    
    return response

This works but requires changing every call site. If you have 15 places that call OpenAI, you need to update all 15.

HTTP proxy. Route all LLM traffic through a reverse proxy that intercepts responses and extracts usage data. Your app doesn't change — you change the base URL:

# Before
client = OpenAI()

# After — route through your proxy
client = OpenAI(base_url="https://your-proxy.internal/openai/v1")

The proxy forwards the request, reads the usage object from the response, logs it, and passes the response through. This is the approach Tokonomics uses — and it's language-agnostic, so it works with Python, Node.js, Go, Ruby, or any HTTP client.

Provider webhooks. Some providers offer usage webhooks or export APIs. OpenAI doesn't currently offer real-time webhooks for usage, so this approach has delays and gaps.

The cost calculation

Maintaining accurate per-model pricing is tedious but necessary:

COST_PER_MILLION = {
    "gpt-4o":        {"input": 2.50, "output": 10.00},
    "gpt-4o-mini":   {"input": 0.15, "output": 0.60},
    "claude-sonnet-4-20250514": {"input": 3.00, "output": 15.00},
    "claude-haiku-3-5-20241022": {"input": 0.80, "output": 4.00},
    "deepseek-chat": {"input": 0.27, "output": 1.10},
}

def calculate_cost(model, input_tokens, output_tokens):
    rates = COST_PER_MILLION.get(model, {"input": 0.15, "output": 0.60})
    return (input_tokens * rates["input"] + 
            output_tokens * rates["output"]) / 1_000_000

You need to update this every time a provider changes pricing. OpenAI has changed GPT-4o pricing twice in the last year. Miss an update and your dashboard shows wrong numbers. For current rates across all providers, see our LLM API pricing guide.

The dashboard queries

With data flowing in, here are the SQL queries that power each widget:

Monthly spend:

SELECT SUM(cost_usd) AS total_spend
FROM llm_usage_events
WHERE created_at >= DATE_FORMAT(NOW(), '%Y-%m-01');

Daily trend (30 days):

SELECT DATE(created_at) AS day, SUM(cost_usd) AS spend
FROM llm_usage_events
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 30 DAY)
GROUP BY DATE(created_at)
ORDER BY day;

Spend by model:

SELECT model, SUM(cost_usd) AS spend, COUNT(*) AS requests
FROM llm_usage_events
WHERE created_at >= DATE_FORMAT(NOW(), '%Y-%m-01')
GROUP BY model
ORDER BY spend DESC;

Spend by feature (using JSON tags):

SELECT JSON_UNQUOTE(JSON_EXTRACT(tags, '$.feature')) AS feature,
       SUM(cost_usd) AS spend
FROM llm_usage_events
WHERE created_at >= DATE_FORMAT(NOW(), '%Y-%m-01')
GROUP BY feature
ORDER BY spend DESC;

The maintenance burden

Building a cost dashboard is a weekend project. Maintaining it is an ongoing commitment:

Expect 2-4 hours/month of maintenance to keep the dashboard accurate and fast.

Path 2: Use a proxy with a built-in dashboard

If you'd rather spend engineering time on your product instead of building internal tooling, a metering proxy gives you everything above out of the box.

Tokonomics is a proxy built specifically for this. Change your LLM base URL, add a Tokonomics API key, and you get:

The integration takes 5 minutes. You change one URL and add one header. Your app code, your prompts, your logic — nothing else changes. See the getting started guide for the exact steps.

Tokonomics dashboard — budget gauge, daily spend chart, model breakdown, and per-key analytics in one view

Build vs buy: the honest tradeoff

Factor Build yourself Use Tokonomics
Setup time 2-3 days 5 minutes
Monthly maintenance 2-4 hours 0
Customization Unlimited Dashboard + API
Multi-provider You build it Built in
Budget alerts You build it Built in
Hard caps You build it Built in
Cost Your engineer's time $49/month (Pro)
Data ownership Your database Your data, our dashboard

Build it yourself if you have unusual requirements — custom business logic, proprietary metrics, integration with internal BI tools. Use a proxy if you want the standard set of cost analytics without the engineering overhead.

Most teams start by building a basic dashboard, realize the maintenance burden after 2-3 months, and switch to a managed solution. Starting with the managed solution skips that detour.

What to do right now

If you have no cost visibility today:

  1. Start logging. Even a simple table with model, tokens, and cost per call gives you more than provider dashboards alone.
  2. Add tags. Tag every LLM call with the feature and team that triggered it. This is the foundation for all useful analysis.
  3. Set one alert. A single budget alert at 80% of your monthly target catches problems before they become invoices.
  4. Review monthly. A 30-minute monthly audit turns raw data into decisions.

The teams that control their AI costs are the ones that can see them. Dashboard first. Optimization follows.

Last updated June 2026. All sources retrieved June 2026.

About the author
Zouhair is the founder of Tokonomics. He built the platform after receiving a $47,000 LLM invoice that his team didn't see coming. He tracks LLM pricing changes weekly across all major providers.
Connect on LinkedIn →
← Back to Blog