← Blog
flowise langchain cost-tracking June 11, 2026 5 min read

How to Track LLM Costs in Flowise AI Workflows

Visual AI workflow builder representing Flowise chatflow cost tracking

TL;DR: In Flowise's ChatOpenAI or ChatAnthropic nodes, set the Base Path to https://tokonomics.ca/proxy/openai and use your Tokonomics API key. Every chain execution is metered — cost per chatflow, per model, per day. No code changes to your flows.


The Cost Blind Spot in Flowise

Flowise is a powerful open-source tool for building LLM applications visually. Drag a ChatOpenAI node, connect a prompt template, add a vector store — and you have a working AI chatbot.

But Flowise has no concept of cost. You can see how many times a chatflow ran. You cannot see that those runs consumed $280 in tokens, or that your RAG chain is sending 8,000 input tokens per query because the retriever pulls too many chunks.

AI agents in Flowise are especially dangerous for costs. An agent that calls tools, reasons over multiple steps, and retries on failure can consume 10-50x more tokens than a simple chat completion. Without cost tracking, a single agent chatflow can burn through $500 before anyone notices.


How the Integration Works

Flowise's LLM nodes (ChatOpenAI, ChatAnthropic) support a Base Path override. Set it to the Tokonomics proxy URL, and every LLM call routes through the proxy before reaching the provider.

Before:  Flowise → api.openai.com → response
After:   Flowise → tokonomics.ca/proxy/openai → api.openai.com → response

The proxy records tokens, cost, model, and latency for each call, then returns the response unchanged. Your chatflows work exactly as before.


Step-by-Step: ChatOpenAI Node

  1. Open your Flowise chatflow
  2. Click on the ChatOpenAI node
  3. Under Connect Credential, create a new OpenAI API credential:
    • API Key: mk_your_tokonomics_key
  4. In the node settings, find Base Path (under Additional Parameters)
  5. Set it to: https://tokonomics.ca/proxy/openai
  6. Save and test

Every call from this node is now metered. Check your Tokonomics dashboard to see the cost.


Step-by-Step: ChatAnthropic Node

  1. Click on the ChatAnthropic node
  2. Create a new Anthropic credential:
    • API Key: mk_your_tokonomics_key
  3. Set Base URL to: https://tokonomics.ca/proxy/anthropic
  4. Save and test

The same pattern works for any LLM node that supports a base URL override.


Tracking Agent Costs

Flowise agents (OpenAI Function Agent, ReAct Agent, Conversational Agent) make multiple LLM calls per user query. A single agent interaction might involve:

That's 4-5 LLM calls for one user message. At scale, agent costs compound fast.

With Tokonomics, each of these sub-calls is recorded individually. The dashboard shows total cost per interaction and helps you identify which agent chains are expensive and why.

Common agent cost issues:


RAG Chain Cost Optimization

RAG (Retrieval-Augmented Generation) chains in Flowise are a common source of hidden costs. The cost comes from input tokens — the more chunks your retriever returns, the more tokens you send to the LLM.

Chunks retrieved Avg chunk size Input tokens Cost per query (GPT-4o)
3 500 tokens 1,500 $0.00375
5 500 tokens 2,500 $0.00625
10 500 tokens 5,000 $0.0125

Reducing from 10 chunks to 5 cuts input cost by 50%. Most RAG applications get diminishing returns past 3-5 chunks. Use the Tokonomics dashboard to see your actual input token counts — if they're consistently high, reduce your retriever's topK parameter.

For a deep dive on prompt efficiency, see our cost optimization strategies guide.


Multi-Model Chatflows

Flowise makes it easy to chain multiple models. A common pattern:

  1. Classifier (GPT-4o-mini) — route the query to the right handler
  2. RAG chain (Claude Sonnet) — answer complex queries with context
  3. Summarizer (GPT-4o-mini) — condense the response

With Tokonomics, each model call is tracked separately. The dashboard breaks down cost by model, so you can see that Claude Sonnet accounts for 80% of your spend and optimize accordingly.


Self-Hosted Flowise: Same Integration

Flowise is typically self-hosted. The Tokonomics proxy is a URL — it doesn't matter where Flowise runs. As long as your server can make outbound HTTPS requests to tokonomics.ca, the integration works.

For Flowise running in Docker:


Frequently Asked Questions

Does this work with Flowise's streaming mode?

Yes. The Tokonomics proxy supports streaming — chunks are forwarded as they arrive from the provider. The user experience is identical to direct streaming.

Can I track costs per chatflow?

Yes, by using different Tokonomics API keys per chatflow, or by adding custom headers via Flowise's HTTP configuration. Each API key gets its own usage breakdown in the dashboard.

What about LangChain nodes in Flowise?

Flowise wraps LangChain components. Any LangChain node that calls an LLM (ChatOpenAI, ChatAnthropic, etc.) can be routed through the proxy by setting the base URL. The proxy is transparent to LangChain — it sees a standard API endpoint.

How much latency does the proxy add?

Approximately 30ms per call (benchmark data). For LLM calls that take 500ms-3,000ms, this is unnoticeable.


Get Started

  1. Create a free Tokonomics account (100 calls/month free)
  2. Copy your API key
  3. Set the Base Path in your ChatOpenAI/ChatAnthropic nodes
  4. Test one chatflow — check the dashboard
  5. Set a budget alert to catch overspend early

All sources retrieved June 2026. Pricing: GPT-4o at $2.50/1M input tokens (OpenAI Pricing), Claude Sonnet 4 at $3.00/1M input tokens (Anthropic Pricing).

About the author
Founder & CTO at Tokonomics. Built the proxy after a $47,000 LLM invoice blindsided his team. Tracks LLM pricing weekly across 9 providers.
Connect on LinkedIn →
← Back to Blog