Cost Control — Track and Optimize LLM Spending

Access: Tenant Admins only

The Cost Control module answers the question: How much are my AI agents costing in LLM tokens, and where is the money going?

What is Cost Control?

Every time an AI agent processes a request, it consumes tokens from the underlying LLM (such as Gemini, GPT, or Claude). These tokens cost money. Cost Control gives you complete visibility into:

  • Total estimated cost across all agents
  • Cost breakdown by agent, user, and session
  • Token usage patterns (input vs. output vs. thinking tokens)
  • Cost trends over time

Why it matters:

  • LLM costs can grow quickly as usage increases
  • Without visibility, you can't optimize spending
  • Some agents or users may consume disproportionately more tokens
  • Understanding token breakdowns helps you write more efficient prompts

Note: TraptureIQ tracks token usage accurately, but your actual LLM API costs are billed directly by your cloud provider (Google Cloud, OpenAI, etc.) — not through TraptureIQ. The costs shown here are estimates based on published model pricing.
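
As a rough illustration of how an estimate can be derived from token counts and published pricing, here is a minimal Python sketch. The `ModelPricing` class, the `estimate_cost` function, and the prices are hypothetical placeholders, not TraptureIQ's actual calculation; substitute the rates your provider publishes, and note that some providers bill thinking tokens at the output rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPricing:
    """Hypothetical prices in USD per 1 million tokens."""
    input_per_million: float
    output_per_million: float
    thinking_per_million: float


def estimate_cost(input_tokens, output_tokens, thinking_tokens, pricing):
    """Estimate the USD cost of a batch of tokens from per-million prices."""
    total = (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
        + thinking_tokens * pricing.thinking_per_million
    )
    return total / 1_000_000


# Placeholder prices for illustration only; not real provider rates.
example_pricing = ModelPricing(
    input_per_million=1.0,
    output_per_million=4.0,
    thinking_per_million=4.0,
)

cost = estimate_cost(200_000, 50_000, 25_000, example_pricing)
print(f"Estimated cost: ${cost:.2f}")
```

Because the result depends entirely on the prices you plug in, treat it as an estimate and reconcile it against your provider's invoice.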


How to Use the Cost Control Page

Step 1: Open Cost Control

Click Cost Control in the sidebar.

Step 2: Review the Key Metrics

At the top of the page, you'll see summary cards:

Metric | What It Shows
Total Estimated Cost | The estimated dollar cost of all LLM usage in the selected time range
Total Tokens | Total number of tokens consumed across all requests
Input Tokens | Tokens from user prompts and system instructions
Output Tokens | Tokens from agent responses
Thinking Tokens | Tokens used for internal reasoning (for models that support it)

Step 3: Analyze the Charts

Cost Over Time (Line Chart)

  • Shows how spending trends over the selected time range
  • What to look for: Upward trends, unexpected spikes, or usage patterns by day/hour

Cost by Agent (Bar Chart)

  • Shows which agents are the most expensive
  • What to look for: One agent dominating costs may indicate an inefficient prompt, verbose responses, or excessive tool usage

Token Breakdown per Agent (Stacked Bar Chart)

  • Shows the ratio of input vs. output vs. thinking tokens per agent
  • What to look for: High thinking tokens may indicate the agent is doing excessive internal reasoning; high output tokens may indicate verbose responses
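
If you want to put numbers on the ratio the stacked bar chart shows, the following Python sketch computes each token type's share of an agent's total. The function name and the sample counts are assumptions for illustration; read the real counts from the chart or the session table.

```python
def token_shares(input_tokens, output_tokens, thinking_tokens):
    """Return each token type's share of the total as a fraction between 0 and 1."""
    total = input_tokens + output_tokens + thinking_tokens
    if total == 0:
        # No usage yet: report zero shares instead of dividing by zero.
        return {"input": 0.0, "output": 0.0, "thinking": 0.0}
    return {
        "input": input_tokens / total,
        "output": output_tokens / total,
        "thinking": thinking_tokens / total,
    }


# Made-up counts for a single agent.
shares = token_shares(6_000, 3_000, 1_000)
for token_type, share in shares.items():
    print(f"{token_type}: {share:.0%}")
```

A large thinking or output share is a prompt to look closer, not proof of waste; what counts as "high" depends on the agent's task.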

Token Trend Over Time

  • Shows how token usage changes over time
  • What to look for: Correlate with usage growth — is cost growing faster than usage? That suggests decreasing efficiency.
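
One way to make the "is cost growing faster than usage?" check concrete is to compare growth factors between two periods. This is a minimal sketch with hypothetical function names and made-up figures; the inputs would come from the cost and request-volume charts.

```python
def growth_factor(first, last):
    """Return last / first, the growth factor between two periods."""
    if first <= 0:
        raise ValueError("first value must be positive")
    return last / first


def efficiency_trend(cost_first, cost_last, requests_first, requests_last):
    """Compare cost growth against request growth over the same two periods."""
    cost_growth = growth_factor(cost_first, cost_last)
    usage_growth = growth_factor(requests_first, requests_last)
    if cost_growth > usage_growth:
        return "cost is outpacing usage"
    if cost_growth < usage_growth:
        return "usage is outpacing cost"
    return "cost and usage are growing in step"


# Made-up figures: cost grew 1.5x while requests grew 1.2x.
print(efficiency_trend(40.0, 60.0, 1_000, 1_200))
```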

Step 4: Review the Session Cost Table

A detailed table showing the cost and token breakdown for each individual chat session:

Column | What It Shows
Session | Session name (auto-generated from the first message)
Agent | Which agent was used
User | Who initiated the session
Total Tokens | Total tokens consumed in the session
Input / Output / Thinking | Token breakdown by type
Estimated Cost | Estimated dollar cost for the session
Latency | Response time

This table is sortable — click any column header to sort by that value. Sort by cost (descending) to find the most expensive sessions.

Step 5: Filter the Data

Filter | What It Does
Agent | Show costs for a specific agent only
User | Show costs for a specific user's activity
Time Range | Predefined ranges or custom date range


Understanding Token Types

Token Type | What It Is | How to Optimize
Input Tokens | Tokens from the system prompt + user message + conversation history sent to the LLM | Shorten your system prompt, limit conversation history length
Output Tokens | Tokens in the agent's generated response | Add instructions like "be concise" to your system prompt
Thinking Tokens | Tokens used for internal reasoning (chain of thought), only for models that support it | Some reasoning is good, but excessive thinking wastes tokens. Review traces to check reasoning efficiency

Common Use Cases

Scenario | What to Do
"How much did we spend this month?" | Set time range to "Last 30 days" and check the Total Estimated Cost card
"Which agent costs the most?" | Check the "Cost by Agent" bar chart
"Why is one agent so expensive?" | Filter by that agent, then check the Token Breakdown chart. Are thinking tokens high? Are output tokens excessive?
"I need to reduce costs" | Sort the session table by cost (descending) to find expensive sessions, then analyze their token breakdown. Optimize system prompts for efficiency.
"Is our cost growing faster than usage?" | Compare the Cost Over Time chart with the Usage Dashboard's Request Volume chart

Tips for Beginners

  • Check weekly — A quick weekly review helps you catch cost anomalies early.
  • Focus on the top-cost agents — Optimizing the most expensive agent has the biggest impact.
  • Optimize prompts — The system prompt is sent with every request. Shortening it by 100 tokens saves 100 tokens × every request.
  • Use the Analyser — Draft prompts in the Analyser to check token counts before deploying.
  • Sort by cost — The session table sorted by cost quickly shows your most expensive interactions.
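
The prompt-shortening tip is simple multiplication, and a short Python sketch makes the scale visible. The request volume and the input price below are made-up assumptions, not real usage or provider rates.

```python
def tokens_saved(tokens_removed_from_prompt, requests):
    """Input tokens saved when the system prompt shrinks by a fixed amount."""
    return tokens_removed_from_prompt * requests


def input_cost_saved(tokens_removed_from_prompt, requests, input_price_per_million):
    """Approximate USD saved, given a per-million input token price."""
    saved = tokens_saved(tokens_removed_from_prompt, requests)
    return saved * input_price_per_million / 1_000_000


# Hypothetical month: 50,000 requests, prompt shortened by 100 tokens,
# and a placeholder input price of $1.00 per million tokens.
saved = tokens_saved(100, 50_000)
dollars = input_cost_saved(100, 50_000, 1.0)
print(f"{saved:,} input tokens saved, roughly ${dollars:.2f}")
```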

Token Budget

The Token Budget feature lets you set monthly token limits for your workspace — so you always know when usage is getting too high, before it becomes a surprise on your cloud bill.

Important: Token budgets are soft limits. Setting a budget does not stop your agents from responding. If a budget is exceeded, you will receive an alert email — agents continue operating normally.

What You Can Set

Budget Type | What It Covers
Tenant-wide Budget | Total tokens across all agents in your workspace for the month
Per-Agent Budget | Token limit for a specific individual agent

You can set both at the same time — for example, a workspace-wide limit plus tighter limits on high-usage agents.


How to Set a Token Budget

Step 1: Open the Token Budget tab

Go to Cost Control in the sidebar, then click the Token Budget tab.


Step 2: Set a Tenant-wide Budget

In the Tenant-wide Budget section on the left:

  1. Enter a monthly token limit (e.g. 5000000 for 5 million tokens)
  2. Make sure Active is checked
  3. Click Save Tenant Budget

Step 3: Set a Per-Agent Budget (optional)

In the Per-Agent Budget section on the right:

  1. Select an agent from the dropdown
  2. Enter a monthly token limit for that agent
  3. Make sure Active is checked
  4. Click Save Agent Budget

Repeat for any other agents you want to track individually.


Viewing Current Usage

Once budgets are set, the Current Month Usage vs Budget section shows you:

  • How many tokens have been used so far this month
  • The progress bar colour tells you at a glance how close you are to the limit:
    • 🟢 Green — usage is healthy (under 80%)
    • 🟡 Amber — approaching the limit (80–99%)
    • 🔴 Red — budget exceeded (100%+)
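
The colour thresholds above can be expressed as a small function. This Python sketch is an illustration of the documented ranges, not TraptureIQ's actual code; the function name is hypothetical, and treating anything from 80% up to just under 100% as amber is an assumption about how fractional percentages are handled.

```python
def budget_status(tokens_used, budget):
    """Map month-to-date usage against a budget to a progress bar colour."""
    if budget <= 0:
        raise ValueError("budget must be a positive number of tokens")
    # Integer comparisons avoid floating-point rounding at the boundaries.
    if tokens_used >= budget:
        return "red"      # budget exceeded (100%+)
    if tokens_used * 100 >= budget * 80:
        return "amber"    # approaching the limit (80% up to 100%)
    return "green"        # usage is healthy (under 80%)


# Example with a 5 million token budget.
for used in (3_900_000, 4_000_000, 5_000_000):
    print(used, budget_status(used, 5_000_000))
```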

Agent budgets are shown as a compact grid so you can monitor many agents at once.


Alert Emails

When a budget is exceeded, TraptureIQ automatically sends an alert email to all Tenant Admins in your workspace. The email includes:

  • Which budget was exceeded (tenant-wide or which agent)
  • How many tokens were used vs. the budget
  • The percentage over budget

To avoid inbox noise, alerts are sent at most once every 24 hours per budget — even if usage continues to grow.
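
The "at most once every 24 hours per budget" rule can be pictured as a simple throttle. The sketch below is a hypothetical model of that behaviour in Python, not the product's implementation; the function name, the parameters, and the choice to treat usage at exactly 100% as exceeded are all assumptions.

```python
from datetime import datetime, timedelta

# Minimum gap between two alerts for the same budget, per the rule above.
ALERT_INTERVAL = timedelta(hours=24)


def should_send_alert(tokens_used, budget, last_alert_at, now):
    """Decide whether an over-budget alert is due for one budget.

    last_alert_at is None if no alert has been sent for this budget yet.
    """
    if tokens_used < budget:
        return False  # still within budget, nothing to report
    if last_alert_at is None:
        return True   # first time this budget is exceeded
    return now - last_alert_at >= ALERT_INTERVAL


now = datetime(2025, 1, 2, 12, 0)
print(should_send_alert(5_500_000, 5_000_000, None, now))
print(should_send_alert(5_500_000, 5_000_000, datetime(2025, 1, 2, 0, 0), now))
```

Keeping one `last_alert_at` value per budget is what makes the throttle apply separately to the tenant-wide budget and to each agent budget.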


Removing a Budget

Click Remove (or the remove control on agent cards) next to any budget to delete it. The budget is removed immediately and no further alerts will be sent for it.


Common Questions

Will my agents stop working if the budget is exceeded?
No. Budgets are informational: your agents will continue responding normally. The budget is purely for your awareness and cost tracking.

Can I set both a tenant budget and per-agent budgets?
Yes. Both work independently. You can have a workspace-wide limit and separate tighter limits per agent at the same time.

What counts towards the token budget?
All token usage by your agents for the current calendar month, including input tokens (prompts), output tokens (responses), and thinking tokens (internal reasoning).

When does the budget reset?
Budgets reset at the start of each calendar month. Previous months' usage does not carry over.
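
To illustrate what "counts towards the budget" and the calendar-month reset mean together, here is a minimal Python sketch. The record layout (a `date` plus `input`, `output`, and `thinking` counts) is a hypothetical shape chosen for the example, not a TraptureIQ export format.

```python
from datetime import date


def month_to_date_tokens(usage_records, today):
    """Sum all token types for records that fall in today's calendar month."""
    total = 0
    for record in usage_records:
        day = record["date"]
        if day.year == today.year and day.month == today.month:
            # Input, output, and thinking tokens all count towards the budget.
            total += record["input"] + record["output"] + record["thinking"]
    return total


# Made-up records: the January entry does not carry over into February.
records = [
    {"date": date(2025, 1, 31), "input": 900, "output": 400, "thinking": 100},
    {"date": date(2025, 2, 1), "input": 500, "output": 200, "thinking": 50},
    {"date": date(2025, 2, 10), "input": 300, "output": 100, "thinking": 0},
]

february_total = month_to_date_tokens(records, date(2025, 2, 15))
print(february_total)
```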