LLM API Cost Calculator

Estimate GPT, Claude and Gemini API costs from your token counts. Your prompts never leave your device.

Token counting and cost math run locally in your browser and nothing is uploaded, but avoid pasting secrets or production data into the prompt box.

Need precise token counts? Open the Token Counter.

About LLM API Cost Calculator

This LLM cost calculator turns token counts into a dollar estimate for the OpenAI, Anthropic and Google APIs. Pick a model — GPT-4o, GPT-4.1, o1, Claude 3.5 Sonnet, Gemini 1.5 Pro and more — then enter input and output token counts, or paste a prompt and let the tool count its input tokens using the right tokenizer. It multiplies tokens by each model's per-million price and shows input, output and total cost, with an optional requests multiplier for batch or monthly projections. Prices are indicative and change often, so every figure is labelled with an as-of marker and you can override the input and output price per million for any model to match cached, batch or newer rates. Token counting and the math run entirely in your browser, so the prompts you paste are processed on your device and never leave it.

Features

Models grouped by provider: OpenAI, Anthropic Claude and Google Gemini
Enter input tokens directly, or count them from a pasted prompt for the selected model
Expected output tokens and requests fields for batch or monthly estimates
Input, output and total cost shown in USD, plus cost per request
Custom $/1M override for input and output to match cached, batch or updated prices
Exact OpenAI token counts; Claude and Gemini counts labelled approximate
Indicative prices marked with an as-of date so you know to confirm them
All tokenizing and pricing math runs in your browser with no prompt upload

How to use the LLM API Cost Calculator

Choose a model from the provider-grouped dropdown.
Enter the input tokens, or toggle Count from prompt and paste your prompt.
Enter the expected output tokens and, if needed, a number of requests.
Read the input, output and total cost in USD.
Override the $/1M input or output price to match your actual rate.

Example

Input

Model: GPT-4o
Input tokens: 1,000
Output tokens: 500
Requests: 1

Output

Input cost:  $0.0025
Output cost: $0.0050
Total cost:  $0.0075

1,000 input tokens at $2.50/M + 500 output tokens at $10/M = $0.0025 + $0.0050 = $0.0075 per request.
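
The arithmetic in this example can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual source: the function name `estimate_cost` and its parameters are hypothetical, and the $2.50 and $10 per-million rates are the indicative GPT-4o figures used in the example, which may be out of date.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)

def estimate_cost(input_tokens, output_tokens,
                  input_price_per_m, output_price_per_m, requests=1):
    """Return (input_cost, output_cost, total_cost) in USD.

    Hypothetical helper: prices are USD per one million tokens and are
    passed as strings so Decimal can represent them without float noise.
    """
    input_cost = Decimal(input_tokens) * Decimal(input_price_per_m) / MILLION
    output_cost = Decimal(output_tokens) * Decimal(output_price_per_m) / MILLION
    per_request = input_cost + output_cost
    # The requests multiplier scales every figure for batch or monthly projections.
    return input_cost * requests, output_cost * requests, per_request * requests

# Same inputs as the example: 1,000 input tokens, 500 output tokens, 1 request.
input_cost, output_cost, total = estimate_cost(1_000, 500, "2.50", "10")
print(f"Input cost:  ${input_cost:.4f}")   # $0.0025
print(f"Output cost: ${output_cost:.4f}")  # $0.0050
print(f"Total cost:  ${total:.4f}")        # $0.0075
```

Passing `requests=10_000` to the same call projects a total of $75.00 at these assumed rates.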

Common errors & troubleshooting

The total does not match my provider invoice. — Prices are indicative and the estimate excludes cached-input discounts, batch pricing, image or audio tokens and request overhead. Override the $/1M fields with your real rate and confirm on the provider's pricing page.
The counted input tokens differ slightly from what the API reports for Claude or Gemini. — Claude and Gemini token counts are approximate in the browser. Use OpenAI models for exact counts, or treat Claude and Gemini totals as close estimates and verify against the provider.
My custom price has no effect. — The override only applies when its field is non-empty. Leave it blank to use the model's indicative price, or type a number such as 1.25 to override the per-million rate.
The cost shows as $0.00 for a tiny prompt. — Very small costs are shown with extra decimal places, but rounding can still read as $0.00. Increase the requests multiplier to see the projected cost at scale.
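
The last two items can be illustrated with a short Python sketch. It is an assumption about how such logic might look, not the tool's real code: `resolve_price` and `format_usd` are hypothetical names, and the six-decimal display for sub-cent amounts is a made-up rule chosen only to show why a tiny cost can still read as zero.

```python
def resolve_price(override_text, indicative_price):
    """Pick the per-million price: the override if filled in, else the default."""
    text = override_text.strip()
    if not text:
        # A blank override field falls back to the model's indicative price.
        return indicative_price
    return float(text)  # raises ValueError if the field is not a number

def format_usd(amount):
    """Two decimals normally, six for amounts under one cent (assumed rule)."""
    if amount == 0 or amount >= 0.01:
        return f"${amount:.2f}"
    return f"${amount:.6f}"

print(resolve_price("", 2.50))      # blank field: indicative price is used
print(resolve_price("1.25", 2.50))  # filled field: override is used

tiny = 0.0000004                    # a cost smaller than the display precision
print(format_usd(tiny))             # still rounds to zero at six decimals
print(format_usd(tiny * 1_000_000)) # visible once scaled by a requests multiplier
```

Even with extra decimal places there is always some cost below the display precision, which is why scaling by the requests multiplier is the more reliable way to see the projected spend.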

Frequently asked questions

How is the LLM API cost calculated? It multiplies your input tokens by the model's input price per million and your output tokens by the output price per million, then sums them. The requests field multiplies that per-request total to project batch or monthly spend.
Are the prices current? They are indicative and marked with an as-of date because provider pricing changes often. Always confirm the live rate on the OpenAI, Anthropic or Google pricing page, and use the override fields to plug in the exact numbers.
Can I count tokens from a real prompt instead of guessing? Yes. Toggle Count from prompt and paste your text; the calculator tokenizes it with the selected model's tokenizer and uses that as the input token count. OpenAI counts are exact, while Claude and Gemini counts are approximate.
Does this include cached input or batch pricing? No. The estimate uses standard text-tier rates and excludes cached-input discounts, batch pricing, and image or audio tokens. Enter your discounted rate in the $/1M override to model those cases.
Are my prompts or API keys sent anywhere? No. The calculator never asks for an API key, and token counting plus the cost math run locally in your browser, so any prompt you paste is processed on your device and is never uploaded.
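
Since cached-input discounts are excluded, one way to approximate them is to work out a blended per-million rate yourself and type it into the input override. The Python sketch below shows the idea under stated assumptions: `blended_input_rate` is a hypothetical helper, and the $1.25/M cached rate and 60% cached share are invented figures, so substitute the numbers from your provider's pricing page and your own usage.

```python
def blended_input_rate(standard_rate, cached_rate, cached_fraction):
    """Effective USD per 1M input tokens when part of the input is cached.

    cached_fraction is the share of input tokens billed at cached_rate;
    the remainder is billed at standard_rate.
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    return cached_fraction * cached_rate + (1.0 - cached_fraction) * standard_rate

# Hypothetical numbers: $2.50/M standard, $1.25/M cached, 60% of input cached.
rate = blended_input_rate(2.50, 1.25, 0.60)
print(f"Override input price: {rate:.2f} per 1M tokens")
```

With these assumed figures the blended rate comes to 1.75, which is the value to enter in the $/1M input override. A flat batch discount can be handled the same way by entering the discounted input and output rates directly.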

Related tools

Token Counter — Count tokens for GPT, Claude and Gemini live — exact OpenAI counts plus Claude/Gemini estimates.
OpenAI API Tester — Build, run and copy OpenAI Chat Completions API requests as cURL, Python and JavaScript.
Anthropic Claude API Tester — Build, run and copy Anthropic Claude Messages API requests as cURL, Python and JavaScript.
Google Gemini API Tester — Build, run and copy Google Gemini generateContent API requests as cURL, Python and JavaScript.
JSON to TOON — Convert JSON to TOON (Token-Oriented Object Notation) and back, with an LLM token savings estimate.