
LLM tokens explained: counting and estimating cost

What tokens are, why language models split text into subword pieces, how to estimate a token count, and how to turn that into an API cost for input and output.

Open the Token Counter →

What is a token?

Large language models don't read characters or whole words — they read tokens, chunks of text produced by a tokenizer. Common words are often a single token, while rarer or longer words are split into several subword pieces. Spaces and punctuation count too. Models process and bill per token, so tokens are the unit that matters for both context limits and cost.

Roughly how many tokens?

For typical English text, a useful rule of thumb is that one token is about four characters, or about 0.75 words. So 1,000 words is roughly 1,300 tokens (1,000 ÷ 0.75 ≈ 1,333). The Token Counter blends both rules for a quick estimate and shows you the live characters-per-token ratio.
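As a minimal sketch of the two rules, the function below estimates tokens from characters and from words and then averages them. The plain average is an assumption made for illustration; the Token Counter's actual blend may differ, and the names `estimate_tokens` and `chars_per_token` are hypothetical.

```python
def estimate_tokens(text: str) -> float:
    """Rough token estimate for English text; this is not a real tokenizer."""
    by_chars = len(text) / 4               # rule 1: about 4 characters per token
    by_words = len(text.split()) / 0.75    # rule 2: about 0.75 words per token
    # Assumption: blend the two rules with a plain average.
    return (by_chars + by_words) / 2


def chars_per_token(text: str) -> float:
    """Characters-per-token ratio implied by the estimate above."""
    tokens = estimate_tokens(text)
    return len(text) / tokens if tokens else 0.0


sample = "Large language models read tokens, not words."
print(round(estimate_tokens(sample)), round(chars_per_token(sample), 2))
```

Averaging softens the case where one rule is off, for example text with very long words (the word rule undercounts) or lots of short ones (the character rule undercounts).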

Why it's only an estimate

Each model family uses its own subword vocabulary, typically built with byte-pair encoding (BPE) or a similar algorithm and learned from data, so the exact split differs between models. Code, JSON, non-English scripts, emoji and unusual symbols tokenize less predictably than plain prose and often use more tokens per character. Treat the number as a close approximation (usually within 10–20% for English), and use your provider's official tokenizer when you need exact billing.
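One way to work with that uncertainty is to carry a range rather than a single number. The sketch below widens an estimate by a percentage margin; the helper name `token_range` and the 20% default are assumptions taken from the upper end of the figure above, not a guarantee about any particular model.

```python
def token_range(estimate: int, margin_pct: int = 20) -> tuple[int, int]:
    """Return (low, high) token bounds around an estimate, using integer math."""
    low = estimate * (100 - margin_pct) // 100        # round the lower bound down
    high = -(-estimate * (100 + margin_pct) // 100)   # ceiling division for the upper bound
    return low, high


low, high = token_range(1300)
print(f"plan for {low} to {high} tokens")
```

Budgeting against the upper bound is the safer choice when you are close to a context limit.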

Estimating cost

API pricing is typically quoted per million tokens, usually with a higher rate for output (generated) tokens than for input (prompt) tokens. The cost of a single call is:

cost = input_tokens / 1e6 × input_price
     + output_tokens / 1e6 × output_price

Enter the input and output prices, the expected output length, and how many calls you plan to make, and the tool totals it for you. The preset buttons fill in example price tiers — always confirm the current rates with your provider, since pricing changes.
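The formula can be sketched in a few lines of Python. The function names are hypothetical, and the prices in the example are made-up placeholders, not any provider's actual rates.

```python
def call_cost(input_tokens: int, output_tokens: int,
              input_price: float, output_price: float) -> float:
    """Cost of one call; both prices are per million tokens."""
    return (input_tokens / 1e6 * input_price
            + output_tokens / 1e6 * output_price)


def total_cost(calls: int, input_tokens: int, output_tokens: int,
               input_price: float, output_price: float) -> float:
    """Cost of many identical calls."""
    return calls * call_cost(input_tokens, output_tokens,
                             input_price, output_price)


# Placeholder prices for illustration only: 3.00 per million input tokens,
# 15.00 per million output tokens.
total = total_cost(calls=1000, input_tokens=1300, output_tokens=500,
                   input_price=3.0, output_price=15.0)
print(f"${total:.2f}")
```

With these placeholder numbers, the 500 output tokens cost more per call than the 1,300 input tokens, which is why capping output length often matters more than trimming the prompt.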

Tips to use fewer tokens

  • Trim boilerplate and repeated context; send only what the model needs.
  • Cap output length when you only need a short answer.
  • Summarize long histories instead of resending the full transcript each turn.
  • Prefer compact formats; verbose JSON and heavy indentation add tokens.
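To illustrate the last tip, the sketch below serializes the same made-up record twice with Python's standard `json` module and compares lengths. Character count is only a proxy for token count here, so treat the difference as indicative rather than exact.

```python
import json

# Made-up record for illustration.
record = {"user": "ada", "roles": ["admin", "editor"], "active": True}

pretty = json.dumps(record, indent=4)                 # indented, human-friendly
compact = json.dumps(record, separators=(",", ":"))   # no extra whitespace

print(len(pretty), "characters indented")
print(len(compact), "characters compact")
```

Both strings parse back to the same data, so the compact form loses nothing the model needs.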

Privacy

Prompts can contain sensitive material, so a server-side counter is a risk. Here, counting and cost estimation run entirely in your browser — nothing you paste is uploaded.

FAQ

What is a token?

A chunk of text from the model's tokenizer — roughly four characters or 0.75 words in English.

How accurate is the estimate?

Usually within 10–20% for English prose; code and other languages vary more.

Is my text uploaded?

No — everything runs locally in your browser.

