LLM Tokens Explained: Counting & Estimating Cost

What is a token?

Large language models don't read characters or whole words — they read tokens, chunks of text produced by a tokenizer. Common words are often a single token, while rarer or longer words are split into several subword pieces. Spaces and punctuation count too. Models process and bill per token, so tokens are the unit that matters for both context limits and cost.

Roughly how many tokens?

For typical English text, a useful rule of thumb is that one token is about four characters, or about 0.75 words. So 1,000 words is roughly 1,300 tokens. The Token Counter blends both rules for a quick estimate and shows you the live characters-per-token ratio.
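The two rules of thumb above can be sketched in a few lines of Python. This is only an illustration: the function name `estimate_tokens` is hypothetical, and the simple average used to blend the two rules is an assumption, since the exact blend the Token Counter uses is not specified here.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate for English text (not an exact tokenizer count)."""
    if not text.strip():
        return 0
    by_chars = len(text) / 4              # rule 1: about 4 characters per token
    by_words = len(text.split()) / 0.75   # rule 2: about 0.75 words per token
    # Assumption: blend the two rules with a plain average.
    return round((by_chars + by_words) / 2)


sample = "Large language models read tokens, not characters or whole words."
tokens = estimate_tokens(sample)
print(tokens, "estimated tokens")
print(round(len(sample) / tokens, 2), "characters per token")
```

Because the result comes from character and word counts alone, it says nothing about how a particular model would actually split the text.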

Why it's only an estimate

Each model family uses its own byte-pair-encoding (BPE) vocabulary, learned from data, so the exact split differs between models. Code, JSON, non-English scripts, emoji and unusual symbols tokenize less predictably than plain prose — they often use more tokens per character. Treat the number as a close approximation (usually within 10–20% for English), and use your provider's official tokenizer when you need exact billing.
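One way to work with that uncertainty is to plan against a range rather than a single number. The sketch below assumes the 20% end of the band quoted above; the helper name `token_range` and the choice to round the bounds outward are illustrative, not part of the tool.

```python
def token_range(estimate: int, tolerance_pct: int = 20) -> tuple[int, int]:
    """Return (low, high) bounds around a rough token estimate."""
    # Integer arithmetic avoids floating-point surprises at the boundaries.
    low = estimate * (100 - tolerance_pct) // 100        # rounded down
    high = -(-estimate * (100 + tolerance_pct) // 100)   # rounded up
    return low, high


low, high = token_range(1000)
print(f"Plan for between {low} and {high} tokens")
```

Budgeting against the upper bound is the safer choice for context limits; for code or non-English text, a wider tolerance is probably appropriate.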

Estimating cost

API pricing is quoted per million tokens, usually with a higher rate for output (generated) tokens than input (prompt) tokens. The cost of a call is:

cost = input_tokens / 1e6 × input_price
     + output_tokens / 1e6 × output_price

Enter the input and output prices (per million tokens), the expected output length, and how many calls you plan to make. The tool multiplies the per-call cost by the number of calls and totals it for you. The preset buttons fill in example price tiers. Always confirm the current rates with your provider, since pricing changes.
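The same arithmetic, written as a small Python function. The function name and the prices in the example (3.00 and 15.00 per million tokens) are placeholders chosen for illustration, not any provider's actual rates.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price: float,   # price per million input tokens
    output_price: float,  # price per million output tokens
    calls: int = 1,
) -> float:
    """Total cost for a number of identical API calls."""
    per_call = (
        input_tokens / 1e6 * input_price
        + output_tokens / 1e6 * output_price
    )
    return per_call * calls


# Hypothetical example: 2,000 prompt tokens and 500 generated tokens per call,
# 1,000 calls, at placeholder prices of 3.00 (input) and 15.00 (output).
total = estimate_cost(2_000, 500, input_price=3.00, output_price=15.00, calls=1_000)
print(f"${total:.2f}")  # prints $13.50
```

In this example the output tokens account for more of the bill than the input tokens despite being a quarter of the volume, which is why capping output length matters.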

Tips to use fewer tokens

Trim boilerplate and repeated context; send only what the model needs.
Cap output length when you only need a short answer.
Summarize long histories instead of resending the full transcript each turn.
Prefer compact formats; verbose JSON and heavy indentation add tokens.
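To illustrate the last tip, the sketch below serializes the same data twice with Python's standard `json` module and compares the sizes. The `record` contents are made up, and character count is used only as a rough proxy for tokens, since the real count depends on the model's tokenizer.

```python
import json

# Hypothetical payload, used only to compare the two serializations.
record = {
    "user": {"id": 42, "name": "Ada"},
    "tags": ["alpha", "beta"],
    "active": True,
}

pretty = json.dumps(record, indent=4)                 # human-friendly, heavily indented
compact = json.dumps(record, separators=(",", ":"))   # no extra spaces or newlines

print(len(pretty), "characters indented")
print(len(compact), "characters compact")
```

Both strings decode to the same data, so the compact form loses nothing the model needs.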

Privacy

Prompts can contain sensitive material, so a server-side counter is a risk. Here, counting and cost estimation run entirely in your browser — nothing you paste is uploaded.

FAQ

What is a token?

A chunk of text from the model's tokenizer — roughly four characters or 0.75 words in English.

How accurate is the estimate?

Usually within 10–20% for English prose; code and other languages vary more.

Is my text uploaded?

No — everything runs locally in your browser.

Ready to try it? Open the Token Counter →
