AI Text Chunker

Split a document into overlapping chunks for retrieval-augmented generation and embeddings. Choose to split by approximate tokens, characters, words or whole sentences, set the size and overlap, preview the result, and export as JSON or JSONL. Everything runs in your browser.

Token counts are an estimate (~4 characters per token); use your model's tokenizer for exact billing. Splitting by sentences keeps ideas intact; in that mode the size setting is read as a number of sentences rather than as tokens or characters.
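As a rough illustration of the estimate described above, here is a minimal Python sketch. The function name `estimate_tokens` is hypothetical, and rounding up is an assumption; the tool's own rounding may differ.

```python
import math

# Heuristic from the note above: about 4 characters per token.
# This is an approximation, not a real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate the token count as len(text) / 4, rounded up (assumed)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


if __name__ == "__main__":
    sample = "Retrieval-augmented generation embeds pieces of your documents."
    print(len(sample), "characters, about", estimate_tokens(sample), "tokens")
```

An estimate like this is enough for sizing chunks; for billing, count with the tokenizer that ships with your model.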

Everything runs locally in your browser — your text is never uploaded.

How to use the text chunker

  1. Paste your document, then choose how to split it — by approximate tokens, characters, words or whole sentences.
  2. Set the chunk size and how much each chunk should overlap the previous one. Overlap preserves context across boundaries so a retrieved chunk is not cut mid-thought.
  3. Review the chunks, then copy them as a JSON array or JSONL, or download a .jsonl file ready for embedding.
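The three steps above can be sketched in Python for the word-based mode. This is an illustration under stated assumptions, not the tool's actual code: the function names and the `index`/`text` record fields are hypothetical, and the tool's real export schema may differ.

```python
import json


def chunk_words(text: str, size: int, overlap: int) -> list[str]:
    """Split text into chunks of `size` words; each chunk repeats the
    last `overlap` words of the previous one."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must be at least 0 and less than size")
    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


def to_jsonl(chunks: list[str]) -> str:
    """Serialize chunks as JSONL, one JSON object per line
    (field names are an assumption)."""
    return "\n".join(
        json.dumps({"index": i, "text": chunk}) for i, chunk in enumerate(chunks)
    )


if __name__ == "__main__":
    print(to_jsonl(chunk_words("a b c d e f g", size=3, overlap=1)))
```

With a size of 3 and an overlap of 1, the seven-word sample yields three chunks, and the last word of each chunk reappears as the first word of the next.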

Why chunking matters for RAG

Retrieval-augmented generation works by embedding pieces of your documents and fetching the most relevant ones at query time. Chunks that are too large dilute the embedding and waste context; too small and they lose the surrounding meaning. A few hundred tokens with a small overlap is a common starting point. Splitting on sentence boundaries avoids cutting a fact in half, which matters more than hitting an exact size.
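To show what splitting on sentence boundaries can look like, here is a minimal Python sketch in which the size and overlap are counted in sentences. The regular expression is a naive assumption (it breaks after `.`, `!` or `?` followed by whitespace, so abbreviations such as "e.g." will be split wrongly), and the function names are hypothetical.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def chunk_sentences(text: str, size: int, overlap: int) -> list[str]:
    """Group whole sentences into chunks of `size` sentences, repeating
    the last `overlap` sentences of each chunk in the next one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    sentences = split_sentences(text)
    step = size - overlap
    chunks = []
    for start in range(0, len(sentences), step):
        chunks.append(" ".join(sentences[start:start + size]))
        if start + size >= len(sentences):
            break  # the final sentences are already covered
    return chunks


if __name__ == "__main__":
    for chunk in chunk_sentences("One. Two! Three? Four.", size=2, overlap=1):
        print(chunk)
```

No chunk ends mid-sentence here, at the cost of chunks that vary in length.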

Choosing size and overlap

There is no universal best setting; it depends on your content and on your embedding model's context window. Denser, reference-style text often does better in smaller chunks; narrative text in larger ones. Overlap of 10–20% of the chunk size is typical: enough to carry context across the seam without duplicating too much. Run a few real queries before and after each change and compare the results rather than guessing. The token counter helps you size chunks to a model's limits.
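The arithmetic behind these settings is simple enough to check by hand. The Python sketch below turns an overlap percentage into a unit count and works out how many chunks a document produces, assuming a plain sliding window that advances by size minus overlap each step. The function names are hypothetical and the tool may count differently.

```python
import math


def overlap_for(size: int, ratio: float) -> int:
    """Overlap in units (tokens, words, ...) for a ratio such as 0.15."""
    return round(size * ratio)


def chunk_count(total: int, size: int, overlap: int) -> int:
    """Number of chunks a sliding window yields over `total` units."""
    if not 0 <= overlap < size:
        raise ValueError("need 0 <= overlap < size")
    if total <= 0:
        return 0
    if total <= size:
        return 1
    step = size - overlap  # new material added by each further chunk
    return 1 + math.ceil((total - size) / step)


if __name__ == "__main__":
    size = 400
    overlap = overlap_for(size, 0.15)
    print(overlap, "units of overlap;", chunk_count(10_000, size, overlap), "chunks")
```

For a 10,000-token document, 400-token chunks with 15% overlap advance 340 tokens at a time, so the window needs 30 chunks to reach the end, compared with 25 chunks at zero overlap.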

Frequently asked questions

What is text chunking for?

Retrieval-augmented generation (RAG) embeds pieces of your documents and fetches the most relevant ones at query time. Chunking splits a document into those pieces. Chunks that are too large retrieve imprecisely; too small and they lose context — so size and overlap are worth tuning.

What chunk size and overlap should I use?

A few hundred tokens per chunk with 10–20% overlap is a common starting point, but the best values depend on your content and embedding model. Denser reference text often prefers smaller chunks. Change one variable, re-run a couple of real queries, and compare what comes back.

Are the token counts exact?

No — they are estimated at roughly 4 characters per token. Use the tokenizer for your model for exact billing. The estimate is fine for sizing chunks to a context window.

Is my text uploaded?

No. All chunking runs in your browser, so you can safely split internal or confidential documents before embedding them.
