Text Chunking for RAG: Sizes, Overlap and Strategies

What this tool does

The AI Text Chunker splits text into overlapping pieces ready to embed for retrieval-augmented generation (RAG). You control the unit (approximate tokens, characters, words or sentences), the chunk size and the overlap, and can export the result as a JSON array or JSONL for your embedding pipeline.

Why chunk at all

RAG works by embedding pieces of your documents into vectors, then at query time fetching the pieces whose vectors are closest to the question. The size of those pieces is a real lever: too large and a chunk's embedding averages several topics together and retrieves imprecisely; too small and it loses the context needed to be useful. Chunking is how you tune that trade-off.
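To make the retrieval step concrete, here is a minimal sketch of "fetch the pieces whose vectors are closest". It assumes the embeddings are already computed and held as plain lists of floats; the names cosine_similarity and top_k are illustrative and do not come from the tool or any particular library.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], chunk_vecs: list[list[float]], k: int = 3) -> list[int]:
    """Return the indices of the k chunk vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), i) for i, vec in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]
```

A real pipeline would usually delegate this search to a vector index, but the ranking idea is the same: the chunk whose embedding points in nearly the same direction as the question is returned first.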

Picking a unit

Tokens map most directly to what embedding models measure, so sizing in tokens keeps chunks within a model's limit. The count here is an estimate (~4 characters per token); use your model's tokenizer for exact figures.
Sentences avoid cutting a fact in half, which often matters more than hitting an exact size, because a chunk that ends mid-clause embeds poorly.
Characters / words are simple and predictable when your text is uniform.
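As a rough illustration of two of these units, the sketch below estimates tokens with the ~4 characters per token rule of thumb and splits text into sentences with a naive regular expression. Both helpers (estimate_tokens, split_sentences) are hypothetical names for this example, not the tool's actual code, and the sentence splitter will mis-split abbreviations such as "e.g." or "Dr.".

```python
import math
import re

# Rule of thumb from the text above; a real tokenizer will give different counts.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate token count as the character count divided by 4, rounded up."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def split_sentences(text: str) -> list[str]:
    """Naive split: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]
```

For exact sizing against a model's limit, swap estimate_tokens for a call to that model's own tokenizer.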

Size and overlap

A few hundred tokens per chunk is a common starting point, with 10–20% overlap so context carries across the seam and a retrieved chunk is not missing the sentence that set it up. Denser reference text often prefers smaller chunks; flowing narrative, larger ones. There is no universal best setting: change one variable, re-run a couple of real queries, and compare what comes back. The token counter and cosine similarity tools help here, one for checking how large each chunk actually is and the other for checking how closely a retrieved chunk matches the query.
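The size and overlap settings can be sketched as a sliding window. The example below works in words for simplicity and is only one plausible way to implement overlap, not necessarily how the tool does it; chunk_words and its default values are assumptions made for illustration.

```python
def chunk_words(text: str, size: int = 200, overlap: int = 30) -> list[str]:
    """Split text into chunks of up to `size` words, each sharing `overlap` words with the previous one."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must be at least 0 and smaller than size")

    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once the window has reached the end, so no trailing
        # chunk is emitted that only repeats the overlap.
        if start + size >= len(words):
            break
    return chunks
```

With size=4 and overlap=1, the ten words "a b c d e f g h i j" become "a b c d", "d e f g" and "g h i j": each chunk starts with the last word of the one before it. A 10–20% overlap on a 200-word chunk corresponds to overlap values between 20 and 40.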

Exporting

Copy the chunks as a JSON array, as JSONL (one object per line), or download a .jsonl file. Each record carries an id, the chunk text and an estimated token count — the shape most embedding scripts expect. JSONL is the usual format for feeding an embedding or fine-tuning job; the JSONL converter can validate it line by line.
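A JSONL export of this kind might be produced as in the sketch below. The field names id, text and tokens_est are assumptions chosen for the example; the tool's actual keys may differ, so check a real export before wiring it into a pipeline.

```python
import json
import math


def to_jsonl(chunks: list[str]) -> str:
    """Serialize chunks as JSONL: one JSON object per line."""
    lines = []
    for index, text in enumerate(chunks):
        record = {
            "id": index,  # hypothetical key name
            "text": text,
            # Estimated with the ~4 characters per token rule of thumb.
            "tokens_est": math.ceil(len(text) / 4),
        }
        # json.dumps escapes newlines inside the text, so each record stays on one line.
        lines.append(json.dumps(record))
    return "\n".join(lines)
```

Reading the file back is symmetric: parse each non-empty line with json.loads and pass the text field to the embedding call.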

Privacy

All chunking happens in your browser. Your document is never uploaded, so you can safely chunk internal or confidential material before embedding it.
