
Markdown Chunking Examples

Chunking for RAG is not just about hitting a token number. The useful question is whether each chunk keeps enough structure, heading context, and source shape to make retrieval results trustworthy.


- Compare four chunking presets on the same source
- Protect headings, tables, and fenced code blocks
- Choose export shapes for human QA or JSONL ingestion

Source sample

Use the same Markdown source to compare different chunking choices.

The examples below use a small API-docs source because it includes the shapes that often break during ingestion: headings, a Markdown table, a fenced code block, and a short list of errors.

Example source Markdown

# Widget API

Widget API lets teams create and update customer widgets.

## Authentication

Use a server-side API key for every request.

## Rate limits

| Plan | Requests per minute |
| --- | ---: |
| Starter | 120 |
| Pro | 600 |

## Example request

```bash
curl -H "Authorization: Bearer $API_KEY" https://api.example.com/widgets
```

## Common errors

- 401 when the API key is missing
- 429 when the request limit is exceeded

Preset comparison

Different chunking presets protect different parts of the source.

The right preset depends on what would make a retrieved answer fail: missing headings, oversized chunks, broken tables, split code, or context-free fragments.

Heading-based chunks

Best for: Docs, guides, and knowledge-base pages with useful section headings.

Example output:

Chunk 1: # Widget API
Includes the title, intro, and Authentication section.
Heading path: Widget API > Authentication

Chunk 2: ## Rate limits
Keeps the rate-limit table attached to its heading.
Heading path: Widget API > Rate limits

Chunk 3: ## Example request
Keeps the code sample and the Common errors list together so they can be reviewed as one unit.

Watch for: Weak headings produce weak chunks. If a page has no useful headings, this preset may need manual cleanup first.
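
To make the heading-path idea concrete, here is a minimal Python sketch of heading-based splitting. It is an illustration under stated assumptions, not PromptStage's implementation: the function name `chunk_by_headings` and the `heading_path` and `text` keys are hypothetical, it only recognizes ATX headings (`#` to `######`), and it starts a new chunk at every heading instead of merging short sections the way the example output above does.

```python
import re

FENCE = "`" * 3  # built at runtime so the sketch can detect fenced code lines


def chunk_by_headings(markdown: str) -> list[dict]:
    """Split Markdown at ATX headings and record each chunk's heading path."""
    chunks = []
    path = []        # (level, title) pairs from the top-level heading down
    current = None
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        # A "#" inside a fenced code block is a comment, not a heading.
        match = None if in_fence else re.match(r"^(#{1,6})\s+(.*\S)\s*$", line)
        if match:
            level, title = len(match.group(1)), match.group(2)
            # Drop headings at the same or deeper level, then add this one.
            path = [(lvl, t) for lvl, t in path if lvl < level] + [(level, title)]
            current = {"heading_path": " > ".join(t for _, t in path), "lines": []}
            chunks.append(current)
        if current is None:
            # Text before the first heading gets an empty heading path.
            current = {"heading_path": "", "lines": []}
            chunks.append(current)
        current["lines"].append(line)
    return [
        {"heading_path": c["heading_path"], "text": "\n".join(c["lines"]).strip()}
        for c in chunks
    ]


sample = "\n".join([
    "# Widget API",
    "",
    "Widget API lets teams create and update customer widgets.",
    "",
    "## Authentication",
    "",
    "Use a server-side API key for every request.",
    "",
    "## Rate limits",
    "",
    "| Plan | Requests per minute |",
    "| --- | ---: |",
    "| Starter | 120 |",
    "| Pro | 600 |",
])

for chunk in chunk_by_headings(sample):
    print(chunk["heading_path"])
# Widget API
# Widget API > Authentication
# Widget API > Rate limits
```

Storing the path with each chunk means a retrieved passage still says where it came from, even when the chunk text itself only starts at a second-level heading.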

Token-window chunks

Best for: Long pages where roughly even chunk size matters more than exact document sections.

Example output:

Chunk 1: target 180 tokens
Widget API intro, Authentication, and part of Rate limits.

Chunk 2: target 180 tokens
Remaining Rate limits, Example request, and Common errors.

Watch for: Token windows can split tables or code if the source is dense. Review boundaries before exporting to JSONL.
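
A token window can be sketched in a few lines of Python. The version below is a rough illustration, not a reference implementation: `token_windows` is a hypothetical name, and tokens are approximated by whitespace-separated words, which is an assumption that a real tokenizer will not match exactly. It slices the original string, so newlines inside a window are preserved, but nothing stops a boundary from landing inside a table or code block.

```python
import re


def token_windows(text: str, target: int = 180, overlap: int = 20) -> list[str]:
    """Slice text into windows of about `target` tokens sharing `overlap` tokens.

    Tokens are approximated by runs of non-whitespace characters.
    """
    if target <= 0 or not 0 <= overlap < target:
        raise ValueError("target must be positive and overlap smaller than target")
    spans = [m.span() for m in re.finditer(r"\S+", text)]
    step = target - overlap
    windows = []
    for start in range(0, len(spans), step):
        end = min(start + target, len(spans))
        # Cut from the first token's start to the last token's end.
        windows.append(text[spans[start][0]:spans[end - 1][1]])
        if end == len(spans):
            break
    return windows


print(token_windows("a b c d e f g h i j", target=4, overlap=1))
# ['a b c d', 'd e f g', 'g h i j']
```

Because the boundaries depend only on the count, this is the preset where a boundary review matters most before export.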

Paragraph-safe chunks

Best for: Articles, explainers, and source notes where paragraph meaning matters more than exact token packing.

Example output:

Chunk 1:
Title, intro, and Authentication paragraph.

Chunk 2:
Rate limits heading plus the complete table.

Chunk 3:
Example request code block plus Common errors list.

Watch for: Paragraph-safe chunks can vary in size. Use token estimates to catch sections that grow too large.
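
The paragraph-safe idea can be sketched as a greedy packer: split on blank lines, then add whole paragraphs to a chunk until the next one would exceed a budget. This is a minimal sketch under assumptions: `paragraph_chunks` and `max_tokens` are hypothetical names, token counts are whitespace word counts, and a fenced code block that contains blank lines would still be split by this simple version.

```python
import re


def paragraph_chunks(text: str, max_tokens: int = 180) -> list[str]:
    """Pack whole paragraphs into chunks without ever splitting a paragraph."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks, current, size = [], [], 0
    for para in paragraphs:
        para_size = len(para.split())  # rough token estimate
        if current and size + para_size > max_tokens:
            chunks.append("\n\n".join(current))
            current, size = [], 0
        # An oversized paragraph still goes in whole; it just gets its own chunk.
        current.append(para)
        size += para_size
    if current:
        chunks.append("\n\n".join(current))
    return chunks


text = "one two three\n\nfour five\n\nsix seven eight nine"
print(paragraph_chunks(text, max_tokens=5))
# ['one two three\n\nfour five', 'six seven eight nine']
```

The budget is a soft limit here, which is why the size variation mentioned above shows up: a single long paragraph can exceed it.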

Code/table-safe chunks

Best for: Developer docs where broken tables or split fenced code blocks can damage retrieval quality.

Example output:

Chunk 1:
Widget API intro and Authentication.

Chunk 2:
Rate limits heading with the full Markdown table intact.

Chunk 3:
Example request heading with the full fenced bash block intact.

Chunk 4:
Common errors list.

Watch for: This preset may create more chunks, but it protects the source shapes that are most painful to reconstruct later.
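
One way to protect code and tables is to group lines into blocks first, so a chunker only ever cuts between blocks. The Python sketch below is an assumption-laden illustration: `split_blocks` and the `kind` labels are hypothetical, fences are detected by a line starting with three backticks, and table rows are detected by a leading `|`, so tables written without leading pipes would be treated as plain text.

```python
FENCE = "`" * 3  # built at runtime so the sketch can detect fenced code lines


def split_blocks(markdown: str) -> list[dict]:
    """Group lines into blocks; fenced code and table runs stay in one piece."""
    blocks = []
    buffer = []
    kind = "text"
    in_fence = False

    def flush():
        nonlocal buffer
        text = "\n".join(buffer).strip()
        if text:
            blocks.append({"kind": kind, "text": text})
        buffer = []

    for line in markdown.splitlines():
        stripped = line.strip()
        if in_fence:
            # Everything up to the closing fence belongs to the code block,
            # including blank lines.
            buffer.append(line)
            if stripped.startswith(FENCE):
                in_fence = False
                flush()
                kind = "text"
        elif stripped.startswith(FENCE):
            flush()
            kind, in_fence = "code", True
            buffer.append(line)
        elif stripped.startswith("|"):
            if kind != "table":
                flush()
                kind = "table"
            buffer.append(line)
        elif not stripped:
            flush()
            kind = "text"
        else:
            if kind != "text":
                flush()
                kind = "text"
            buffer.append(line)
    flush()
    return blocks


sample = "\n".join([
    "## Rate limits",
    "",
    "| Plan | Requests per minute |",
    "| --- | ---: |",
    "| Starter | 120 |",
    "",
    FENCE + "bash",
    "curl https://api.example.com/widgets",
    FENCE,
])
print([block["kind"] for block in split_blocks(sample)])
# ['text', 'table', 'code']
```

A packer can then combine neighboring blocks up to a size budget, treating each `table` and `code` block as indivisible.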

QA checklist

Review chunk quality before the source becomes embeddings.

Once a bad chunk is embedded, the failure gets harder to see. A short inspection pass before ingestion can save a lot of debugging later.

Can the chunk stand alone?

A retrieval hit should include enough heading context that the model can understand what the passage is about.

Did tables and code survive?

If a table is cut off from its header row or a fenced code block is split in half, the retrieved passage can be incomplete or misleading, and any answer built on it becomes harder to trust.

Are tiny chunks useful?

A tiny chunk can be fine for a definition, but too many tiny fragments usually create noisy retrieval results.

Is overlap doing real work?

Overlap should preserve context between neighboring chunks. It should not just duplicate large blocks everywhere.
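
The four checks above can be approximated with simple heuristics before anything is embedded. The Python sketch below is only a starting point under assumptions: `chunk_warnings`, `overlap_ratio`, the warning strings, and the `min_tokens` threshold are all hypothetical, chunks are assumed to be dicts with `heading_path` and `text` keys, and token counts are whitespace word counts.

```python
FENCE = "`" * 3  # built at runtime so the sketch can detect fenced code lines


def chunk_warnings(chunk: dict, min_tokens: int = 8) -> list[str]:
    """Return human-readable warnings for one chunk."""
    warnings = []
    text = chunk["text"]
    lines = [line.strip() for line in text.splitlines()]

    # Can the chunk stand alone?
    if not chunk.get("heading_path"):
        warnings.append("missing heading context")

    # Did code survive? An odd number of fence lines means a block was cut.
    if sum(line.startswith(FENCE) for line in lines) % 2 == 1:
        warnings.append("unbalanced code fence")

    # Did tables survive? Rows without a separator line lost their header.
    table_lines = [line for line in lines if line.startswith("|")]
    has_separator = any(
        "-" in line and set(line) <= set("|:- ") for line in table_lines
    )
    if table_lines and not has_separator:
        warnings.append("table rows without a header separator")

    # Are tiny chunks useful? Flag them for a human to decide.
    if len(text.split()) < min_tokens:
        warnings.append("tiny chunk")
    return warnings


def overlap_ratio(previous: str, current: str) -> float:
    """Share of `current` that repeats the end of `previous`, by word count."""
    prev_words, cur_words = previous.split(), current.split()
    longest = 0
    for size in range(1, min(len(prev_words), len(cur_words)) + 1):
        if prev_words[-size:] == cur_words[:size]:
            longest = size
    return longest / len(cur_words) if cur_words else 0.0


print(chunk_warnings({"heading_path": "", "text": FENCE + "bash\ncurl"}))
# ['missing heading context', 'unbalanced code fence', 'tiny chunk']
print(overlap_ratio("a b c d", "c d e f"))
# 0.5
```

A high overlap ratio across many neighboring chunks is a hint that overlap is duplicating content instead of carrying context.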

Export choice

The export format should match the next job.

PromptStage keeps these separate because the human review path and the ingestion path need different shapes. Use the same chunk inspection result, then export it in the format your next step expects.

Markdown export

Use this for human review, source QA, or sharing chunk boundaries with a teammate before you automate ingestion.

JSON export

Use this when another script or app needs structured chunk fields, warnings, heading paths, and token estimates.

JSONL export

Use this when each line should become one retrieval, embedding, or agent-context record.
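
As an illustration of the JSONL shape, the sketch below writes one JSON object per line using Python's standard `json` module. The field names (`id`, `heading_path`, `text`, `token_estimate`) and the function name `to_jsonl` are assumptions chosen for this example and are not PromptStage's actual export schema; the token estimate is a whitespace word count.

```python
import json


def to_jsonl(chunks: list[dict]) -> str:
    """Serialize chunks as JSON Lines: one complete JSON object per line."""
    lines = []
    for index, chunk in enumerate(chunks):
        record = {
            "id": index,
            "heading_path": chunk.get("heading_path", ""),
            "text": chunk["text"],
            "token_estimate": len(chunk["text"].split()),  # rough estimate
        }
        # json.dumps escapes newlines inside the text, so each record
        # stays on a single physical line.
        lines.append(json.dumps(record, ensure_ascii=False))
    return "".join(line + "\n" for line in lines)


chunks = [
    {"heading_path": "Widget API > Authentication",
     "text": "## Authentication\n\nUse a server-side API key for every request."},
    {"heading_path": "Widget API > Common errors",
     "text": "## Common errors\n\n- 401 when the API key is missing"},
]
print(to_jsonl(chunks), end="")
```

Each output line can be parsed independently with `json.loads`, which is what makes the format convenient for streaming records into an embedding or ingestion job.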

Workflow paths

Clean the source, inspect the chunks, then move into your retrieval stack.

If the source started as a web page, use HTML to Markdown for AI first. If the source is already cleaned Markdown, open the chunk inspector directly and compare presets before exporting JSON or JSONL.


FAQ

Chunking decisions should be tested against real retrieval questions.

What is the best chunk size for RAG?

There is no universal size. Start with chunks that are large enough to preserve meaning and small enough to retrieve precisely, then test against real questions.

Should Markdown be chunked by headings?

Often yes, if the headings are meaningful. Heading paths make retrieved chunks easier to inspect and explain.

Should I chunk before or after cleaning HTML?

Clean first. Chunking raw page chrome can preserve navigation, cookie banners, and repeated boilerplate as retrieval content.