chunking
chunktuner.chunking
FixedTokenStrategy
Sliding tiktoken windows with optional overlap (baseline RAG chunker).
Source code in src/chunktuner/chunking/fixed_tokens.py
chunk
Split doc.content into fixed-size token spans with correct char offsets.
Source code in src/chunktuner/chunking/fixed_tokens.py
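The fixed-size windowing idea can be sketched as follows. This is an illustration, not the code in fixed_tokens.py: it assumes a stand-in tokenizer that treats each run of non-whitespace characters as one token (the real strategy uses tiktoken), and the names `token_spans`, `fixed_token_windows`, `chunk_size`, and `overlap` are hypothetical.

```python
import re
from typing import List, Tuple


def token_spans(text: str) -> List[Tuple[int, int]]:
    """Stand-in tokenizer (NOT tiktoken): one token per run of non-whitespace.

    Returns (start, end) character offsets for each token.
    """
    return [m.span() for m in re.finditer(r"\S+", text)]


def fixed_token_windows(
    text: str, chunk_size: int, overlap: int = 0
) -> List[Tuple[int, int, str]]:
    """Slide a window of `chunk_size` tokens, stepping by `chunk_size - overlap`.

    Each result is (char_start, char_end, chunk_text), so that
    text[char_start:char_end] == chunk_text.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    spans = token_spans(text)
    step = chunk_size - overlap
    chunks: List[Tuple[int, int, str]] = []
    for i in range(0, len(spans), step):
        window = spans[i : i + chunk_size]
        start, end = window[0][0], window[-1][1]
        chunks.append((start, end, text[start:end]))
        # Stop once a window reaches the last token, so no trailing
        # chunk is emitted that only repeats the overlap.
        if i + chunk_size >= len(spans):
            break
    return chunks
```

With `fixed_token_windows("a b c d e f g", chunk_size=3, overlap=1)`, consecutive windows share one token, and every returned offset pair slices the original string back to the chunk text.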
RecursiveCharacterStrategy
Hierarchical character splits (paragraphs, lines, sentences) with overlap.
Source code in src/chunktuner/chunking/recursive_character.py
chunk
Split on separators and produce overlapping spans of at most chunk_size_chars characters.
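A minimal sketch of hierarchical splitting with overlap is shown below. It is not the code in recursive_character.py: the separator list, the greedy merge, and the names `recursive_char_spans`, `_split_pieces`, and `overlap_chars` are assumptions made for illustration. The sketch tries coarse separators first, falls back to finer ones (and finally to a hard cut) when a piece is still too long, then merges adjacent pieces into spans no longer than `chunk_size_chars`.

```python
from typing import List, Sequence, Tuple

Span = Tuple[int, int]


def _split_pieces(
    text: str, start: int, end: int, seps: Sequence[str], max_len: int
) -> List[Span]:
    """Cut text[start:end] into contiguous pieces, each at most max_len chars."""
    if end - start <= max_len:
        return [(start, end)]
    if not seps:
        # No separators left: fall back to hard cuts.
        return [(i, min(i + max_len, end)) for i in range(start, end, max_len)]
    sep, rest = seps[0], seps[1:]
    pieces: List[Span] = []
    pos = start
    while True:
        idx = text.find(sep, pos, end)
        if idx == -1:
            break
        cut = idx + len(sep)  # keep the separator attached to the left piece
        pieces.extend(_split_pieces(text, pos, cut, rest, max_len))
        pos = cut
    if pos < end:
        pieces.extend(_split_pieces(text, pos, end, rest, max_len))
    return pieces


def recursive_char_spans(
    text: str,
    chunk_size_chars: int,
    overlap_chars: int = 0,
    separators: Sequence[str] = ("\n\n", "\n", ". ", " "),
) -> List[Span]:
    """Return (start, end) spans of at most chunk_size_chars characters."""
    if chunk_size_chars <= 0:
        raise ValueError("chunk_size_chars must be positive")
    if not 0 <= overlap_chars < chunk_size_chars:
        raise ValueError("overlap_chars must be in [0, chunk_size_chars)")
    if not text:
        return []

    pieces = _split_pieces(text, 0, len(text), separators, chunk_size_chars)
    spans: List[Span] = []
    i, n = 0, len(pieces)
    while i < n:
        # Greedily absorb following pieces while the span still fits.
        j = i
        while j + 1 < n and pieces[j + 1][1] - pieces[i][0] <= chunk_size_chars:
            j += 1
        spans.append((pieces[i][0], pieces[j][1]))
        if j == n - 1:
            break
        # Carry trailing pieces into the next span while they fit in the
        # overlap budget; always advance by at least one piece.
        k = j + 1
        while k - 1 > i and pieces[j][1] - pieces[k - 1][0] <= overlap_chars:
            k -= 1
        i = k
    return spans
```

In this sketch the overlap is applied at piece boundaries, so the shared region between neighbouring spans never cuts through a piece and may be shorter than `overlap_chars`.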