chunking.pdf_structural
chunktuner.chunking.pdf_structural
Layout-inspired splits for PDF-derived Markdown. "## Page N" markers are optional.
PdfStructuralStrategy
Page or section headings define regions; regions longer than the character cap are split into smaller pieces.
Source code in src/chunktuner/chunking/pdf_structural.py
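As an illustration of heading-defined regions, here is a minimal sketch, not the code in pdf_structural.py. It assumes a region starts at any Markdown heading line (such as "## Page 3") and that text before the first heading forms its own region. The names HEADING_RE and split_regions are hypothetical.

```python
import re

# Assumption: any ATX heading line ("# ...", "## Page 3", ...) opens a new region.
HEADING_RE = re.compile(r"^#{1,6}[ \t]+\S.*$", re.MULTILINE)


def split_regions(markdown: str) -> list[str]:
    """Split text into regions that each start at a heading (hypothetical helper)."""
    starts = [m.start() for m in HEADING_RE.finditer(markdown)]
    # Text before the first heading, or a document without headings,
    # is kept as a region of its own.
    if not starts or starts[0] != 0:
        starts.insert(0, 0)
    bounds = starts + [len(markdown)]
    regions = [markdown[a:b] for a, b in zip(bounds, bounds[1:])]
    # Drop regions that contain only whitespace.
    return [r for r in regions if r.strip()]


if __name__ == "__main__":
    text = "intro\n## Page 1\nalpha\n## Page 2\nbeta\n"
    for region in split_regions(text):
        print(repr(region))
```

Because page markers are optional, the sketch falls back to a single region when no heading is present, which leaves the character cap as the only splitting rule for such input.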
chunk
Emit one chunk per sub-region of at most max_region_chars characters within each structural span.
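The character cap could be applied to a single span roughly as follows. This is a sketch under assumptions: the function name split_by_cap is hypothetical, and preferring a newline as the break point is an illustrative choice that the actual chunk method may not share. Only the max_region_chars parameter name comes from the description above.

```python
def split_by_cap(region: str, max_region_chars: int) -> list[str]:
    """Cut one region into consecutive pieces of at most max_region_chars characters."""
    if max_region_chars <= 0:
        raise ValueError("max_region_chars must be positive")
    pieces = []
    start = 0
    while start < len(region):
        end = min(start + max_region_chars, len(region))
        # Assumption: when the window ends mid-region, break after the last
        # newline inside it, if any, so that lines stay intact.
        if end < len(region):
            newline = region.rfind("\n", start, end)
            if newline > start:
                end = newline + 1
        pieces.append(region[start:end])
        start = end
    return pieces


if __name__ == "__main__":
    for piece in split_by_cap("aaaa\nbbbb\ncccc", 6):
        print(repr(piece))
```

The pieces concatenate back to the original region, so no text is lost or duplicated at the cut points.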