chunking.structural_semantic
chunktuner.chunking.structural_semantic
Structural regions first, then token-sized windows within each region (pdf/docx/pptx markdown).
StructuralSemanticStrategy
Refines coarse PdfStructuralStrategy regions into fixed-token sub-windows.
Source code in src/chunktuner/chunking/structural_semantic.py
chunk
Map structural regions to absolute-offset sub-chunks using FixedTokenStrategy.
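The offset mapping described above can be illustrated with a minimal sketch. It is not chunktuner's implementation: `Chunk`, `fixed_token_windows`, and `refine_regions` are hypothetical names, regions are assumed to be plain `(start, end)` character spans, and whitespace splitting stands in for whatever tokenizer FixedTokenStrategy actually uses.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Hypothetical sub-chunk record; offsets index into the full document."""
    start: int  # absolute character offset, inclusive
    end: int    # absolute character offset, exclusive
    text: str


def fixed_token_windows(text: str, max_tokens: int) -> list[tuple[int, int]]:
    """Return (start, end) spans relative to `text`, each covering at most
    `max_tokens` whitespace-delimited tokens (a stand-in tokenizer)."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    spans = [m.span() for m in re.finditer(r"\S+", text)]
    windows = []
    for i in range(0, len(spans), max_tokens):
        last = min(i + max_tokens, len(spans)) - 1
        windows.append((spans[i][0], spans[last][1]))
    return windows


def refine_regions(
    document: str, regions: list[tuple[int, int]], max_tokens: int
) -> list[Chunk]:
    """Split each structural region into token windows and shift the
    region-relative spans by the region start to get absolute offsets."""
    chunks = []
    for region_start, region_end in regions:
        region_text = document[region_start:region_end]
        for rel_start, rel_end in fixed_token_windows(region_text, max_tokens):
            abs_start = region_start + rel_start
            abs_end = region_start + rel_end
            chunks.append(Chunk(abs_start, abs_end, document[abs_start:abs_end]))
    return chunks
```

Because every sub-window is computed inside a single region, no chunk in this sketch crosses a structural boundary, and `document[chunk.start:chunk.end]` always reproduces `chunk.text`.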