Troubleshooting¶
No module named … for optional strategies or MCP¶
Install the matching optional extra (see pyproject.toml [project.optional-dependencies]):
| Import / error signal | Install |
|---|---|
semchunk (semantic / markdown strategies) |
uv add "chunktuner[semantic]" or pip install 'chunktuner[semantic]' |
docling (PDF/DOCX/PPTX ingestion in FileIngestor) |
uv add "chunktuner[docling]" |
ragas / datasets when enabling RAGAS metrics |
uv add "chunktuner[ragas]" |
tree_sitter / grammar packages for code_ast |
uv add "chunktuner[code]" |
mcp when running chunk-tune-mcp |
uv add "chunktuner[mcp]" or uv sync --extra mcp |
Authentication errors (embeddings)¶
LiteLLM reads provider keys from standard environment variables (see LiteLLM providers). Examples: OPENAI_API_KEY for OpenAI. Pass --embedding-model on the CLI only when you intend to call a remote model, and confirm with --yes where the command requires it (recommend, evaluate).
MCP server not visible in your host¶
- Run
uvx --from "chunktuner[mcp]" chunk-tune-mcp(oruv run chunk-tune-mcpfrom a dev checkout with themcpextra) and confirm it stays running without tracebacks. - Set
CHUNK_TUNER_BASE_DIRto an absolute path to an existing directory that contains (or is a parent of) the files you pass to tools. - Restart the MCP host after editing its config.
- MCP logging is configured on stderr in
chunktuner.mcp.server.runatINFOby default — check the host’s server logs, not stdout (stdout is reserved for JSON-RPC).
Path errors in MCP (path outside base dir or similar)¶
require_under_base rejects paths that do not resolve under CHUNK_TUNER_BASE_DIR. Use paths inside that tree, and set the base to a directory that is an ancestor of your corpus files.
Low retrieval metric scores¶
- Try different
max_tokens/ overlap settings onfixed_tokens, or other strategies (chunk-tune analyzesuggests a starting heuristic). - Inspect
avg_chunk_lengthand duplication in the evaluation output — pathological splits can hurt recall.
Offset invariant errors (chunk.text does not match doc.content)¶
Strategies must satisfy doc.content[chunk.start_offset:chunk.end_offset] == chunk.text. If you see a violation, file an issue with the strategy name, parameters, and document type. Offset tests live in tests/unit/test_chunking_offsets.py.