Controls the processing and generation for the segment.
crop_image controls whether to crop the file's images to the segment's bounding box.
The cropped image will be stored in the segment's image field. Use All to always crop,
or Auto to only crop when needed for post-processing.
html is the HTML output for the segment, generated either through heuristics (Auto) or using Chunkr fine-tuned models (LLM).
llm is the LLM-generated output for the segment; it uses off-the-shelf models to generate a custom output for the segment.
markdown is the Markdown output for the segment, generated either through heuristics (Auto) or using Chunkr fine-tuned models (LLM).
embed_sources defines which content sources will be included in the chunk's embed field and counted towards the chunk length.
The array's order determines the sequence in which content appears in the embed field (e.g., [Markdown, LLM] means Markdown content
is followed by LLM content). This directly affects what content is available for embedding and retrieval.
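
The ordering rule for embed_sources can be illustrated with a small sketch. This is not the Chunkr SDK: the plain dictionary shape of segment_config, the build_embed helper, the newline separator, and the sample content are all assumptions made for illustration. Only the field names and the values All, Auto, LLM, and Markdown come from the description above.

```python
from typing import Dict, List

# Hypothetical configuration shape; the field names follow the description
# above, but the real request schema may differ.
segment_config = {
    "crop_image": "Auto",                  # crop only when post-processing needs it
    "html": "Auto",                        # heuristic HTML generation
    "markdown": "LLM",                     # fine-tuned model Markdown generation
    "embed_sources": ["Markdown", "LLM"],  # Markdown first, then LLM output
}


def build_embed(content: Dict[str, str], embed_sources: List[str]) -> str:
    """Join the selected sources in the order listed in embed_sources.

    Hypothetical helper: it only mimics the documented ordering behavior.
    The newline separator is an assumption.
    """
    parts = [content[name] for name in embed_sources if content.get(name)]
    return "\n".join(parts)


# Made-up segment content, keyed by source name.
segment_content = {
    "Markdown": "# Revenue\nQ1 revenue grew 12%.",
    "HTML": "<h1>Revenue</h1><p>Q1 revenue grew 12%.</p>",
    "LLM": "Summary: revenue rose 12% in Q1.",
}

embed = build_embed(segment_content, segment_config["embed_sources"])
print(embed)
print(len(embed))  # only the listed sources count towards the length
```

Because HTML is not listed in embed_sources, it contributes nothing to the embed text or its length. Reversing the list to [LLM, Markdown] would place the LLM output first.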