ChunkProcessing: {
    ignore_headers_and_footers?: boolean;
    target_length?: number;
    tokenizer?: TokenizerType;
}

Controls the settings for chunking and for post-processing each chunk.
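As a rough illustration, the sketch below redeclares the shape locally and builds two configurations. It is not taken from the library: `TokenizerType` is stubbed as `unknown` because its definition is not shown here, and `describeChunking` is a hypothetical helper that only restates the documented meaning of `target_length`.

```typescript
// Placeholder only: the real TokenizerType is defined elsewhere in the library.
type TokenizerType = unknown;

// Local copy of the documented shape, so the example is self-contained.
interface ChunkProcessing {
  ignore_headers_and_footers?: boolean;
  target_length?: number;
  tokenizer?: TokenizerType;
}

// Hypothetical helper: summarizes what a given target_length means.
function describeChunking(cfg: ChunkProcessing): string {
  if (cfg.target_length === undefined) {
    return "target length left unset";
  }
  if (cfg.target_length === 0) {
    return "one segment per chunk";
  }
  return `about ${cfg.target_length} words per chunk`;
}

// A target_length of 0 keeps every segment in its own chunk.
const perSegment: ChunkProcessing = {
  ignore_headers_and_footers: true,
  target_length: 0,
};

// A positive target_length asks for chunks of roughly that many words.
const sized: ChunkProcessing = {
  ignore_headers_and_footers: true,
  target_length: 512,
};

console.log(describeChunking(perSegment));
console.log(describeChunking(sized));
```

Every field is optional, so an empty object is also a valid `ChunkProcessing` value; what the library does with unset fields is not stated here.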

Type declaration

  • Optional ignore_headers_and_footers?: boolean

    Whether to ignore headers and footers during chunking. Enabling this is recommended, because headers and footers break the reading order across pages.

  • Optional target_length?: number

    The target number of words in each chunk. If 0, each chunk will contain a single segment.

  • Optional tokenizer?: TokenizerType