Jina Chunking

Blocks for splitting text into semantic chunks using Jina AI.

What it is

Chunks texts using Jina AI's segmentation service.

How it works

This block uses Jina AI's segmentation service to split texts into semantically meaningful chunks. Unlike simple splitting by character count, Jina's chunking preserves semantic coherence, making it ideal for RAG applications.
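For contrast, here is a minimal sketch of the "simple splitting by character count" baseline mentioned above. It is not part of the block; the function name `split_by_char_count` and the sample text are hypothetical and only illustrate why fixed-width cuts lose coherence.

```python
def split_by_char_count(text: str, max_chars: int) -> list[str]:
    """Naive baseline: cut every max_chars characters, ignoring word and sentence boundaries."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


# Hypothetical sample text, chosen only for illustration.
sample = "Chunking matters. Retrieval quality depends on it."
naive_chunks = split_by_char_count(sample, 20)

# The first chunk is "Chunking matters. Re": the cut lands in the middle of "Retrieval".
print(naive_chunks)
```

A semantic chunker aims to place boundaries at natural breaks such as sentence ends, so each chunk stays readable on its own.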

Set max_chunk_length to cap the length of each chunk, and enable return_tokens to also receive token information for each chunk.
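The flow described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the block's actual implementation: the endpoint URL, the request field names (`content`, `return_chunks`), the response keys, and the default of 1000 are assumptions based on Jina AI's public Segmenter API, and the helper names `build_payload`, `post_json`, and `chunk_texts` are hypothetical.

```python
import json
import urllib.request
from typing import Any, Callable

# Assumption: Jina AI's public Segmenter endpoint; this page does not state the URL the block uses.
SEGMENT_URL = "https://segment.jina.ai/"


def build_payload(text: str, max_chunk_length: int, return_tokens: bool) -> dict[str, Any]:
    """Build the JSON body for one text. Field names are assumed, not confirmed by this page."""
    return {
        "content": text,
        "return_chunks": True,
        "return_tokens": return_tokens,
        "max_chunk_length": max_chunk_length,
    }


def post_json(url: str, payload: dict[str, Any], api_key: str) -> dict[str, Any]:
    """Send one POST request with a bearer token and decode the JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def chunk_texts(
    texts: list[str],
    api_key: str,
    max_chunk_length: int = 1000,  # assumed default, not taken from this page
    return_tokens: bool = False,
    post: Callable[[str, dict[str, Any], str], dict[str, Any]] = post_json,
) -> dict[str, list[Any]]:
    """Chunk each text in turn and collect the results into flat lists."""
    chunks: list[Any] = []
    tokens: list[Any] = []
    for text in texts:
        result = post(SEGMENT_URL, build_payload(text, max_chunk_length, return_tokens), api_key)
        chunks.extend(result.get("chunks", []))
        if return_tokens:
            tokens.extend(result.get("tokens", []))
    return {"chunks": chunks, "tokens": tokens}
```

The `post` parameter is injectable so the collection logic can be exercised with a stub instead of a live request. With a real API key, `chunk_texts(["some long document"], api_key="...", max_chunk_length=500)` would send one request per input text.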

Inputs

| Input | Description | Type | Required |
| --- | --- | --- | --- |
| texts | List of texts to chunk | List[Any] | Yes |
| max_chunk_length | Maximum length of each chunk | int | No |
| return_tokens | Whether to return token information | bool | No |

Outputs

| Output | Description | Type |
| --- | --- | --- |
| error | Error message if the operation failed | str |
| chunks | List of chunked texts | List[Any] |
| tokens | List of token information for each chunk | List[Any] |

Possible use cases

RAG Preprocessing: Chunk documents for retrieval-augmented generation systems.

Embedding Preparation: Split long texts into optimal chunks for embedding generation.

Document Processing: Break down large documents for analysis or storage in vector databases.

