module pixeltable.functions.jina

Pixeltable UDFs that wrap Jina AI APIs for embeddings and reranking. To use them, specify the API key either in the JINA_API_KEY environment variable or as api_key in the jina section of the Pixeltable config file.
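A minimal sketch of the environment-variable route, for interactive sessions. The helper name ensure_jina_api_key and its prompt argument are hypothetical and not part of Pixeltable; only the JINA_API_KEY variable name comes from the text above.

```python
import getpass
import os


def ensure_jina_api_key(prompt=getpass.getpass) -> str:
    """Return JINA_API_KEY, asking for it once if it is not set yet.

    Hypothetical convenience helper, not part of Pixeltable.
    """
    key = os.environ.get('JINA_API_KEY')
    if not key:
        # Ask interactively; the key is stored only in this process's
        # environment, not written to disk.
        key = prompt('Jina API key: ')
        os.environ['JINA_API_KEY'] = key
    return key
```

Calling ensure_jina_api_key() before the first Jina UDF is evaluated keeps the key out of notebooks and scripts. For non-interactive jobs, setting the variable in the shell or using the config file is likely the better fit.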

udf embeddings()

Signature

@pxt.udf
embeddings(
    input: pxt.String,
    *,
    model: pxt.String,
    task: pxt.String | None = None,
    dimensions: pxt.Int | None = None,
    late_chunking: pxt.Bool | None = None
) -> pxt.Array[(None,), float32]

Creates embedding vectors for the input text using Jina AI embedding models. Equivalent to the Jina AI embeddings API endpoint. For additional details, see: https://jina.ai/embeddings/

Request throttling: Applies the rate limit set in the config (section jina, key rate_limit). If no rate limit is configured, uses a default of 600 RPM.

Parameters:

input (pxt.String): The text to embed.
model (pxt.String): The Jina embedding model to use. See available models at https://jina.ai/embeddings/.
task (pxt.String | None): Task-specific embedding optimization. Options:
- retrieval.query: For search queries
- retrieval.passage: For documents/passages to be searched
- separation: For clustering/separation tasks
- classification: For classification tasks
- text-matching: For semantic similarity
dimensions (pxt.Int | None): Output embedding dimensions (optional). If not specified, uses the model’s default dimension.
late_chunking (pxt.Bool | None): Enable late chunking for long documents.
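The optional parameters above can be collected and checked before they are passed to embeddings(). The sketch below is illustrative only: VALID_TASKS and embedding_kwargs are hypothetical names, and the task list is copied from the options documented above rather than read from the Jina API.

```python
# Task names as listed in the parameter description above.
VALID_TASKS = {
    'retrieval.query',
    'retrieval.passage',
    'separation',
    'classification',
    'text-matching',
}


def embedding_kwargs(model, task=None, dimensions=None, late_chunking=None):
    """Build keyword arguments for embeddings(), omitting unset options.

    Hypothetical helper, not part of Pixeltable.
    """
    if task is not None and task not in VALID_TASKS:
        raise ValueError(f'unknown task {task!r}; expected one of {sorted(VALID_TASKS)}')
    if dimensions is not None and dimensions <= 0:
        raise ValueError('dimensions must be a positive integer')
    kwargs = {
        'model': model,
        'task': task,
        'dimensions': dimensions,
        'late_chunking': late_chunking,
    }
    # Leave out anything still None so the documented defaults apply.
    return {name: value for name, value in kwargs.items() if value is not None}


# Stored documents and search queries use different task settings.
passage_kwargs = embedding_kwargs('jina-embeddings-v3', task='retrieval.passage')
query_kwargs = embedding_kwargs('jina-embeddings-v3', task='retrieval.query')
```

The resulting dictionaries could then be unpacked into a call such as jina.embeddings(tbl.text, **passage_kwargs), so a mistyped task name fails locally instead of at the API.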

Returns:

pxt.Array[(None,), float32]: An array representing the embedding of input.

Examples:

Add a computed column that applies jina-embeddings-v3 to an existing column:

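from pixeltable.functions import jina

# tbl is an existing Pixeltable table with a string column named text
tbl.add_computed_column(
    embed=jina.embeddings(
        tbl.text, model='jina-embeddings-v3', task='retrieval.passage'
    )
)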

Add an embedding index:

tbl.add_embedding_index(
    'text', string_embed=jina.embeddings.using(model='jina-embeddings-v3')
)
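The default limit of 600 RPM mentioned under request throttling gives a rough lower bound on how long a large table takes to embed. The sketch below is back-of-the-envelope arithmetic only; it assumes one API request per row, which may not match how Pixeltable actually issues requests, and the function name is hypothetical.

```python
def min_seconds_for_requests(num_requests: int, rpm: int = 600) -> float:
    """Lower bound on wall-clock seconds needed to send num_requests
    requests when at most rpm requests are allowed per minute."""
    if rpm <= 0:
        raise ValueError('rpm must be positive')
    return num_requests * 60.0 / rpm


# 10,000 rows at the default 600 RPM, assuming one request per row.
estimate = min_seconds_for_requests(10_000)  # 1000.0 seconds
```

If the estimate is too slow for a given workload, raising rate_limit in the jina config section is the knob the text above describes, subject to whatever limits apply to the Jina account.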

udf rerank()

Signature

@pxt.udf
rerank(
    query: pxt.String,
    documents: pxt.Json,
    *,
    model: pxt.String,
    top_n: pxt.Int | None = None,
    return_documents: pxt.Bool | None = None
) -> pxt.Json

Reranks documents based on their relevance to a query using Jina AI reranker models. Equivalent to the Jina AI rerank API endpoint. For additional details, see: https://jina.ai/reranker/

Request throttling: Applies the rate limit set in the config (section jina, key rate_limit). If no rate limit is configured, uses a default of 600 RPM.

Parameters:

query (pxt.String): The query string to rank documents against.
documents (pxt.Json): The list of documents to rerank.
model (pxt.String): The Jina reranker model to use. See available models at https://jina.ai/reranker/.
top_n (pxt.Int | None): Number of top results to return. If not specified, returns all documents.
return_documents (pxt.Bool | None): Whether to include the original document text in results.

Returns:

pxt.Json: A dictionary containing:
- results: List of reranking results with index and relevance_score (and document if return_documents=True)
- usage: Token usage information
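The returned dictionary can be joined back to the input documents in plain Python. This is a sketch under two assumptions: that index refers to the position of a document in the documents list passed to rerank(), and that the sample values below, which are made up for illustration, resemble a real response. The helper name ranked_documents is hypothetical.

```python
def ranked_documents(documents, rerank_result):
    """Pair each reranked entry with its original document text,
    highest relevance first. Hypothetical helper, not part of Pixeltable."""
    pairs = []
    for entry in rerank_result['results']:
        # Assumption: `index` is the position in the input documents list.
        pairs.append((documents[entry['index']], entry['relevance_score']))
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


docs = ['intro to SQL', 'vector search basics', 'baking bread']
# Made-up result in the documented shape (results plus usage).
sample_result = {
    'results': [
        {'index': 1, 'relevance_score': 0.92},
        {'index': 0, 'relevance_score': 0.31},
    ],
    'usage': {'total_tokens': 42},
}
top = ranked_documents(docs, sample_result)
```

With return_documents=True the document text is already present in each result, so a lookup like this is mainly useful when that option is left unset.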

Examples:

Rerank search results for better relevance:

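from pixeltable.functions import jina

# tbl is an existing table with a string column query and a JSON column
# candidate_docs that holds the list of documents to rerank
tbl.add_computed_column(
    reranked=jina.rerank(
        tbl.query,
        tbl.candidate_docs,
        model='jina-reranker-v2-base-multilingual',
        top_n=5,
    )
)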
