module pixeltable.functions.jina
Pixeltable UDFs that wrap Jina AI APIs for embeddings and reranking. To use them, specify the API key either in the JINA_API_KEY
environment variable or as api_key in the jina section of the Pixeltable config file.
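The two key sources described above can be sketched as a small lookup. This is only an illustration: `resolve_jina_api_key` is a hypothetical helper, not part of Pixeltable, the config is assumed to be already parsed into a dict, and the precedence (environment variable first) is an assumption that the text above does not state.

```python
import os
from typing import Optional


def resolve_jina_api_key(config: dict) -> Optional[str]:
    """Return the Jina API key, or None if neither source provides one.

    Hypothetical helper. Assumption: the environment variable is
    consulted before the config file.
    """
    # Source 1: the JINA_API_KEY environment variable.
    key = os.environ.get("JINA_API_KEY")
    if key:
        return key
    # Source 2: `api_key` in the `jina` section of the parsed config.
    return config.get("jina", {}).get("api_key")
```

A `None` result means neither source is set, in which case the UDFs cannot authenticate.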
udf embeddings()
Signature
Request throttling: applies the rate limit set in the Pixeltable config (section jina, key rate_limit). If no rate
limit is configured, a default of 600 RPM is used.
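To make the 600 RPM default concrete: it averages one request every 0.1 seconds. The sketch below shows that arithmetic and one simple way a client could space calls evenly. `min_interval_seconds` and `EvenSpacingThrottle` are hypothetical names for illustration; this is not how Pixeltable itself enforces the limit.

```python
import time


def min_interval_seconds(requests_per_minute: float) -> float:
    """Smallest gap between requests that stays within the RPM budget."""
    if requests_per_minute <= 0:
        raise ValueError("requests_per_minute must be positive")
    return 60.0 / requests_per_minute


class EvenSpacingThrottle:
    """Hypothetical client-side throttle that spaces calls evenly."""

    def __init__(self, requests_per_minute: float = 600) -> None:
        self.interval = min_interval_seconds(requests_per_minute)
        self._next_allowed = 0.0  # monotonic time of the next free slot

    def wait(self) -> float:
        """Sleep until the next free slot; return the seconds slept."""
        now = time.monotonic()
        delay = max(0.0, self._next_allowed - now)
        if delay:
            time.sleep(delay)
        # Reserve the slot after the one just used.
        self._next_allowed = max(now, self._next_allowed) + self.interval
        return delay
```

Calling `wait()` before each request keeps the average rate at or below the configured RPM, at the cost of never allowing bursts.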
Parameters:
- input (pxt.String): The text to embed.
- model (pxt.String): The Jina embedding model to use. See available models at https://jina.ai/embeddings/.
- task (pxt.String | None): Task-specific embedding optimization. Options:
  - retrieval.query: For search queries
  - retrieval.passage: For documents/passages to be searched
  - separation: For clustering/separation tasks
  - classification: For classification tasks
  - text-matching: For semantic similarity
- dimensions (pxt.Int | None): Output embedding dimensions (optional). If not specified, uses the model’s default dimension.
- late_chunking (pxt.Bool | None): Enable late chunking for long documents.
Returns:
- pxt.Array[(None,), float32]: An array representing the embedding of input.
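The optional parameters above can be checked before they are passed on. The following is a minimal sketch, assuming that `None` means "not set" and that only the five task values listed above are valid. `embedding_options` and `JINA_EMBEDDING_TASKS` are hypothetical names, not part of Pixeltable.

```python
from typing import Any, Dict, Optional

# The five task values listed in the parameter description.
JINA_EMBEDDING_TASKS = {
    "retrieval.query",
    "retrieval.passage",
    "separation",
    "classification",
    "text-matching",
}


def embedding_options(
    task: Optional[str] = None,
    dimensions: Optional[int] = None,
    late_chunking: Optional[bool] = None,
) -> Dict[str, Any]:
    """Validate the optional arguments and keep only those that are set."""
    if task is not None and task not in JINA_EMBEDDING_TASKS:
        raise ValueError(f"unknown task: {task!r}")
    if dimensions is not None and dimensions <= 0:
        raise ValueError("dimensions must be a positive integer")
    candidates = {
        "task": task,
        "dimensions": dimensions,
        "late_chunking": late_chunking,
    }
    # Drop unset values so the model's defaults apply.
    return {name: value for name, value in candidates.items() if value is not None}
```

For retrieval, one would typically build the options with `task="retrieval.passage"` for the stored documents and `task="retrieval.query"` for the search string, then unpack the resulting dict as keyword arguments to `embeddings()`.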
udf rerank()
Signature
Request throttling: applies the rate limit set in the Pixeltable config (section jina, key rate_limit). If no rate
limit is configured, a default of 600 RPM is used.
Parameters:
- query (pxt.String): The query string to rank documents against.
- documents (pxt.Json): The list of documents to rerank.
- model (pxt.String): The Jina reranker model to use. See available models at https://jina.ai/reranker/.
- top_n (pxt.Int | None): Number of top results to return. If not specified, returns all documents.
- return_documents (pxt.Bool | None): Whether to include the original document text in results.
Returns:
- pxt.Json: A dictionary containing:
  - results: List of reranking results with index and relevance_score (and document if return_documents=True)
  - usage: Token usage information
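Because each result carries an index and a relevance_score, the returned dictionary can be mapped back onto the input list. The sketch below assumes that index is the zero-based position of a document in the documents argument and that the documents are plain strings. `ranked_documents` is a hypothetical helper, and the sample response is hand-written in the documented shape with invented scores, not real API output.

```python
from typing import Any, Dict, List, Tuple


def ranked_documents(
    documents: List[str], response: Dict[str, Any]
) -> List[Tuple[str, float]]:
    """Pair each result with its source document, best match first."""
    results = sorted(
        response["results"],
        key=lambda result: result["relevance_score"],
        reverse=True,
    )
    # Assumption: `index` points into the original `documents` list.
    return [(documents[r["index"]], r["relevance_score"]) for r in results]


documents = [
    "Paris is the capital of France.",
    "Bananas are yellow.",
    "France borders Spain.",
]

# Hand-written sample; scores are invented for illustration only.
sample_response = {
    "results": [
        {"index": 2, "relevance_score": 0.47},
        {"index": 0, "relevance_score": 0.91},
    ],
    "usage": {},  # token usage fields omitted; they are not described above
}

ranking = ranked_documents(documents, sample_response)
```

Sorting explicitly means the helper does not rely on the order in which results arrive. With top_n set, only the returned subset of documents appears in the ranking.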