Embedding indexes let you search your data by meaning rather than by keyword. They work with text, images, audio, video, and documents, which makes them a practical basis for search across content types.
In the query phase, you search your indexed content by calling the .similarity() method on an indexed column.
Similarity Search
Use in a Computed Column
sim = docs.content.similarity("what is the documentation")

# Return the top-k most similar documents
results = (
    docs.order_by(sim, asc=False)
    .select(docs.content, docs.metadata, score=sim)
    .limit(10)
    .collect()  # execute the query so the rows can be iterated
)

for row in results:
    print(f"Similarity: {row['score']:.3f}")
    print(f"Text: {row['content']}\n")
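Conceptually, ordering by a similarity score and limiting to k rows is a top-k ranking over embedding vectors. The following is a minimal NumPy sketch of that idea, not Pixeltable's implementation. It assumes a cosine metric (the metric an index actually uses depends on how the index was created), and the function name `top_k_by_cosine` and the toy vectors are hypothetical.

```python
import numpy as np

def top_k_by_cosine(query_vec, doc_vecs, k=10):
    """Return (row_index, score) pairs for the k rows most similar to query_vec."""
    # Normalize so that a dot product equals cosine similarity
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q
    # Sort descending by score and keep the first k
    order = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in order]

# Toy 3-dimensional "embeddings", made up for illustration
doc_vecs = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
])
query_vec = np.array([1.0, 0.0, 0.0])

top = top_k_by_cosine(query_vec, doc_vecs, k=2)
for idx, score in top:
    print(f"doc {idx}: similarity {score:.3f}")
```

In practice the index performs this ranking for you; the sketch is only meant to show what the score and the descending order represent.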
Pixeltable exposes the raw embedding vectors through the .embedding() method, so you can retrieve the vector representations that power similarity search.
import numpy as np

# Access embeddings from a column with a single index
results = docs.select(
    docs.content,
    embedding=docs.content.embedding()
).limit(5).collect()

# Access embeddings from a column with multiple indices
results = docs.select(
    docs.content,
    embedding=docs.content.embedding(idx='custom_idx_name')
).limit(5).collect()

# Embeddings are returned as NumPy arrays
assert isinstance(results[0, 'embedding'], np.ndarray)

# You can also store embeddings in a computed column
docs.add_computed_column(
    embedding_copy=docs.content.embedding()
)
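Because the embeddings come back as NumPy arrays, they can be stacked and compared outside of a query. The sketch below computes a pairwise cosine similarity matrix. The `rows` list is a hypothetical stand-in for retrieved results (plain dicts with made-up 2-dimensional vectors), and `pairwise_cosine` is an illustrative helper, not part of Pixeltable.

```python
import numpy as np

# Hypothetical stand-in for retrieved rows: each has text and its embedding
rows = [
    {"content": "alpha", "embedding": np.array([1.0, 0.0])},
    {"content": "beta", "embedding": np.array([0.0, 2.0])},
    {"content": "gamma", "embedding": np.array([3.0, 3.0])},
]

def pairwise_cosine(rows):
    """Return an (n, n) matrix of cosine similarities between row embeddings."""
    # Stack the per-row vectors into one (n, dim) matrix
    mat = np.stack([r["embedding"] for r in rows])
    # Scale each row to unit length, then take all pairwise dot products
    unit = mat / np.linalg.norm(mat, axis=1, keepdims=True)
    return unit @ unit.T

sims = pairwise_cosine(rows)
print(np.round(sims, 3))
```

A matrix like this is one way to spot near-duplicate rows or to sanity-check that an embedding model separates your content as expected.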
The .similarity() method cannot be used directly in computed columns
Embedding indices cannot be dropped if there are computed columns that depend on them
When a column has multiple embedding indices, you must specify which index to use with the idx parameter