Welcome to Pixeltable! In this tutorial, we’ll survey how to create
tables, populate them with data, and enhance them with built-in and
user-defined transformations and AI operations.
Install Python Packages
First run the following command to install Pixeltable and related
libraries needed for this tutorial.
%pip install -qU torch transformers openai pixeltable
Creating a Table
Let’s begin by creating a demo directory (if it doesn’t already exist)
and a table that can hold image data, demo.first. The table will
initially have just a single column to hold our input images, which
we’ll call input_image. We also need to specify a type for the column:
pxt.Image.
import pixeltable as pxt
# Create the directory `demo` (if it doesn't already exist)
pxt.drop_dir('demo', force=True) # First drop `demo` to ensure a clean environment
pxt.create_dir('demo')
# Create the table `demo.first` with a single column `input_image`
t = pxt.create_table('demo.first', {'input_image': pxt.Image})
Connected to Pixeltable database at: postgresql://postgres:@/pixeltable?host=/Users/asiegel/.pixeltable/pgdata
Created directory `demo`.
Created table `first`.
We can use t.describe() to examine the table schema. We see that it
now contains a single column, as expected.
The new table is initially empty. Calling t.count() confirms that it
has no rows:
0
Now let’s put an image into it! We can add images simply by giving
Pixeltable their URLs. The example images in this demo come from the
COCO dataset, and we’ll be referencing
copies of them in the Pixeltable GitHub repo. But in practice, the
images can come from anywhere: an S3 bucket, say, or the local file
system.
When we add the image, we see that Pixeltable gives us some useful
status updates indicating that the operation was successful.
t.insert([{'input_image': 'https://raw.githubusercontent.com/pixeltable/pixeltable/release/docs/resources/images/000000000025.jpg'}])
Inserting rows into `first`: 1 rows [00:00, 336.92 rows/s]
Inserted 1 row with 0 errors.
UpdateStatus(num_rows=1, num_computed_values=0, num_excs=0, updated_cols=[], cols_with_excs=[])
We can use t.show() to examine the contents of the table.
Adding Computed Columns
Great! Now we have a table containing some data. Let’s add an object
detection model to our workflow. Specifically, we’re going to use the
facebook/detr-resnet-50 object detection model, which runs using the
Hugging Face DETR (“DEtection TRansformer”) model class. Pixeltable
contains a built-in adapter for this model family, so all we have to do
is call the detr_for_object_detection Pixeltable function. A nice thing
about the Hugging Face models is that they run locally, so you don’t need an
account with a service provider in order to use them.
This is our first example of a computed column, a key concept in
Pixeltable. Recall that when we created the input_image column, we
specified a type, pxt.Image, indicating our intent to populate it with
data in the future. When we create a computed column, we instead
specify a function that operates on other columns of the table. By
default, when we add the new computed column, Pixeltable immediately
evaluates it against all existing data in the table - in this case, by
calling the detr_for_object_detection function on the image.
Depending on your setup, it may take a minute for the function to
execute. In the background, Pixeltable is downloading the model from
Hugging Face (if necessary), instantiating it, and caching it for later
use.
from pixeltable.functions import huggingface
t.add_computed_column(detections=huggingface.detr_for_object_detection(
    t.input_image, model_id='facebook/detr-resnet-50'
))
Computing cells: 100%|████████████████████████████████████████████| 1/1 [00:02<00:00, 2.03s/ cells]
Added 1 column value with 0 errors.
Let’s examine the results.
We see that the model returned a JSON structure containing a lot of
information. In particular, it has the following fields:
label_text: Descriptions of the objects detected
boxes: Bounding boxes for each detected object
scores: Confidence scores for each detection
labels: The DETR model’s internal IDs for the detected objects
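To make the structure concrete, here is a minimal plain-Python sketch of working with a result of this shape. The values in `sample` are invented for illustration and are not actual model output; the box layout and the `min_score` threshold are likewise assumptions. The sketch relies only on the fields being parallel lists with one entry per detected object.

```python
# Hypothetical detection result with the four fields described above.
# All values are invented for illustration; real model output will differ.
sample = {
    'label_text': ['giraffe', 'giraffe', 'bird'],
    'labels': [25, 25, 16],
    'scores': [0.99, 0.87, 0.52],
    'boxes': [
        [50.0, 40.0, 200.0, 350.0],
        [210.0, 60.0, 380.0, 340.0],
        [10.0, 10.0, 40.0, 30.0],
    ],
}


def confident_labels(detections: dict, min_score: float = 0.8) -> list[str]:
    """Return the labels whose confidence score is at least `min_score`."""
    return [
        label
        for label, score in zip(detections['label_text'], detections['scores'])
        if score >= min_score
    ]


print(confident_labels(sample))       # ['giraffe', 'giraffe']
print(confident_labels(sample, 0.5))  # ['giraffe', 'giraffe', 'bird']
```

Keeping the full structure in one column and deriving narrower views from it means the model only has to run once per image.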
Perhaps this is more than we need, and all we really want are the text
labels. We could add another computed column to extract label_text
from the JSON struct:
t.add_computed_column(detections_text=t.detections.label_text)
t.show()
Computing cells: 100%|███████████████████████████████████████████| 1/1 [00:00<00:00, 281.61 cells/s]
Added 1 column value with 0 errors.
If we inspect the table schema now, we see how Pixeltable distinguishes
between ordinary and computed columns.
Now let’s add some more images to our table. This demonstrates another
important feature of computed columns: by default, they update
incrementally any time new data shows up on their inputs. In this case,
Pixeltable will run the ResNet-50 model against each new image that is
added, then extract the labels into the detections_text column. Pixeltable
will orchestrate the execution of any sequence (or DAG) of computed
columns.
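The incremental behavior can be pictured with a toy model. The sketch below is plain Python and is not how Pixeltable is implemented: the `ToyTable` class and its methods are hypothetical, everything lives in memory, and columns are simply evaluated in the order they were added, which is enough for a chain in which each column depends only on earlier ones.

```python
from typing import Any, Callable, Iterable


class ToyTable:
    """A toy in-memory table with computed columns (illustration only)."""

    def __init__(self) -> None:
        self.rows: list[dict[str, Any]] = []
        # Computed columns in the order they were added. Each function
        # receives the row built so far and returns the new cell value.
        self.computed: dict[str, Callable[[dict[str, Any]], Any]] = {}

    def add_computed_column(self, name: str, fn: Callable[[dict[str, Any]], Any]) -> int:
        """Register a computed column and backfill it for all existing rows."""
        self.computed[name] = fn
        for row in self.rows:
            row[name] = fn(row)
        return len(self.rows)

    def insert(self, new_rows: Iterable[dict[str, Any]]) -> int:
        """Insert rows, evaluating every computed column for each new row."""
        num_computed = 0
        for values in new_rows:
            row = dict(values)
            for name, fn in self.computed.items():
                row[name] = fn(row)
                num_computed += 1
            self.rows.append(row)
        return num_computed


toy = ToyTable()
toy.insert([{'text': 'a giraffe'}])
toy.add_computed_column('words', lambda row: row['text'].split())
toy.add_computed_column('num_words', lambda row: len(row['words']))

num_computed = toy.insert([{'text': 'two zebras grazing'}, {'text': 'a bird'}])
print(num_computed)  # 4 (2 new rows x 2 computed columns)
print(toy.rows[-1])  # {'text': 'a bird', 'words': ['a', 'bird'], 'num_words': 2}
```

In this toy version, only the newly inserted rows are evaluated; rows that were already present are left untouched.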
Note how we can pass multiple rows to t.insert with a single
statement, which will insert them more efficiently.
more_images = [
    'https://raw.githubusercontent.com/pixeltable/pixeltable/release/docs/resources/images/000000000030.jpg',
    'https://raw.githubusercontent.com/pixeltable/pixeltable/release/docs/resources/images/000000000034.jpg',
    'https://raw.githubusercontent.com/pixeltable/pixeltable/release/docs/resources/images/000000000042.jpg',
    'https://raw.githubusercontent.com/pixeltable/pixeltable/release/docs/resources/images/000000000061.jpg'
]
t.insert({'input_image': image} for image in more_images)
Computing cells: 50%|██████████████████████ | 4/8 [00:01<00:01, 3.67 cells/s]
Inserting rows into `first`: 4 rows [00:00, 3478.59 rows/s]
Computing cells: 100%|████████████████████████████████████████████| 8/8 [00:01<00:00, 7.32 cells/s]
Inserted 4 rows with 0 errors.
UpdateStatus(num_rows=4, num_computed_values=8, num_excs=0, updated_cols=[], cols_with_excs=[])
Let’s see what the model came up with. We’ll use t.select to suppress
the display of the detections column, since right now we’re only
interested in the text labels.
t.select(t.input_image, t.detections_text).show()
Pixeltable Is Persistent
An important feature of Pixeltable is that everything is persistent.
Unlike in-memory Python libraries such as Pandas, Pixeltable is a
database: all your data, transformations, and computed columns are
stored and preserved between sessions. To see this, let’s clear all the
variables in our notebook and start fresh. You can optionally restart
your notebook kernel at this point, to demonstrate how Pixeltable data
persists across sessions.
# Clear all variables in the notebook
%reset -f
# Re-import Pixeltable and get a fresh handle to the existing table
import pixeltable as pxt
t = pxt.get_table('demo.first')
# Display just the first two rows, to avoid cluttering the tutorial
t.select(t.input_image, t.detections_text).show(2)
GPT-4o
For comparison, let’s try running our examples through a generative
model, OpenAI’s gpt-4o-mini. For this section, you’ll need an OpenAI
account with an API key. You can use the following command to add your
API key to the environment (just enter your API key when prompted):
import os
import getpass
if 'OPENAI_API_KEY' not in os.environ:
    os.environ['OPENAI_API_KEY'] = getpass.getpass('Enter your OpenAI API key:')
Enter your OpenAI API key: ········
Now we can connect to OpenAI through Pixeltable. This may take some
time, depending on how long OpenAI takes to process the query.
from pixeltable.functions import openai
t.add_computed_column(vision=openai.vision(
    prompt="Describe what's in this image.",
    image=t.input_image,
    model='gpt-4o-mini'
))
Computing cells: 100%|████████████████████████████████████████████| 5/5 [00:28<00:00, 5.64s/ cells]
Computing cells: 100%|███████████████████████████████████████████| 5/5 [00:00<00:00, 647.69 cells/s]
Added 5 column values with 0 errors.
Let’s see how gpt-4o-mini’s responses compare to the traditional
discriminative (DETR) model.
t.select(t.input_image, t.detections_text, t.vision).show()
In addition to adapters for local models and inference APIs, Pixeltable
can perform a range of more basic image operations. These image
operations can be seamlessly chained with API calls, and Pixeltable will
keep track of the sequence of operations, constructing new images and
caching when necessary to keep things running smoothly. Just for fun
(and to demonstrate the power of computed columns), let’s see what
OpenAI thinks of our sample images when we rotate them by 180 degrees.
t.add_computed_column(rot_image=t.input_image.rotate(180))
t.add_computed_column(rot_vision=openai.vision(
    prompt="Describe what's in this image.",
    image=t.rot_image,
    model='gpt-4o-mini'
))
Added 5 column values with 0 errors.
Computing cells: 100%|████████████████████████████████████████████| 5/5 [00:26<00:00, 5.24s/ cells]
Computing cells: 100%|███████████████████████████████████████████| 5/5 [00:00<00:00, 661.02 cells/s]
Added 5 column values with 0 errors.
t.select(t.rot_image, t.rot_vision).show()
UDFs: Enhancing Pixeltable’s Capabilities
Another important principle of Pixeltable is that, although Pixeltable
has a built-in library of useful operations and adapters, it will never
prescribe a particular way of doing things. Pixeltable is built from the
ground up to be extensible.
Let’s take a specific example. Recall our use of the ResNet-50 detection
model, in which the detections column contains a JSON blob with bounding
boxes, scores, and labels. Suppose we want to create a column containing
the single label with the highest confidence score. There’s no built-in
Pixeltable function to do this, but it’s easy to write our own. In fact,
all we have to do is write a Python function that does the thing we
want, and mark it with the @pxt.udf decorator.
@pxt.udf
def top_detection(detect: dict) -> str:
    scores = detect['scores']
    label_text = detect['label_text']
    # Get the index of the object with the highest confidence
    i = scores.index(max(scores))
    # Return the corresponding label
    return label_text[i]
t.add_computed_column(top=top_detection(t.detections))
Computing cells: 100%|███████████████████████████████████████████| 5/5 [00:00<00:00, 495.50 cells/s]
Computing cells: 100%|██████████████████████████████████████████| 5/5 [00:00<00:00, 1096.21 cells/s]
Added 5 column values with 0 errors.
t.select(t.detections_text, t.top).show()
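Because the body of a UDF is ordinary Python, its logic can be checked on hand-written inputs before it is attached to a table. The sketch below is an undecorated variant; the name `top_label` and the sample inputs are hypothetical. It also guards the case where an image yields no detections. Whether the model ever returns empty lists for these images is an assumption, and turning this variant into a UDF would need a return type that permits None, a detail to confirm against the Pixeltable UDF documentation.

```python
from typing import Optional


def top_label(detections: dict) -> Optional[str]:
    """Return the label with the highest confidence score, or None if there are no detections."""
    scores = detections['scores']
    if not scores:
        # max() raises ValueError on an empty list, so handle this case explicitly
        return None
    # Index of the detection with the highest confidence
    i = scores.index(max(scores))
    return detections['label_text'][i]


# Hypothetical inputs, invented for illustration
with_objects = {'scores': [0.52, 0.99, 0.87], 'label_text': ['bird', 'giraffe', 'giraffe']}
no_objects = {'scores': [], 'label_text': []}

print(top_label(with_objects))  # giraffe
print(top_label(no_objects))    # None
```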
Congratulations! You’ve reached the end of the tutorial. Hopefully, this
gives a good overview of the capabilities of Pixeltable, but there’s
much more to explore. As a next step, you might check out one of the
other tutorials, depending on your interests: