Convert Hyrax Results to a Pandas DataFrame
After running inference with Hyrax, results are saved in the Lance format — a columnar format optimized for ML workloads.
This notebook shows two ways to load those results into a familiar Pandas DataFrame:
LanceDB — provides a SQL-like query interface for easy filtering and selection.
PyLance — provides direct, low-level access to the data on disk.
Note: Neither approach requires Hyrax to be installed. Only lancedb or lance is needed.
The example below uses results produced by the Getting Started notebook.
Setup
Set result_directory to the path of your Hyrax output directory. Here we use the location of the saved predictions from the Getting Started notebook.
[1]:
from pathlib import Path
result_directory = Path("./example_results/getting_started_results")
Option 1: LanceDB
Connect to the results directory using lancedb, open the "results" table, and convert it to a Pandas DataFrame.
[2]:
import lancedb
lance_dir = result_directory / "lance_db"
db = lancedb.connect(str(lance_dir))
table = db.open_table("results")
df = table.to_pandas()
df.head()
[2]:
| | object_id | data |
|---|---|---|
| 0 | 00000 | [0.096435286, -2.6353374, 1.7344711, 2.0339143... |
| 1 | 00001 | [4.793885, 6.458918, 0.20510733, -2.3948255, -... |
| 2 | 00002 | [2.7748845, 3.6781337, 0.251015, -0.95724285, ... |
| 3 | 00003 | [3.8944254, 1.8255252, 0.85703826, -1.0406122,... |
| 4 | 00004 | [-2.8371797, -2.5587287, 2.6390426, 2.211744, ... |
Option 2: Lance
The lance library gives you direct access to the dataset on disk. Note that the path points to the specific results.lance dataset inside lance_db/, rather than to the lance_db/ directory itself.
[3]:
import lance
lance_dir = result_directory / "lance_db" / "results.lance"
ds = lance.dataset(lance_dir)
table = ds.to_table()
df = table.to_pandas()
df.head()
[3]:
| | object_id | data |
|---|---|---|
| 0 | 00000 | [0.096435286, -2.6353374, 1.7344711, 2.0339143... |
| 1 | 00001 | [4.793885, 6.458918, 0.20510733, -2.3948255, -... |
| 2 | 00002 | [2.7748845, 3.6781337, 0.251015, -0.95724285, ... |
| 3 | 00003 | [3.8944254, 1.8255252, 0.85703826, -1.0406122,... |
| 4 | 00004 | [-2.8371797, -2.5587287, 2.6390426, 2.211744, ... |
Both approaches produce a standard Pandas DataFrame, making it easy to share results and explore them without any dependency on Hyrax — only lancedb or lance is required.
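Once the results are in a DataFrame, the data column holds one vector per row, which is awkward for most numerical tools. One possible way to explore it is to stack the vectors into a single 2-D NumPy array. The sketch below uses a small made-up DataFrame in place of the real one, assumes every row's vector has the same length, and the dim_ column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Stand-in for the DataFrame loaded above (hypothetical values).
df = pd.DataFrame(
    {
        "object_id": ["00000", "00001", "00002"],
        "data": [
            np.array([0.1, -2.6, 1.7], dtype=np.float32),
            np.array([4.8, 6.5, 0.2], dtype=np.float32),
            np.array([2.8, 3.7, 0.3], dtype=np.float32),
        ],
    }
)

# Stack the per-row vectors into one (n_rows, n_dims) array.
matrix = np.stack(df["data"].tolist())

# Optionally spread the vector across named columns, indexed by object_id.
wide = pd.DataFrame(
    matrix,
    index=df["object_id"],
    columns=[f"dim_{i}" for i in range(matrix.shape[1])],
)

print(matrix.shape)
print(wide.head())
```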