hyrax.datasets.nested_pandas_dataset

Classes

NestedPandasDataset

A minimal Hyrax wrapper around nested_pandas.read_parquet.

Module Contents

class NestedPandasDataset(config: dict, data_location: pathlib.Path | str | None = None)

Bases: hyrax.datasets.dataset_registry.HyraxDataset

A minimal Hyrax wrapper around nested_pandas.read_parquet.

__init__()

Overall initialization for all datasets, which saves the config.

Subclasses of HyraxDataset should call this at the end of their __init__, like so:

from hyrax.datasets import HyraxDataset

class MyDataset(HyraxDataset):
    def __init__(self, config):
        # <your code>
        super().__init__(config)

If per-tensor metadata is available, dataset authors are encouraged to create an Astropy Table of that metadata, with rows in the same order as their data, and pass it as metadata_table, as shown below:

from hyrax.datasets import HyraxDataset
from astropy.table import Table

class MyDataset(HyraxDataset):
    def __init__(self, config):
        # <your code>
        metadata_table = Table(<Your catalog data goes here>)
        super().__init__(config, metadata_table)

Parameters:

config (dict, optional) – The runtime configuration for Hyrax.
metadata_table (Optional[Table], optional) – An Astropy Table that (1) contains the metadata columns desired for visualization and (2) lists its rows in the order your data will be enumerated.
object_id_column_name (Optional[str], optional) – The name of the column containing object IDs. If None, uses the default from config or creates one from the ids() method.
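
The ordering requirement on metadata_table can be illustrated without Hyrax or Astropy installed. The sketch below is not Hyrax code: StandInDataset is a hypothetical stand-in for HyraxDataset, and a plain list of dicts stands in for an Astropy Table. Both are assumptions made only so the example runs on its own. It shows a subclass doing its own setup first, then calling the base initializer last with one metadata row per item, in the same order as the data.

```python
class StandInDataset:
    """Hypothetical stand-in for HyraxDataset; NOT the real Hyrax class."""

    def __init__(self, config, metadata_table=None):
        # Save the config and the optional per-item metadata.
        self.config = config
        self.metadata_table = metadata_table


class MyDataset(StandInDataset):
    def __init__(self, config):
        # Subclass-specific setup comes first.
        self.items = [("obj_a", [0.1, 0.2]), ("obj_b", [0.3, 0.4])]
        # One metadata row per item, in the same order as self.items.
        metadata_table = [{"object_id": name} for name, _ in self.items]
        # Call the base initializer last, as the docstring recommends.
        super().__init__(config, metadata_table)

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        # Return only the data part of the item at position idx.
        return self.items[idx][1]


dataset = MyDataset({"example_key": "example_value"})

# Row i of the metadata describes item i of the dataset.
aligned = all(
    dataset.metadata_table[i]["object_id"] == dataset.items[i][0]
    for i in range(len(dataset))
)
```

With a real Astropy Table the same rule applies: row i of the table should describe the item returned for index i.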

data_location = ''

read_kwargs

nested_frame

_load_nested_frame(read_kwargs: dict)

_all_available_fields() → list[str]

_register_getters() → None

__len__() → int
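
The members above are listed without descriptions, so their exact behavior is not documented here. As a purely illustrative sketch, and not the NestedPandasDataset implementation, the following toy class shows one way that methods with these names could fit together: it enumerates its fields, registers one getter per field, and reports its row count. The class name ToyColumnDataset, the get_<field> naming scheme, and the dict-of-lists storage are all assumptions introduced for illustration.

```python
class ToyColumnDataset:
    """Illustrative only; not the NestedPandasDataset implementation."""

    def __init__(self, columns):
        # `columns` maps a field name to a list of per-row values.
        self._columns = columns
        self._register_getters()

    def _all_available_fields(self):
        # Field names, in insertion order.
        return list(self._columns)

    def _register_getters(self):
        for field in self._all_available_fields():
            # Bind `field` as a default argument so each getter keeps
            # its own field name instead of the loop's last value.
            def getter(idx, _field=field):
                return self._columns[_field][idx]

            # Hypothetical naming scheme: get_<field>.
            setattr(self, f"get_{field}", getter)

    def __len__(self):
        # All columns are assumed to have the same number of rows.
        first = next(iter(self._columns.values()), [])
        return len(first)


toy = ToyColumnDataset({"ra": [10.0, 20.0], "dec": [-5.0, 5.0]})
second_ra = toy.get_ra(1)
```

Binding the field name through a default argument avoids the late-binding closure pitfall, where every getter would otherwise read the last field in the loop.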