hyrax.data_sets.data_set_registry

Attributes

logger

DATA_SET_REGISTRY

Classes

HyraxDataset

How to make a hyrax dataset:

HyraxImageDataset

This is a mixin for Image datasets primarily concerned with providing utility functions to allow derived classes to set and apply transformations based on configs.

Functions

fetch_data_set_class(→ type[HyraxDataset])

Fetch the data loader class from the registry.

Module Contents

logger[source]
DATA_SET_REGISTRY: dict[str, type[HyraxDataset]][source]
class HyraxDataset(config: hyrax.config_utils.ConfigDict, metadata_table=None)[source]

How to make a hyrax dataset:

from hyrax.data_sets import HyraxDataset
from torch.utils.data import Dataset

class MyDataset(HyraxDataset, Dataset):
    def __init__(self, config: dict):
        super().__init__(config)

    def __getitem__(self, idx):
        # Your getitem goes here
        pass

    def __len__(self):
        # Your len function goes here
        pass

Optional interfaces:

ids() -> Subclasses may override this directly with their own ids function returning a generator of strings

metadata -> Subclasses may pass an astropy table of metadata to __init__ in the superclass. This table of metadata will be available through the metadata_fields and metadata functions. If desired, a subclass may override these functions directly rather than using the astropy Table interface.

Further documentation is in the Getting started with Hyrax Custom Dataset Classes example notebook.
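As a hedged illustration of the optional ids() interface, the sketch below overrides ids() to yield string identifiers. StandInHyraxDataset is a hypothetical stand-in defined here so the snippet runs without hyrax or torch installed; in real code you would derive from HyraxDataset and torch.utils.data.Dataset as described above. The FileListDataset name and its paths argument are illustrative assumptions as well.

```python
from collections.abc import Generator


class StandInHyraxDataset:
    """Hypothetical stand-in for HyraxDataset; it only stores the config."""

    def __init__(self, config, metadata_table=None):
        self._config = config
        self._metadata_table = metadata_table


class FileListDataset(StandInHyraxDataset):
    """Map-style dataset over a list of file paths (illustrative only)."""

    def __init__(self, config: dict, paths: list[str]):
        self.paths = list(paths)
        super().__init__(config)

    def __getitem__(self, idx: int) -> str:
        # A real dataset would load and return a tensor here.
        return self.paths[idx]

    def __len__(self) -> int:
        return len(self.paths)

    def ids(self) -> Generator[str, None, None]:
        # Use the file stem as the string ID of each item.
        for path in self.paths:
            yield path.rsplit("/", 1)[-1].split(".", 1)[0]


dataset = FileListDataset({}, ["data/obj_001.fits", "data/obj_002.fits"])
id_list = list(dataset.ids())
```

The IDs are produced in the same order as the items, so the i-th ID refers to the i-th item.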

__init__()[source]

Overall initialization for all DataSets which saves the config

Subclasses of HyraxDataset should call this at the end of their __init__, like:

from hyrax.data_sets import HyraxDataset
from torch.utils.data import Dataset

class MyDataset(HyraxDataset, Dataset):
    def __init__(self, config):
        # <your code>
        super().__init__(config)

If per-tensor metadata is available, it is recommended that dataset authors create an astropy Table of that data, in the same order as their data, and pass it as metadata_table as shown below:

from hyrax.data_sets import HyraxDataset
from torch.utils.data import Dataset
from astropy.table import Table

class MyDataset(HyraxDataset, Dataset):
    def __init__(self, config):
        # <your code>
        metadata_table = Table(<Your catalog data goes here>)
        super().__init__(config, metadata_table)

Parameters:
  • config (ConfigDict, Optional) – The runtime configuration for hyrax

  • metadata_table (Optional[Table], optional) – An Astropy Table that (1) contains the metadata columns desired for visualization and (2) has its rows in the order your data will be enumerated.

_config[source]
_metadata_table = None[source]
tensorboardx_logger = None[source]
is_iterable()[source]

Returns true if the underlying dataset is iterable style, supporting __iter__, as opposed to map style, where __getitem__/__len__ are the preferred access methods.

Returns:

True if underlying dataset is iterable

Return type:

bool

is_map()[source]

Returns true if the underlying dataset is map style, supporting __getitem__/__len__, as opposed to iterable style, where __iter__ is the preferred access method.

Returns:

True if underlying dataset is map-style

Return type:

bool
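The difference between the two styles can be sketched with plain duck typing. This is an assumption about how such a check might look, not hyrax's implementation; the helpers looks_map_style and looks_iterable_style and both toy classes are hypothetical.

```python
class MapStyle:
    """Toy map-style dataset: random access via __getitem__/__len__."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, idx):
        return self._items[idx]

    def __len__(self):
        return len(self._items)


class IterableStyle:
    """Toy iterable-style dataset: sequential access via __iter__ only."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)


def looks_map_style(dataset) -> bool:
    cls = type(dataset)
    return hasattr(cls, "__getitem__") and hasattr(cls, "__len__")


def looks_iterable_style(dataset) -> bool:
    # Treat a dataset as iterable style only when it is not map style.
    return hasattr(type(dataset), "__iter__") and not looks_map_style(dataset)


map_ds = MapStyle([1, 2, 3])
iter_ds = IterableStyle([1, 2, 3])
```

In this sketch the two checks are mutually exclusive, which mirrors the way is_map() and is_iterable() are described.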

property config[source]
classmethod __init_subclass__()[source]
ids() → collections.abc.Generator[str][source]

This is the default IDs function you get when you derive from HyraxDataset.

Returns:

A generator yielding all the string IDs of the dataset.

Return type:

Generator[str]

metadata_fields() → list[str][source]

Returns a list of metadata fields supported by this object

Returns:

The column names of the metadata table passed. Empty if no metadata was provided during construction of the HyraxDataset (or derived class).

Return type:

list[str]

metadata(idxs: numpy.typing.ArrayLike, fields: list[str]) → numpy.typing.ArrayLike[source]

Returns a table representing the metadata given an array of indexes and a list of fields.

Parameters:
  • idxs (npt.ArrayLike) – The indexes of the relevant tensor objects

  • fields (list[str]) – The names of the fields you would like returned. All values must be among those returned by metadata_fields()

Returns:

A numpy record array of your metadata, with only the columns specified. Roughly equivalent to: metadata_table[idxs][fields].as_array() where metadata_table is the astropy table that the HyraxDataset (or derived class) was constructed with.

Return type:

npt.ArrayLike

Raises:

RuntimeError – When none of the provided fields are
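The row-then-column selection described for metadata() can be sketched with a plain numpy structured array. This is only an analogy under stated assumptions: the catalog, its column names, and its values are made up, and a structured array stands in for the astropy Table so that only numpy is needed (which is why no as_array() call appears).

```python
import numpy as np

# Hypothetical catalog standing in for the metadata table, one row per item.
catalog = np.array(
    [(0, 10.5, -1.0), (1, 11.5, -2.0), (2, 12.5, -3.0)],
    dtype=[("object_id", "i8"), ("ra", "f8"), ("dec", "f8")],
)

idxs = [2, 0]           # rows to select, in the order requested
fields = ["ra", "dec"]  # columns to keep

# Select rows first, then restrict to the requested columns.
selected = catalog[idxs][fields]
```

The result keeps the rows in the order given by idxs and exposes only the requested fields.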

fetch_data_set_class(runtime_config: dict) → type[HyraxDataset][source]

Fetch the data loader class from the registry.

Parameters:

runtime_config (dict) – The runtime configuration dictionary.

Returns:

The data loader class.

Return type:

type

Raises:
  • ValueError – If a built-in data loader was requested, but not found in the registry.

  • ValueError – If no data loader was specified in the runtime configuration.
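A registry lookup of this kind can be sketched as follows. Everything here is a hypothetical stand-in rather than hyrax's code: REGISTRY plays the role of DATA_SET_REGISTRY, BaseDataset registers subclasses through __init_subclass__, and the "data_set"/"name" config keys are assumed purely for illustration.

```python
REGISTRY: dict[str, type] = {}  # hypothetical stand-in for DATA_SET_REGISTRY


class BaseDataset:
    """Hypothetical base class that records every subclass by name."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        REGISTRY[cls.__name__] = cls


class CatalogDataset(BaseDataset):
    pass


def fetch_class(runtime_config: dict) -> type:
    # The "data_set"/"name" keys are assumed for illustration only.
    name = runtime_config.get("data_set", {}).get("name")
    if not name:
        raise ValueError("No data set was specified in the runtime configuration.")
    if name not in REGISTRY:
        raise ValueError(f"Data set {name!r} was not found in the registry.")
    return REGISTRY[name]


fetched = fetch_class({"data_set": {"name": "CatalogDataset"}})
```

Registering in __init_subclass__ means that defining a subclass is enough to make it discoverable by name; no separate registration call is needed.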

class HyraxImageDataset[source]

This is a mixin for Image datasets primarily concerned with providing utility functions to allow derived classes to set and apply transformations based on configs.

The various set_*_transform functions each push an individual transformation onto a single shared stack.

The stack can be applied with apply_transform.
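The stacking behaviour can be sketched with plain callables. TransformStack and its push and apply methods are hypothetical names chosen for this illustration, and the order of application (earlier transforms run first) is an assumption rather than documented hyrax behaviour.

```python
from typing import Any, Callable, Optional

Transform = Callable[[Any], Any]


class TransformStack:
    """Hypothetical helper that composes transforms in the order they are added."""

    def __init__(self) -> None:
        self._transform: Optional[Transform] = None

    def push(self, new_transform: Transform) -> None:
        previous = self._transform
        if previous is None:
            self._transform = new_transform
        else:
            # Run the existing stack first, then the new transform.
            self._transform = lambda data: new_transform(previous(data))

    def apply(self, data: Any) -> Any:
        return data if self._transform is None else self._transform(data)


stack = TransformStack()
stack.push(lambda values: values[1:3])              # crop-like step
stack.push(lambda values: [v * 2 for v in values])  # function-like step
result = stack.apply([1, 2, 3, 4])
```

An empty stack leaves the data unchanged, so callers can always call apply without checking whether any transform was configured.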

set_function_transform()[source]
set_crop_transform(cutout_shape=None)[source]
apply_transform(data_torch)[source]
_update_transform(new_transform)[source]
_get_np_function(transform_str: str) → Callable[Ellipsis, Any][source]

Returns the numpy mathematical function that the supplied string maps to, or raises an error if the supplied string cannot be mapped to a function.

Parameters:

transform_str (str) – The string to be mapped to a numpy function
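One way to map a string to a numpy function is an explicit allow-list, sketched below. The ALLOWED_FUNCTIONS table, the names in it, and the ValueError raised for unknown names are all assumptions made for illustration; the names hyrax accepts and the error it raises may differ.

```python
from typing import Any, Callable

import numpy as np

# Hypothetical allow-list; the names hyrax actually accepts may differ.
ALLOWED_FUNCTIONS: dict[str, Callable[..., Any]] = {
    "tanh": np.tanh,
    "arcsinh": np.arcsinh,
    "sqrt": np.sqrt,
}


def get_np_function(transform_str: str) -> Callable[..., Any]:
    try:
        return ALLOWED_FUNCTIONS[transform_str]
    except KeyError:
        raise ValueError(f"Unsupported transform: {transform_str!r}") from None


func = get_np_function("sqrt")
value = float(func(4.0))
```

An explicit table keeps arbitrary config strings from resolving to unintended attributes, at the cost of having to list each supported function.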