hyrax.datasets.dataset_registry

Attributes

Classes

HyraxDataset

How to make a hyrax dataset:

HyraxImageDataset

This is a mixin for Image datasets primarily concerned with providing utility functions to allow derived classes to set and apply transformations based on configs.

Functions

fetch_dataset_class(→ type[HyraxDataset])

Fetch the dataset class from the registry.

Module Contents

logger
DATASET_REGISTRY: dict[str, type[HyraxDataset]]
class HyraxDataset(config: dict, metadata_table=None, object_id_column_name=None)

How to make a hyrax dataset:

from hyrax.datasets import HyraxDataset

class MyDataset(HyraxDataset):
    def __init__(self, config: dict):
        super().__init__(config)

    def __len__(self):
        # Your len function goes here
        pass

Optional interfaces:

metadata -> Subclasses may pass an astropy table of metadata to __init__ in the superclass. This table of metadata will be available through the metadata_fields and metadata functions. If desired, a subclass may override these functions directly rather than using the astropy Table interface.

Further documentation is in the "Build a dataset class in a notebook" example notebook.

__init__()

Overall initialization for all Datasets, which saves the config.

Subclasses of HyraxDataset should call this at the end of their __init__, like so:

from hyrax.datasets import HyraxDataset

class MyDataset(HyraxDataset):
    def __init__(self, config):
        # Your setup code goes here
        super().__init__(config)

If per-tensor metadata is available, dataset authors are encouraged to create an astropy Table of that data, in the same order as their data, and pass it as metadata_table as shown below:

from hyrax.datasets import HyraxDataset
from astropy.table import Table

class MyDataset(HyraxDataset):
    def __init__(self, config):
        # Your setup code goes here
        metadata_table = Table(<Your catalog data goes here>)
        super().__init__(config, metadata_table)

Parameters:
  • config (dict, Optional) – The runtime configuration for hyrax

  • metadata_table (Optional[Table], optional) – An Astropy Table that (1) contains the metadata columns desired for visualization and (2) has its rows in the same order in which your data will be enumerated.

  • object_id_column_name (Optional[str], optional) – The name of the column containing object IDs. If None, uses the default from config or creates one from the ids() method.

_config
_metadata_table = None
property config
classmethod __init_subclass__()
metadata_fields() → list[str]

Returns a list of metadata fields supported by this object

Returns:

The column names of the metadata table passed. Empty string if no metadata was provided during construction of the HyraxDataset (or derived class).

Return type:

list[str]

metadata(idxs: numpy.typing.ArrayLike, fields: list[str]) → numpy.typing.ArrayLike

Returns a table representing the metadata given an array of indexes and a list of fields.

Parameters:
  • idxs (npt.ArrayLike) – The indexes of the relevant tensor objects

  • fields (list[str]) – The names of the fields you would like returned. All values must be among those returned by metadata_fields()

Returns:

A numpy record array of your metadata, with only the columns specified. Roughly equivalent to: metadata_table[idxs][fields].as_array() where metadata_table is the astropy table that the HyraxDataset (or derived class) was constructed with.

Return type:

npt.ArrayLike

Raises:

RuntimeError – When none of the provided fields are
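
The row-then-column selection described above can be illustrated with a small sketch. It uses a plain NumPy structured array as a stand-in for the astropy Table, so the catalog contents, the column names, and the helper `select_metadata` are all hypothetical and are not part of hyrax.

```python
import numpy as np

# Hypothetical catalog, stored in the same order as the data it describes.
catalog = np.array(
    [(10, 150.1, 2.2), (11, 150.3, 2.4), (12, 150.7, 2.1)],
    dtype=[("object_id", "i8"), ("ra", "f8"), ("dec", "f8")],
)


def select_metadata(table, idxs, fields):
    """Pick rows by index first, then keep only the requested columns."""
    return table[np.asarray(idxs)][fields]


# Rows come back in the order of idxs, with only the requested fields.
subset = select_metadata(catalog, [2, 0], ["object_id", "ra"])
```

With an astropy Table in place of the structured array, the same two-step indexing followed by `as_array()` corresponds to the expression quoted in the Returns description.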

fetch_dataset_class(class_name: str) → type[HyraxDataset]

Fetch the dataset class from the registry.

Parameters:

class_name (str) – The name of the dataset class to fetch. Either the class name of a built in dataset, or the fully qualified name of a user-defined dataset. e.g. “my_module.my_submodule.MyDatasetClass” or “HyraxRandomDataset”.

Returns:

The dataset class.

Return type:

type[HyraxDataset]

Raises:
  • ValueError – If a built in dataset was requested, but not found in the registry.

  • ValueError – If no dataset was specified in the runtime configuration.
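
To make the lookup rules concrete, here is a minimal sketch of a name-based class registry. It is an assumption about how such a registry could work, not hyrax's implementation: `REGISTRY`, `BaseDataset`, `RandomDataset`, and `fetch_class` are hypothetical names, and the rule that a dotted name is treated as a fully qualified import path is also an assumption.

```python
import importlib

# Hypothetical stand-ins for illustration; not hyrax's actual code.
REGISTRY: dict[str, type] = {}


class BaseDataset:
    """Base class whose subclasses register themselves by class name."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        REGISTRY[cls.__name__] = cls


class RandomDataset(BaseDataset):
    """A 'built-in' dataset, registered automatically when defined."""


def fetch_class(class_name: str) -> type:
    """Resolve a bare registered name or a fully qualified dotted path."""
    if not class_name:
        raise ValueError("No dataset class name was specified.")
    if "." in class_name:
        # Assumed rule: a dotted name is imported as module path + class name.
        module_name, _, attr = class_name.rpartition(".")
        module = importlib.import_module(module_name)
        return getattr(module, attr)
    if class_name not in REGISTRY:
        raise ValueError(f"Dataset class {class_name!r} is not in the registry.")
    return REGISTRY[class_name]
```

In this sketch, `fetch_class("RandomDataset")` returns the registered class, while a dotted name such as `"my_module.my_submodule.MyDatasetClass"` would be imported on demand.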

class HyraxImageDataset

This is a mixin for Image datasets primarily concerned with providing utility functions to allow derived classes to set and apply transformations based on configs.

The various set_*_transform functions push individual transformations onto a single stack.

The stack can be applied with apply_transform.
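
The stacking behaviour can be pictured as function composition. The sketch below is a dependency-free illustration under that assumption; `TransformStack`, `add`, and `apply` are hypothetical names, and the mixin itself may compose its transforms differently.

```python
from typing import Callable, Optional


class TransformStack:
    """Hypothetical stand-in: composes transforms in the order they are added."""

    def __init__(self) -> None:
        self.transform: Optional[Callable] = None

    def add(self, new_transform: Callable) -> None:
        """Append a transform so it runs after everything already stacked."""
        if self.transform is None:
            self.transform = new_transform
        else:
            previous = self.transform
            self.transform = lambda data: new_transform(previous(data))

    def apply(self, data):
        """Run the data through the whole stack; a no-op if the stack is empty."""
        return data if self.transform is None else self.transform(data)


stack = TransformStack()
stack.add(lambda xs: xs[1:-1])             # crop-like step: drop the edges
stack.add(lambda xs: [x * 2 for x in xs])  # element-wise step: scale values
result = stack.apply([1, 2, 3, 4])
```

Here the crop-like step runs first and the scaling step second, so the order in which transforms are added determines the order in which they are applied.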

set_function_transform()
set_crop_transform(cutout_shape=None)
apply_transform(data_torch)
_update_transform(new_transform)
_get_np_function(transform_str: str) → collections.abc.Callable[..., Any]

Returns the numpy mathematical function that the supplied string maps to, or raises an error if the supplied string cannot be mapped to a function.

Parameters:

transform_str (str) – The string to be mapped to a numpy function