hyrax.data_sets.data_set_registry
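Attributes
- DATASET_REGISTRY
Classes
- HyraxDataset: How to make a hyrax dataset.
- HyraxImageDataset: This is a mixin for Image datasets primarily concerned with providing utility functions to allow derived classes to set and apply transformations based on configs.
Functions
- fetch_dataset_class(class_name): Fetch the dataset class from the registry.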
Module Contents
- DATASET_REGISTRY: dict[str, type[HyraxDataset]][source]
- class HyraxDataset(config: dict, metadata_table=None)[source]
How to make a hyrax dataset:
    from hyrax.data_sets import HyraxDataset
    from torch.utils.data import Dataset

    class MyDataset(HyraxDataset, Dataset):
        def __init__(self, config: dict):
            super().__init__(config)

        def __getitem__(self, idx):
            # Your getitem goes here
            pass

        def __len__(self):
            # Your len function goes here
            pass
Optional interfaces:
- ids() -> Subclasses may override this directly with their own ids function returning a generator of strings.
- metadata -> Subclasses may pass an astropy table of metadata to __init__ in the superclass. This table of metadata will be available through the metadata_fields and metadata functions. If desired, a subclass may override these functions directly rather than using the astropy Table interface.

Further documentation is in the Getting started with Hyrax Custom Dataset Classes example notebook.
Overall initialization for all datasets, which saves the config.
Subclasses of HyraxDataset should call this at the end of their __init__, like:
    from hyrax.data_sets import HyraxDataset
    from torch.utils.data import Dataset

    class MyDataset(HyraxDataset, Dataset):
        def __init__(self, config):
            # <your code>
            super().__init__(config)
If per-tensor metadata is available, dataset authors are encouraged to create an astropy Table of that data, in the same order as their data, and pass it as metadata_table as shown below:
    from hyrax.data_sets import HyraxDataset
    from torch.utils.data import Dataset
    from astropy.table import Table

    class MyDataset(HyraxDataset, Dataset):
        def __init__(self, config):
            # <your code>
            metadata_table = Table(...)  # Your catalog data goes here
            super().__init__(config, metadata_table)
- Parameters:
config (dict, Optional) – The runtime configuration for hyrax
metadata_table (Optional[Table], optional) – An Astropy Table that (1) contains the metadata columns desired for visualization and (2) has its rows in the order in which your data will be enumerated.
- classmethod is_iterable()[source]
Returns True if the underlying dataset is iterable-style, where __iter__ is the preferred access method, as opposed to map-style, where __getitem__/__len__ are the preferred access methods.
- Returns:
True if underlying dataset is iterable
- Return type:
bool
- classmethod is_map()[source]
Returns True if the underlying dataset is map-style, where __getitem__/__len__ are the preferred access methods, as opposed to iterable-style, where __iter__ is the preferred access method.
- Returns:
True if underlying dataset is map-style
- Return type:
bool
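The distinction between the two styles can be illustrated with a small duck-typing check. This is only a sketch: the helper names `is_map_style` and `is_iterable_style` and the two toy classes are hypothetical, and the check shown here is an assumption for illustration, not necessarily how is_map() and is_iterable() are implemented in hyrax.

```python
def _defines(cls: type, name: str) -> bool:
    """Return True if cls or one of its base classes defines the attribute `name`."""
    return any(name in vars(base) for base in cls.__mro__)


def is_map_style(cls: type) -> bool:
    # Map-style: random access through __getitem__ plus a known length.
    return _defines(cls, "__getitem__") and _defines(cls, "__len__")


def is_iterable_style(cls: type) -> bool:
    # Iterable-style: only sequential access through __iter__.
    return _defines(cls, "__iter__") and not is_map_style(cls)


class ToyMapDataset:
    """Hypothetical map-style dataset."""

    def __len__(self):
        return 3

    def __getitem__(self, idx):
        return {"value": idx}


class ToyIterableDataset:
    """Hypothetical iterable-style dataset."""

    def __iter__(self):
        return iter([{"value": 0}, {"value": 1}])
```

Under these assumptions, `ToyMapDataset` is reported as map-style only and `ToyIterableDataset` as iterable-style only.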
- ids() collections.abc.Generator[str][source]
This is the default ids() function you get when you derive from HyraxDataset.
- Returns:
A generator yielding all the string IDs of the dataset.
- Return type:
Generator[str]
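As a rough illustration of overriding ids(), the sketch below yields catalog identifiers as strings. `CatalogDataset` and its `object_ids` argument are hypothetical, and the class deliberately does not inherit from HyraxDataset so that the example runs on its own; a real subclass would also call the superclass __init__ as described above.

```python
from collections.abc import Iterator


class CatalogDataset:
    """Hypothetical stand-in for a HyraxDataset subclass with custom IDs."""

    def __init__(self, object_ids):
        self._object_ids = list(object_ids)

    def __len__(self):
        return len(self._object_ids)

    def __getitem__(self, idx):
        return {"object_id": self._object_ids[idx]}

    def ids(self) -> Iterator[str]:
        # Yield one string ID per item, in the same order as __getitem__.
        for object_id in self._object_ids:
            yield str(object_id)
```

Keeping the IDs in the same order as the data means the n-th ID always describes the n-th item.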
- sample_data() dict[source]
Get a sample from the dataset. This is a convenience function that returns the first sample from the dataset, regardless of whether it is iterable or map-style. Often this will be used to instantiate a model that adjusts its form based on the shape of the data.
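The idea of fetching a first sample regardless of dataset style might look like the sketch below. The function `first_sample` and the toy classes are hypothetical, and the logic is an assumption about the general approach rather than the hyrax implementation.

```python
def first_sample(dataset):
    """Return the first sample of a map-style or iterable-style dataset."""
    if hasattr(dataset, "__getitem__") and hasattr(dataset, "__len__"):
        # Map-style: index directly.
        return dataset[0]
    # Iterable-style: take the first element the iterator produces.
    return next(iter(dataset))


class ToyMap:
    def __len__(self):
        return 2

    def __getitem__(self, idx):
        return {"image": [[idx, idx], [idx, idx]]}


class ToyIterable:
    def __iter__(self):
        yield {"image": [[7, 7], [7, 7]]}


# A model could size its input layer from the sample's shape.
sample = first_sample(ToyMap())
input_rows = len(sample["image"])
input_cols = len(sample["image"][0])
```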
- metadata_fields() list[str][source]
Returns a list of metadata fields supported by this object
- Returns:
The column names of the metadata table passed. Empty if no metadata was provided during construction of the HyraxDataset (or derived class).
- Return type:
list[str]
- metadata(idxs: numpy.typing.ArrayLike, fields: list[str]) numpy.typing.ArrayLike[source]
Returns a table representing the metadata given an array of indexes and a list of fields.
- Parameters:
idxs (npt.ArrayLike) – The indexes of the relevant tensor objects
fields (list[str]) – The names of the fields you would like returned. All values must be among those returned by metadata_fields()
- Returns:
A numpy record array of your metadata, with only the columns specified. Roughly equivalent to: metadata_table[idxs][fields].as_array() where metadata_table is the astropy table that the HyraxDataset (or derived class) was constructed with.
- Return type:
npt.ArrayLike
- Raises:
RuntimeError – When none of the provided fields are
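The row-then-column selection described for metadata() can be mimicked in pure Python. This is only an analogue for illustration: the real function works on an astropy Table and returns a numpy record array, while the hypothetical `select_metadata` below works on a plain dict of columns and returns a list of tuples. The column names are invented, and the error handling of the real function is not reproduced.

```python
def select_metadata(columns, idxs, fields):
    """Pick the rows in `idxs`, keeping only the columns named in `fields`.

    `columns` maps a column name to a list of values, one value per row.
    """
    return [tuple(columns[field][i] for field in fields) for i in idxs]


# Hypothetical catalog with three rows and three columns.
catalog = {
    "ra": [10.0, 11.5, 12.25],
    "dec": [-5.0, -4.5, -4.0],
    "mag": [21.3, 19.8, 20.1],
}

rows = select_metadata(catalog, idxs=[0, 2], fields=["ra", "mag"])
```

Here `rows` holds the "ra" and "mag" values for rows 0 and 2 only, in that order.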
- fetch_dataset_class(class_name: str) type[HyraxDataset][source]
Fetch the dataset class from the registry.
- Parameters:
class_name (str) – The name of the dataset class to fetch. Either the class name of a built-in dataset or the fully qualified name of a user-defined dataset, e.g. “HyraxRandomDataset” or “my_module.my_submodule.MyDatasetClass”.
- Returns:
The dataset class.
- Return type:
type[HyraxDataset]
- Raises:
ValueError – If a built-in dataset was requested, but not found in the registry.
ValueError – If no dataset was specified in the runtime configuration.
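A lookup that accepts either a registered name or a fully qualified name could be sketched as follows. The names `fetch_class`, `TOY_REGISTRY`, and `ToyBuiltInDataset` are hypothetical, and treating any dotted name as a user-defined class to import is an assumption made for this sketch, not a statement about how fetch_dataset_class is written.

```python
import importlib


class ToyBuiltInDataset:
    """Hypothetical built-in dataset class."""


# Hypothetical registry mapping class names to classes.
TOY_REGISTRY = {"ToyBuiltInDataset": ToyBuiltInDataset}


def fetch_class(class_name, registry=TOY_REGISTRY):
    if not class_name:
        raise ValueError("No dataset class name was specified.")

    if "." in class_name:
        # Fully qualified name: import the module, then read the class from it.
        module_name, _, attribute = class_name.rpartition(".")
        module = importlib.import_module(module_name)
        return getattr(module, attribute)

    if class_name not in registry:
        raise ValueError(f"Dataset class {class_name!r} is not in the registry.")
    return registry[class_name]
```

For example, `fetch_class("ToyBuiltInDataset")` returns the registered class, while `fetch_class("collections.OrderedDict")` imports `collections` and returns `OrderedDict`.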
- class HyraxImageDataset[source]
This is a mixin for Image datasets primarily concerned with providing utility functions to allow derived classes to set and apply transformations based on configs.
The various set_*_transform functions push individual transformations onto a single stack. The stack can be applied with apply_transform.
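The stacking idea can be sketched with a small standalone class. `TransformStack`, `push`, and `apply` are hypothetical names, and applying transformations in the order they were added is an assumption of this sketch; the mixin's own set_*_transform and apply_transform functions may behave differently.

```python
class TransformStack:
    """Hypothetical container that applies transformations in insertion order."""

    def __init__(self):
        self._transforms = []

    def push(self, transform):
        # Each transform is a callable taking the data and returning new data.
        self._transforms.append(transform)

    def apply(self, data):
        for transform in self._transforms:
            data = transform(data)
        return data


stack = TransformStack()
stack.push(lambda pixels: [2 * p for p in pixels])  # scale every value
stack.push(lambda pixels: pixels[1:-1])             # crop one value from each end

result = stack.apply([1, 2, 3, 4])
```

With these two transformations, the input is first scaled and then cropped.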
- _get_np_function(transform_str: str) Callable[Ellipsis, Any][source]
Returns the numpy mathematical function that the supplied string maps to, or raises an error if the supplied string cannot be mapped to a function.
- Parameters:
transform_str (str) – The string to be mapped to a numpy function