hyrax.datasets.random.hyrax_random_dataset
==========================================

.. py:module:: hyrax.datasets.random.hyrax_random_dataset


Attributes
----------

.. autoapisummary::

   hyrax.datasets.random.hyrax_random_dataset.INVALID_VALUES


Classes
-------

.. autoapisummary::

   hyrax.datasets.random.hyrax_random_dataset.HyraxRandomDatasetBase
   hyrax.datasets.random.hyrax_random_dataset.HyraxRandomDataset


Module Contents
---------------

.. py:data:: INVALID_VALUES

   Mapping of string representation of invalid values to numpy representations.
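
   As an illustration only, a mapping of this kind could look like the sketch
   below. The names ``EXAMPLE_INVALID_VALUES`` and ``resolve_invalid_value`` are
   hypothetical and are not part of Hyrax; the keys are simply the strings
   accepted by the ``invalid_value_type`` configuration key, and the actual
   contents of ``INVALID_VALUES`` may differ.

   .. code-block:: python

      import numpy as np

      # Hypothetical stand-in; the real INVALID_VALUES contents may differ.
      EXAMPLE_INVALID_VALUES = {
          "nan": np.nan,
          "inf": np.inf,
          "-inf": -np.inf,
          "none": None,
      }


      def resolve_invalid_value(value):
          """Translate a configured value into the object to insert into the data."""
          if isinstance(value, str) and value in EXAMPLE_INVALID_VALUES:
              return EXAMPLE_INVALID_VALUES[value]
          # Anything else is assumed to be a plain float value.
          return float(value)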

.. py:class:: HyraxRandomDatasetBase(config, data_location)

   This is the base class for the random datasets provided by Hyrax.

   .. warning::

       Direct use of ``HyraxRandomDatasetBase`` is not advised. When working
       with Hyrax, prefer to use ``HyraxRandomDataset``.

   .. py:method:: __init__(config, data_location)

      Initialize the dataset using the parameters defined in the configuration.

      The ``data_location`` parameter is included for API consistency with other
      dataset classes but is not used by this implementation. All parameters are
      controlled by the following keys under the
      ``["data_set"]["HyraxRandomDataset"]`` table in the configuration:

      - ``size``: The number of random data samples to produce.
      - ``shape``: The shape of each random data sample as a tuple. For example,
        ``(3, 29, 29)`` means 3 layers of 2D data, each layer 29x29 elements.
      - ``seed``: The random seed to use for reproducibility.
      - ``provided_labels``: A list of possible labels. If provided, the dataset
        randomly selects one label from this list for each data sample.
      - ``metadata_fields``: A list of metadata field names. Used to create a
        metadata table with one column per field name. All data is numeric.
      - ``number_invalid_values``: The number of invalid values to insert into the data.
      - ``invalid_value_type``: The type of invalid value to insert into the data.
        Valid values are ``"nan"``, ``"inf"``, ``"-inf"``, ``"none"``, or a float value.
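
   The configuration keys listed above can be pictured as a nested mapping. The
   sketch below is illustrative only: the values are made up, the helper
   ``make_random_samples`` is hypothetical, and the use of
   ``numpy.random.default_rng`` is an assumption about how seeded generation
   might work rather than a description of the Hyrax implementation.

   .. code-block:: python

      import numpy as np

      # Illustrative values only; the key names come from the list above.
      config = {
          "data_set": {
              "HyraxRandomDataset": {
                  "size": 100,
                  "shape": (3, 29, 29),
                  "seed": 42,
                  "provided_labels": ["cat", "dog"],
                  "metadata_fields": ["ra", "dec"],
                  "number_invalid_values": 0,
                  "invalid_value_type": "nan",
              }
          }
      }


      def make_random_samples(config):
          """Hypothetical helper: draw ``size`` samples of ``shape`` from a seeded generator."""
          settings = config["data_set"]["HyraxRandomDataset"]
          rng = np.random.default_rng(settings["seed"])
          return rng.random((settings["size"], *settings["shape"]))


      # The same seed yields the same samples on every call.
      first = make_random_samples(config)
      second = make_random_samples(config)

   With a fixed ``seed``, repeated runs produce identical arrays, which is what
   makes a random dataset usable in repeatable tests.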


   .. py:attribute:: data
      :type:  numpy.ndarray

      The random data samples produced by the dataset.


   .. py:attribute:: id_list
      :type:  list

      A list of sequential numeric IDs for each data sample.


   .. py:attribute:: provided_labels
      :type:  list

      A list of labels randomly selected from the provided list of possible labels.


   .. py:attribute:: data_location


   .. py:method:: get_image(idx: int) -> numpy.ndarray

      Get the image at the given index as a NumPy array.


   .. py:method:: get_label(idx: int) -> str

      Get the label at the given index.


   .. py:method:: get_object_id(idx: int) -> str

      Get the object ID of the item at the given index.


.. py:class:: HyraxRandomDataset(config, data_location)

   Bases: :py:obj:`HyraxRandomDatasetBase`, :py:obj:`hyrax.datasets.dataset_registry.HyraxDataset`, :py:obj:`torch.utils.data.Dataset`


   This dataset is a stand-in for a map-style dataset.
   It produces random NumPy arrays along with sequential numeric IDs and,
   optionally, labels randomly selected from the provided list of possible labels.

   .. py:method:: __init__(config, data_location)

      Initialize the dataset using the parameters defined in the configuration.

      The ``data_location`` parameter is included for API consistency with other
      dataset classes but is not used by this implementation. All parameters are
      controlled by the following keys under the
      ``["data_set"]["HyraxRandomDataset"]`` table in the configuration:

      - ``size``: The number of random data samples to produce.
      - ``shape``: The shape of each random data sample as a tuple. For example,
        ``(3, 29, 29)`` means 3 layers of 2D data, each layer 29x29 elements.
      - ``seed``: The random seed to use for reproducibility.
      - ``provided_labels``: A list of possible labels. If provided, the dataset
        randomly selects one label from this list for each data sample.
      - ``metadata_fields``: A list of metadata field names. Used to create a
        metadata table with one column per field name. All data is numeric.
      - ``number_invalid_values``: The number of invalid values to insert into the data.
      - ``invalid_value_type``: The type of invalid value to insert into the data.
        Valid values are ``"nan"``, ``"inf"``, ``"-inf"``, ``"none"``, or a float value.
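
   To make the ``number_invalid_values``, ``invalid_value_type``, and
   ``provided_labels`` keys more concrete, the sketch below shows one way such
   behavior could be written with NumPy. The helper ``insert_invalid_values`` is
   hypothetical, and how Hyrax actually chooses which elements to overwrite is
   not stated here, so the random placement is an assumption.

   .. code-block:: python

      import numpy as np


      def insert_invalid_values(data, count, value, seed=None):
          """Hypothetical helper: overwrite ``count`` randomly chosen elements with ``value``."""
          rng = np.random.default_rng(seed)
          corrupted = data.astype(float, copy=True)
          # Choose distinct flat positions so exactly ``count`` elements change.
          positions = rng.choice(corrupted.size, size=count, replace=False)
          corrupted.flat[positions] = value
          return corrupted


      rng = np.random.default_rng(0)
      data = rng.random((4, 3, 5, 5))
      corrupted = insert_invalid_values(data, count=10, value=np.nan, seed=1)

      # One label per sample, drawn at random from the list of possibilities.
      labels = rng.choice(["cat", "dog"], size=len(data)).tolist()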


   .. py:method:: __getitem__(idx: int) -> dict

      Get a data sample by index.

      The returned dictionary will contain the following keys:

      - ``index``: The index of the data sample.
      - ``object_id``: The ID of the data sample.
      - ``image``: The data sample as a numpy array.
      - ``label``: The label of the data sample (if provided).


      :param idx: The index of the data sample to retrieve.
      :type idx: int

      :returns: A dictionary containing the data sample and its metadata.
      :rtype: dict


   .. py:method:: __len__()

      Get the total number of samples in this dataset. This should return
      the same value as the ``size`` parameter in the configuration.
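
      A map-style dataset is one that supports ``len()`` and integer indexing.
      The sketch below uses a hypothetical stand-in class, ``TinyRandomDataset``,
      to illustrate that contract without importing Hyrax or PyTorch. Its
      constructor arguments and the string form of ``object_id`` are assumptions
      made for the example, and the optional ``label`` key is omitted.

      .. code-block:: python

         import numpy as np


         class TinyRandomDataset:
             """Hypothetical stand-in that mimics the documented interface."""

             def __init__(self, size, shape, seed):
                 rng = np.random.default_rng(seed)
                 self.data = rng.random((size, *shape))
                 self.id_list = list(range(size))

             def __len__(self):
                 # Matches the ``size`` the dataset was built with.
                 return len(self.data)

             def __getitem__(self, idx):
                 return {
                     "index": idx,
                     "object_id": str(self.id_list[idx]),
                     "image": self.data[idx],
                 }


         dataset = TinyRandomDataset(size=8, shape=(3, 29, 29), seed=42)
         sample = dataset[0]

      Here ``len(dataset)`` equals the ``size`` argument, and each sample is a
      dictionary whose ``image`` entry has the configured ``shape``.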