hyrax.verbs.reduction_algorithms.algorithm_registry#

Attributes#

Classes#

ReductionAlgorithm

Abstract base class for all reduction algorithms.

Functions#

is_reducer_class(→ bool)

Return True if the reducer algorithm has a class-based implementation.

fetch_reducer_class(→ type[ReductionAlgorithm])

Fetch the class implementing the reducer algorithm specified.

Module Contents#

logger[source]#
ALGORITHM_REGISTRY: dict[str, type[ReductionAlgorithm]][source]#
class ReductionAlgorithm(config: dict, reduction_results: ResultDatasetWriter | None = None)[source]#

Abstract base class for all reduction algorithms.

_config[source]#
_reduction_results = None[source]#
reducer = None[source]#
property config[source]#

Return the configuration dictionary for this reduction algorithm.

property reduction_results[source]#

Return the result dataset writer for this reduction algorithm.

classmethod __init_subclass__()[source]#
fit(data_sample: numpy.ndarray)[source]#

Fit the reduction algorithm to the data. Set the internal state of the reducer based on the provided data sample.

Parameters:

data_sample (numpy.ndarray) – The data sample used to fit the model.

abstractmethod transform(args: dict, num_batches: int)[source]#

Transform the data with a fitted reducer.

Parameters:
  • args (dict) – A dictionary containing the data to be transformed.

  • num_batches (int) – The total number of batches that the data is split into for transformation.

save_model(model_path: pathlib.Path | str | None = None)[source]#

Save the reducer model to a pickle file.

Parameters:

model_path (Path or str, optional) – The path to save the model to.

load_model(expected_input_dim: int, model_path: pathlib.Path | str | None = None)[source]#

Load the reducer model from a file.

Parameters:
  • expected_input_dim (int) – The expected number of input features for the loaded model.

  • model_path (Path or str, optional) – The path to the file to load the model from.

Returns:

The reduction algorithm instance with the loaded model.

Return type:

ReductionAlgorithm
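The save/load round trip described above can be sketched with the standard library alone. This is a minimal illustration, not the hyrax implementation: the free functions `save_model` and `load_model`, the plain dict used as a stand-in model, its `n_features` key, and the `ValueError` raised on a dimension mismatch are all assumptions chosen to show one way an `expected_input_dim` check might behave.

```python
import pickle
import tempfile
from pathlib import Path


def save_model(model, model_path):
    """Write the (stand-in) model to a pickle file."""
    with open(model_path, "wb") as f:
        pickle.dump(model, f)


def load_model(expected_input_dim, model_path):
    """Read the model back and check its input dimension.

    Assumption: the stand-in model records its input width under the
    hypothetical key "n_features"; a real reducer may expose this differently.
    """
    with open(model_path, "rb") as f:
        model = pickle.load(f)
    if model["n_features"] != expected_input_dim:
        raise ValueError(
            f"Model expects {model['n_features']} features, "
            f"but {expected_input_dim} were requested."
        )
    return model


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "reducer.pkl"
    save_model({"n_features": 8, "weights": [0.5] * 8}, path)

    # Matching dimension: the model loads normally.
    loaded = load_model(expected_input_dim=8, model_path=path)

    # Mismatched dimension: the load is rejected.
    try:
        load_model(expected_input_dim=16, model_path=path)
        mismatch_detected = False
    except ValueError:
        mismatch_detected = True
```

Checking the dimension at load time surfaces a mismatch between the saved model and the data before any transform is attempted.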

_load_pickle(model_path: pathlib.Path | str)[source]#

Helper function to wrap loading a pickle file from a given path for easier testing.

Parameters:

model_path (str or Path) – The file path to the pickle file.

Returns:

The object loaded from the pickle file.

Return type:

object
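Isolating the file read in a small helper makes it possible to test loading logic without a real pickle file on disk. The sketch below shows the idea with `unittest.mock.patch.object`; the `Loader` class and its methods are hypothetical stand-ins, not the hyrax classes.

```python
import pickle
from unittest import mock


class Loader:
    """Hypothetical stand-in for a class that owns a _load_pickle helper."""

    def _load_pickle(self, model_path):
        # The only place that touches the file system.
        with open(model_path, "rb") as f:
            return pickle.load(f)

    def load_model(self, model_path):
        # All other logic goes through the helper, so tests can replace it.
        self.reducer = self._load_pickle(model_path)
        return self


fake_model = {"kind": "fake-reducer"}

# Replace only the file-reading helper; the path does not need to exist.
with mock.patch.object(Loader, "_load_pickle", return_value=fake_model) as patched:
    loader = Loader().load_model("does/not/exist.pkl")
```

After the `with` block, `patched` still records how the helper was called, so a test can assert on the path that was requested.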

_transform_batch(batch_tuple: tuple)[source]#

Private helper to transform a single batch with the fitted reducer.

Parameters:

batch_tuple (tuple) – A two-element tuple. The first element is the IDs of the batch as a numpy array. The second element is the inference results to transform, as a numpy array with shape (batch_len, N), where N is the total number of dimensions in the inference result. The caller flattens all inference result axes before calling this helper.

Returns:

A two-element tuple. The first element is the IDs of the batch as a numpy array. The second element is the result of running the transform on the input, as a numpy array.

Return type:

tuple
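The input and output layout of a batch can be illustrated with a small numpy sketch. The function `transform_batch` and the fixed linear `projection` are hypothetical stand-ins for a fitted reducer. The reshape is included only to show how multi-axis inference results map to the (batch_len, N) layout; in the documented helper the caller has already done this flattening.

```python
import numpy as np


def transform_batch(batch_tuple, projection):
    """Hypothetical stand-in: reduce one batch with a fixed linear projection."""
    ids, inference_results = batch_tuple
    # Collapse every axis after the first into one feature axis: (batch_len, N).
    flat = inference_results.reshape(len(ids), -1)
    # Return the IDs unchanged alongside the reduced results.
    return ids, flat @ projection


ids = np.array(["a", "b", "c"])
# Three items, each with a 2x4 inference result, so N = 8 after flattening.
results = np.arange(3 * 2 * 4, dtype=float).reshape(3, 2, 4)

# A toy 8 -> 2 projection that keeps features 0 and 7.
projection = np.zeros((8, 2))
projection[0, 0] = 1.0
projection[7, 1] = 1.0

out_ids, reduced = transform_batch((ids, results), projection)
```

Keeping the IDs paired with the reduced rows lets the caller write results back in the right order regardless of how batches are scheduled.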

static _log_memory_usage(message: str = '')[source]#

Log the current resident set size (RSS) memory usage of the current process in gigabytes.

Parameters:

message (str, optional) – A descriptive message to include in the log output for context.

Notes

This method is intended for debugging and performance monitoring.

is_reducer_class(cli_name: str) → bool[source]#

Return True if the reducer algorithm has a class-based implementation.

Parameters:

cli_name (str) – The name of the reducer algorithm as given on the command line interface.

Returns:

True if the reducer algorithm has a class-based implementation, False otherwise.

Return type:

bool

fetch_reducer_class(cli_name: str) → type[ReductionAlgorithm][source]#

Fetch the class implementing the reducer algorithm specified. The class must be a subclass of ReductionAlgorithm and must be registered in the ALGORITHM_REGISTRY.

Parameters:

cli_name (str) – The name of the reducer algorithm as given on the command line interface.

Returns:

The class implementing the reducer algorithm.

Return type:

type[ReductionAlgorithm]
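The relationship between a name-to-class registry and the two lookup functions can be illustrated with a self-contained sketch. This is not the hyrax source: the stand-in names `REGISTRY`, `BaseReducer`, `is_registered`, and `fetch_class`, the `cli_name` class attribute, registration through `__init_subclass__`, and the `ValueError` for unknown names are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod

# Hypothetical stand-in for the module-level registry: CLI name -> class.
REGISTRY: dict[str, type["BaseReducer"]] = {}


class BaseReducer(ABC):
    """Stand-in for an abstract reduction algorithm (illustration only)."""

    # Assumption: subclasses declare the name used on the command line.
    cli_name: str = ""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Assumption: every named subclass registers itself when defined.
        if cls.cli_name:
            REGISTRY[cls.cli_name] = cls

    def __init__(self, config: dict):
        self._config = config

    @abstractmethod
    def transform(self, args: dict, num_batches: int):
        """Transform the data with a fitted reducer."""


class IdentityReducer(BaseReducer):
    """Toy reducer that returns its input unchanged."""

    cli_name = "identity"

    def transform(self, args, num_batches):
        return args


def is_registered(cli_name: str) -> bool:
    """Return True if a class is registered under this CLI name."""
    return cli_name in REGISTRY


def fetch_class(cli_name: str) -> type[BaseReducer]:
    """Look up the class registered under this CLI name."""
    if cli_name not in REGISTRY:
        raise ValueError(f"Unknown reducer: {cli_name!r}")
    return REGISTRY[cli_name]


reducer_cls = fetch_class("identity")
reducer = reducer_cls(config={})
```

Registering at class-definition time means that defining a new subclass is enough to make it discoverable by name, with no separate registration step.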