hyrax.verbs.umap

Attributes

Classes

Umap

Project latent space points into 2D with UMAP

Module Contents

logger
class Umap(config)

Bases: hyrax.verbs.verb_registry.Verb

Project latent space points into 2D with UMAP

__init__()

Common initialization for all verbs; saves the config

cli_name = 'umap'
add_parser_kwargs
description = 'Transforms the entire dataset into a lower-dimensional space by fitting a UMAP model.'
static setup_parser(parser: argparse.ArgumentParser)

Stub of parser setup

run_cli(args: argparse.Namespace | None = None)

Stub CLI implementation
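The class attributes and the static setup_parser hook suggest a subcommand-per-verb pattern. The following is a minimal sketch of that pattern using only argparse. ExampleVerb, its --input-dir option, and the way add_parser_kwargs is forwarded to add_parser are all assumptions made for illustration; they are not taken from the Umap verb itself.

```python
import argparse


class ExampleVerb:
    """Hypothetical verb showing the class-attribute plus static-parser pattern."""

    cli_name = "example"
    # Assumption: a dict of extra keyword arguments for add_parser().
    add_parser_kwargs = {"description": "Hypothetical example verb."}

    @staticmethod
    def setup_parser(parser: argparse.ArgumentParser):
        # Hypothetical option; the real verb's arguments are not documented here.
        parser.add_argument("--input-dir", default=None)

    def run_cli(self, args=None):
        # A real verb would do its work here; this one only reports its input.
        return f"would run on {args.input_dir}"


# Register the verb as a subcommand of a top-level parser.
root = argparse.ArgumentParser(prog="tool")
subparsers = root.add_subparsers(dest="verb")
sub = subparsers.add_parser(ExampleVerb.cli_name, **ExampleVerb.add_parser_kwargs)
ExampleVerb.setup_parser(sub)

# Parse a sample command line and dispatch to the verb.
args = root.parse_args(["example", "--input-dir", "results/run1"])
message = ExampleVerb().run_cli(args)
print(message)
```

Keeping setup_parser static lets the top-level parser be assembled without instantiating any verb, so a config is only needed once a verb is actually run.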

run(input_dir: pathlib.Path | str | None = None)

Create a UMAP projection of a particular inference run

This method loads the latent space representations from an inference run, samples a subset of data points, flattens them if necessary, and then fits a UMAP model. The fitted reducer is then used to transform the entire dataset into a lower-dimensional space.

Parameters:

input_dir (str or Path, optional) – The directory containing the inference results.

Returns:

Nothing is returned; the UMAP representations are saved to disk.

Return type:

None
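The flow described for run() (flatten, fit on a sample, then transform everything) can be sketched as below. This is not the library's implementation: plain lists stand in for NumPy arrays so the sketch has no third-party dependencies, MeanCenterReducer is a hypothetical stand-in that only mimics the fit/transform shape of a UMAP reducer, and reduce_all, sample_size, and batch_size are illustrative names.

```python
import random
from itertools import islice


def flatten(item):
    """Flatten an arbitrarily nested list of numbers into one flat list."""
    if not isinstance(item, list):
        return [item]
    flat = []
    for sub in item:
        flat.extend(flatten(sub))
    return flat


class MeanCenterReducer:
    """Hypothetical stand-in for a UMAP reducer.

    It has the same fit/transform shape, but only mean-centers the data
    and keeps the first two dimensions.
    """

    def fit(self, rows):
        n = len(rows)
        self.means = [sum(col) / n for col in zip(*rows)]
        return self

    def transform(self, rows):
        return [[r[0] - self.means[0], r[1] - self.means[1]] for r in rows]


def batched(seq, size):
    """Yield consecutive chunks of at most `size` items."""
    it = iter(seq)
    while chunk := list(islice(it, size)):
        yield chunk


def reduce_all(latents, sample_size, batch_size, seed=0):
    # 1. Flatten each latent representation into a single vector.
    flat = [flatten(x) for x in latents]
    # 2. Fit on a random subset only, which bounds the cost of fitting.
    rng = random.Random(seed)
    sample = rng.sample(flat, min(sample_size, len(flat)))
    reducer = MeanCenterReducer().fit(sample)
    # 3. Transform the entire dataset, one batch at a time.
    out = []
    for chunk in batched(flat, batch_size):
        out.extend(reducer.transform(chunk))
    return out


latents = [[[i, i + 1], [i + 2, i + 3]] for i in range(10)]  # ten 2x2 items
points = reduce_all(latents, sample_size=4, batch_size=3)
print(len(points), len(points[0]))
```

Fitting on a sample and transforming in batches keeps both the fitting cost and the peak memory bounded, at the price of a reducer that has only seen part of the data.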

_run(input_dir: pathlib.Path | str | None = None)

See run()

_transform_batch(batch_tuple: tuple)

Private helper to transform a single batch

Parameters:

batch_tuple (tuple) – The first element is the IDs of the batch as a numpy array. The second element is the inference results to transform, as a numpy array of shape (batch_len, N), where N is the total number of dimensions in the inference result. The caller flattens all inference result axes beforehand.

Returns:

The first element is the IDs of the batch as a numpy array. The second element is the result of running the UMAP transform on the input, as a numpy array.

Return type:

tuple
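The tuple-in, tuple-out contract can be illustrated with a short sketch. FirstTwoDims is a hypothetical stand-in for a fitted reducer, transform_batch is an illustrative free function rather than the private method itself, and plain lists stand in for numpy arrays.

```python
class FirstTwoDims:
    """Hypothetical stand-in for a fitted reducer: keeps the first two dimensions."""

    def transform(self, rows):
        return [row[:2] for row in rows]


def transform_batch(reducer, batch_tuple):
    """Transform one (ids, results) batch and return (ids, reduced)."""
    ids, results = batch_tuple
    # The IDs pass through untouched so each reduced row stays paired with its ID.
    return ids, reducer.transform(results)


batch = (["a", "b"], [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
ids, reduced = transform_batch(FirstTwoDims(), batch)
print(ids, reduced)
```

Carrying the IDs alongside the results means each output batch is self-describing, which helps if batches are ever collected in a different order than they were submitted.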

static _log_memory_usage(message: str = '')

Log the current resident set size (RSS) memory usage of the current process in gigabytes.

Parameters:

message (str, optional) – A descriptive message to include in the log output for context.

Notes

This method is intended for debugging and performance monitoring.
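A comparable helper can be sketched with the standard library alone. Note the difference from the method documented above: the stdlib resource module reports peak RSS rather than current RSS, and it is only available on Unix-like systems. The names peak_rss_gb and log_memory_usage are illustrative, and the sketch makes no claim about how the actual method obtains its figure.

```python
import logging
import resource
import sys

logger = logging.getLogger(__name__)


def peak_rss_gb() -> float:
    """Return this process's peak resident set size in gigabytes (GiB)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    bytes_used = peak if sys.platform == "darwin" else peak * 1024
    return bytes_used / 1024**3


def log_memory_usage(message: str = "") -> None:
    """Log peak RSS at DEBUG level, with an optional context message."""
    logger.debug("Memory usage (peak RSS): %.2f GB %s", peak_rss_gb(), message)


logging.basicConfig(level=logging.DEBUG)
log_memory_usage("after loading latents")
```

Calling such a helper before and after each expensive step (loading, fitting, transforming) makes it easier to see which stage drives memory growth.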