hyrax.verbs
Submodules
Classes
- DatabaseConnection – Verb to insert inference results into a vector database index for fast similarity search.
- Umap – Umap latent space points into 2d
- Infer – Inference verb
- Train – Train verb
- Visualize – Verb to create a visualization
- Lookup – Look up an inference result using the ID of a data member
- SaveToDatabase – Verb to insert inference results into a vector database index for fast similarity search.
- Verb – Base class for all hyrax verbs
Functions
- Returns all verbs that are currently registered with a class-based implementation
- Returns all verbs that are currently registered
- Gives the class object for the named verb
- Returns true if the verb has a class-based implementation
Package Contents
- class DatabaseConnection(config)[source]
Bases: hyrax.verbs.verb_registry.Verb
Verb to insert inference results into a vector database index for fast similarity search.
Overall initialization for all verbs that saves the config
- cli_name = 'database_connection'
- add_parser_kwargs
- run(database_dir: pathlib.Path | str | None = None)[source]
Create a connection to the vector database for interactive queries.
- Parameters:
database_dir (str or Path, Optional) – The directory containing the database that will be connected to. If None, attempt to connect to the most recently created …-vector-db-… directory. If specified, it can point to either an empty directory or a directory containing an existing vector database. If the latter, the database will be updated with the new vectors.
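The fallback described for database_dir=None can be sketched as a small directory search. This is a minimal illustration, not the library's implementation: the function name, the glob pattern "*-vector-db-*" (the elided parts of the directory name treated as wildcards), and the use of modification time as a proxy for creation time are all assumptions.

```python
from __future__ import annotations

from pathlib import Path


def most_recent_vector_db_dir(results_root: Path, pattern: str = "*-vector-db-*") -> Path | None:
    """Return the newest directory under results_root matching pattern, or None.

    Hypothetical helper: the pattern and the search root are assumptions.
    """
    candidates = [p for p in Path(results_root).glob(pattern) if p.is_dir()]
    if not candidates:
        return None
    # Modification time stands in for creation time, which is not portable.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

A caller would fall back to this search only when no explicit directory is given, and should handle the None result when no matching directory exists.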
- _get_database_type_from_config(database_dir: pathlib.Path)[source]
Internal function that reads a config file from a directory and returns the name of the vector database from it, e.g. “chromadb” or “qdrant”.
- Parameters:
database_dir (Path) – The directory containing the vector database and the config file that will be used as reference.
- Returns:
The config value for [“vector_db”][“name”] in the reference config.
- Return type:
str
- class Umap(config)[source]
Bases: hyrax.verbs.verb_registry.Verb
Umap latent space points into 2d
Overall initialization for all verbs that saves the config
- cli_name = 'umap'
- add_parser_kwargs
- run(input_dir: pathlib.Path | str | None = None)[source]
Create a umap of a particular inference run
This method loads the latent space representations from an inference run, samples a subset of data points, flattens them if necessary, and then fits a UMAP model. The fitted reducer is then used to transform the entire dataset into a lower-dimensional space.
- Parameters:
input_dir (str or Path, Optional) – The directory containing the inference results.
- Returns:
The method does not return anything but saves the UMAP representations to disk.
- Return type:
None
- _transform_batch(batch_tuple: tuple)[source]
Private helper to transform a single batch
- Parameters:
batch_tuple (tuple) – The first element is the IDs of the batch as a numpy array. The second element is the inference results to transform, as a numpy array with shape (batch_len, N), where N is the total number of dimensions in the inference result. The caller flattens all inference result axes before passing the batch in.
- Returns:
The first element is the IDs of the batch as a numpy array. The second element is the result of running the umap transform on the input, as a numpy array.
- Return type:
tuple
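The flow described for run and _transform_batch (flatten, fit on a sample, then transform the whole dataset batch by batch) can be sketched without any dependencies. ToyReducer below is a deliberately trivial stand-in for a fitted UMAP model, not UMAP itself, and every name in the snippet is hypothetical; plain lists are used in place of numpy arrays.

```python
from __future__ import annotations

import random


def flatten(item) -> list[float]:
    """Flatten an arbitrarily nested list of numbers into one flat list."""
    if isinstance(item, (list, tuple)):
        out: list[float] = []
        for sub in item:
            out.extend(flatten(sub))
        return out
    return [float(item)]


class ToyReducer:
    """Stand-in for a UMAP model: centers the data and keeps the first two axes."""

    def fit(self, rows):
        n, dims = len(rows), len(rows[0])
        self.mean_ = [sum(r[d] for r in rows) / n for d in range(dims)]
        return self

    def transform(self, rows):
        return [[r[0] - self.mean_[0], r[1] - self.mean_[1]] for r in rows]


def reduce_in_batches(ids, results, sample_size, batch_size, seed=0):
    """Fit on a random sample, then yield (batch_ids, reduced_batch) tuples."""
    flat = [flatten(r) for r in results]  # each row now has N values
    rng = random.Random(seed)
    sample = rng.sample(flat, min(sample_size, len(flat)))
    reducer = ToyReducer().fit(sample)  # fit once, on the sample only
    for start in range(0, len(flat), batch_size):
        stop = start + batch_size
        yield ids[start:stop], reducer.transform(flat[start:stop])
```

Fitting on a subset keeps the expensive step bounded, while the transform step still covers every data point.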
- static _log_memory_usage(message: str = '')[source]
Log the current resident set size (RSS) memory usage of the current process in gigabytes.
- Parameters:
message (str, optional) – A descriptive message to include in the log output for context.
Notes
This method is intended for debugging and performance monitoring.
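For comparison, a standard-library approximation of this kind of logging is sketched below. It is not the method's implementation: it reports the peak resident set size rather than the current one, works only on Unix-like systems, and the function name is hypothetical. Reading the current RSS typically needs a third-party package such as psutil.

```python
import logging
import resource
import sys

logger = logging.getLogger(__name__)


def log_peak_memory_usage(message: str = "") -> float:
    """Log and return the peak RSS of this process in gigabytes (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    peak_bytes = peak if sys.platform == "darwin" else peak * 1024
    peak_gb = peak_bytes / 1024**3
    logger.debug("Peak RSS: %.2f GB %s", peak_gb, message)
    return peak_gb
```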
- class Infer(config)[source]
Bases: hyrax.verbs.verb_registry.Verb
Inference verb
Overall initialization for all verbs that saves the config
- cli_name = 'infer'
- add_parser_kwargs
- run()[source]
Run inference on a model using a dataset
- Parameters:
config (ConfigDict) – The parsed config file as a nested dict
- static load_model_weights(config, model)[source]
Loads the model weights from a file. Raises RuntimeError if the weights cannot be loaded, whether because of the config or because the file is missing or malformed.
- Parameters:
config (ConfigDict) – Full runtime configuration
model (nn.Module) – The model class to load weights into
- class Train(config)[source]
Bases: hyrax.verbs.verb_registry.Verb
Train verb
Overall initialization for all verbs that saves the config
- cli_name = 'train'
- add_parser_kwargs
- class Visualize(config)[source]
Bases: hyrax.verbs.verb_registry.Verb
Verb to create a visualization
Overall initialization for all verbs that saves the config
- cli_name = 'visualize'
- add_parser_kwargs
- run(input_dir: pathlib.Path | str | None = None, *, return_verb: bool = False, make_lupton_rgb_opts: dict | None = None, **kwargs)[source]
Generate an interactive notebook visualization of a latent space that has been umapped down to 2d.
The plot contains two holoviews objects, a scatter plot of the latent space, and a table of objects which can be populated by selecting from the scatter plot.
- Parameters:
input_dir (Optional[Union[Path, str]], optional) – Directory holding the output from the ‘umap’ verb, by default None. When not provided, we use [results][inference_dir] from config. If that is not set, we use the most recent umap in the current results directory.
return_verb (bool, optional) – If True, also return the underlying Visualize instance for post-hoc access to selection state. Defaults to False.
make_lupton_rgb_opts (dict, optional) – Dictionary of options to pass to astropy’s make_lupton_rgb function for RGB image creation. Default is {“stretch”: 5, “Q”: 8}. Common parameters include stretch (brightness/contrast) and Q (softening parameter for asinh transformation).
kwargs – Keyword arguments are passed through as options for the plot object as plot_pane.opts(**plot_options). It is not recommended to override the “tools” plot option, because that will break the integration between the plot selection operations and the table.
- Returns:
Holoviews, if return_verb = False (default) – A collection of Holoviews panes
tuple of (pane, Visualize), if return_verb = True – Returns a 2-tuple with the pane and the verb instance.
- visible_points(x_range: tuple | list, y_range: tuple | list)[source]
Generate a hv.Points object with the points inside the bounding box passed.
This is the event handler for moving or scaling the latent space plot, and is called by Holoviews.
- Parameters:
x_range (tuple or list) – min and max x values
y_range (tuple or list) – min and max y values
- Returns:
Points lying inside the bounding box passed
- Return type:
hv.Points
- update_points(**kwargs) → None[source]
This is the main UI event handler for selection tools on the plot. Any dynamic map in the visualizer layout that updates based on plot selection MUST call this function.
This function accepts the data values from all streams and uses the differences between the current call and prior calls to differentiate between different UI events.
The self.prev_kwargs dictionary is used to store previous calls to this function, and the _called_* helpers perform the differencing for each case.
Calling this function GUARANTEES that self.points, self.points_id, and self.points_idx are up-to-date with the user’s latest selection, regardless of the order that Holoviews evaluates the DynamicMaps in.
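The differencing idea can be illustrated with a small standalone class. This is a sketch under assumptions, not the Visualize implementation: the class name, the changed_keys method, and the stream names x_range and geometry used in the usage example are hypothetical.

```python
from __future__ import annotations


class SelectionDiffer:
    """Track stream values across calls and report which ones changed."""

    def __init__(self) -> None:
        self.prev_kwargs: dict = {}

    def changed_keys(self, **kwargs) -> set[str]:
        """Return the names of stream values that differ from the previous call."""
        # A value of None on the first call counts as "no event yet".
        changed = {k for k, v in kwargs.items() if self.prev_kwargs.get(k) != v}
        # Remember this call so the next one is compared against it.
        self.prev_kwargs = dict(kwargs)
        return changed
```

Because every stream value arrives on every call, comparing against the stored previous call is what tells a box selection apart from a lasso selection:

```python
differ = SelectionDiffer()
differ.changed_keys(x_range=None, geometry=None)      # nothing has happened yet
differ.changed_keys(x_range=(0, 1), geometry=None)    # only x_range changed
```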
- poly_select_points(geometry) → tuple[numpy.typing.ArrayLike, numpy.typing.ArrayLike, numpy.typing.ArrayLike][source]
Select points inside a polygon.
- Parameters:
geometry (list) – List of x/y points describing the vertices of the polygon
- Returns:
The first element is an ndarray of x/y points in latent space inside the polygon. The second element is an ndarray of the corresponding object IDs.
- Return type:
Tuple
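One common way to test whether a point lies inside a polygon is ray casting, sketched below in plain Python. The source does not say which method poly_select_points uses, so this is an illustration of the general technique only. The function names are hypothetical, and returning the selected indexes as the third element is a guess based on the three-element return annotation.

```python
def point_in_polygon(x, y, vertices) -> bool:
    """Ray casting: count how many polygon edges a rightward ray from (x, y) crosses."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # an odd number of crossings means inside
    return inside


def poly_select(points, ids, geometry):
    """Return (points, ids, indexes) for the points inside the polygon."""
    idx = [i for i, (px, py) in enumerate(points) if point_in_polygon(px, py, geometry)]
    return [points[i] for i in idx], [ids[i] for i in idx], idx
```

Points exactly on an edge may be classified either way by this test, which is usually acceptable for interactive selection.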
- box_select_points(x_range: tuple | list, y_range: tuple | list) → tuple[numpy.typing.ArrayLike, numpy.typing.ArrayLike, numpy.typing.ArrayLike][source]
Return the points and IDs for a box in the latent space
- Parameters:
x_range (tuple or list) – min and max x values
y_range (tuple or list) – min and max y values
- Returns:
The first element is an ndarray of x/y points in latent space inside the box. The second element is an ndarray of the corresponding object IDs.
- Return type:
Tuple
- box_select_indexes(x_range: tuple | list, y_range: tuple | list)[source]
Return the indexes inside of a particular box in the latent space
- Parameters:
x_range (tuple or list) – min and max x values
y_range (tuple or list) – min and max y values
- Returns:
Array of data indexes where the latent space representation falls inside the given box.
- Return type:
np.ndarray
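The box test itself is a pair of range checks per point. The following sketch uses plain lists rather than numpy arrays, a hypothetical function name, and assumes that points on the boundary count as inside.

```python
def indexes_in_box(points, x_range, y_range):
    """Return the indexes of the (x, y) points that fall inside the given box."""
    # Accept the bounds in either order, since a drag can go in any direction.
    x_min, x_max = min(x_range), max(x_range)
    y_min, y_max = min(y_range), max(y_range)
    return [
        i
        for i, (x, y) in enumerate(points)
        if x_min <= x <= x_max and y_min <= y <= y_max
    ]
```

The resulting indexes can then be used to pull out both the selected points and their IDs, which is the relationship between box_select_indexes and box_select_points described above.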
- selected_objects(**kwargs)[source]
Generate the Holoviews table for a selected set of objects based on input from the Lasso, Tap, and SelectionXY streams.
- Returns:
Table with Object ID, x, y locations of the selected objects
- Return type:
hv.Table
- class Lookup(config)[source]
Bases: hyrax.verbs.verb_registry.Verb
Look up an inference result using the ID of a data member
Overall initialization for all verbs that saves the config
- cli_name = 'lookup'
- add_parser_kwargs
- static setup_parser(parser: argparse.ArgumentParser)[source]
Set up our arguments by configuring a subparser
- Parameters:
parser (ArgumentParser) – The sub-parser to configure
- run_cli(args: argparse.Namespace | None = None)[source]
Entrypoint to Lookup from the CLI.
- Parameters:
args (Optional[Namespace], optional) – The parsed command line arguments
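The setup_parser and run_cli pair follows the usual argparse sub-command pattern, sketched below. The program name and the --id and --results-dir flags are hypothetical, chosen only to mirror the parameters of run; they are not the actual hyrax command-line options.

```python
import argparse


def setup_parser(parser: argparse.ArgumentParser) -> None:
    """Add this verb's arguments to its sub-parser (flags are hypothetical)."""
    parser.add_argument("--id", required=True, help="ID of the data member to look up")
    parser.add_argument("--results-dir", default=None, help="Directory with inference results")


def build_cli() -> argparse.ArgumentParser:
    """Build a top-level parser with one sub-command per verb."""
    root = argparse.ArgumentParser(prog="example-cli")
    subparsers = root.add_subparsers(dest="verb", required=True)
    lookup_parser = subparsers.add_parser("lookup", help="Look up an inference result")
    setup_parser(lookup_parser)
    return root
```

In this arrangement the parsed Namespace is what a run_cli-style entry point would receive and translate into a call to run.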
- run(id: str, results_dir: pathlib.Path | str | None = None) → numpy.ndarray | None[source]
Look up the latent-space representation of a particular ID
Requires the relevant dataset to be configured, and for inference to have been run.
- Parameters:
id (str) – The ID of the input data to look up the inference result
results_dir (str, Optional) – The directory containing the inference results.
- Returns:
The output tensor of the model for the given input.
- Return type:
Optional[np.ndarray]
- class SaveToDatabase(config)[source]
Bases: hyrax.verbs.verb_registry.Verb
Verb to insert inference results into a vector database index for fast similarity search.
Overall initialization for all verbs that saves the config
- cli_name = 'save_to_database'
- add_parser_kwargs
- run(input_dir: pathlib.Path | str | None = None, output_dir: pathlib.Path | str | None = None)[source]
Insert inference results into vector database.
- Parameters:
input_dir (str or Path, Optional) – The directory containing the inference results.
output_dir (str or Path, Optional) – The directory where the vector database is stored. If None, a new directory will be created. If specified, it can point to either an empty directory or a directory containing an existing vector database. If the latter, the database will be updated with the new vectors.
- class Verb(config)[source]
Bases: abc.ABC
Base class for all hyrax verbs
Overall initialization for all verbs that saves the config
- add_parser_kwargs: dict[str, str]
- config
- all_class_verbs() → list[str][source]
Returns all verbs that are currently registered with a class-based implementation
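To show how a class-based verb registry of this shape can fit together, here is a minimal standalone sketch. It does not reproduce the hyrax implementation: VERB_REGISTRY, register_verb, the Echo verb, and the choice to make run abstract are all assumptions; only the names Verb, cli_name, add_parser_kwargs, config, and all_class_verbs come from the reference above.

```python
from __future__ import annotations

from abc import ABC, abstractmethod

VERB_REGISTRY: dict[str, type] = {}  # hypothetical module-level registry


def register_verb(cls):
    """Hypothetical decorator: file a verb class under its cli_name."""
    VERB_REGISTRY[cls.cli_name] = cls
    return cls


class Verb(ABC):
    """Base class for verbs; initialization saves the config."""

    add_parser_kwargs: dict[str, str] = {}

    def __init__(self, config):
        self.config = config

    @abstractmethod
    def run(self, *args, **kwargs):
        """Each verb supplies its own behavior (abstractness is an assumption)."""


@register_verb
class Echo(Verb):
    """Toy verb used only for this example."""

    cli_name = "echo"

    def run(self):
        return self.config


def all_class_verbs() -> list[str]:
    """Return the names of all registered verbs backed by a Verb subclass."""
    return [
        name
        for name, impl in VERB_REGISTRY.items()
        if isinstance(impl, type) and issubclass(impl, Verb)
    ]
```

Keying the registry on cli_name lets the same table serve both command-line dispatch and programmatic lookup of a verb class by name.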