hyrax.models

Classes#

HSCAutoencoder

This autoencoder is designed to work with datasets that are prepared with Hyrax's HSC Data Set class.

HSCDCAE

This autoencoder is designed to work with datasets that are prepared with Hyrax's HSC Data Set class.

ImageDCAE

This is an autoencoder with skip connections that should work with arbitrarily sized images with an arbitrary number of channels.

HyraxAutoencoder

This autoencoder is designed to work with a wide range of image datasets to allow testing.

HyraxAutoencoderV2

This is a tweaked version of HyraxAutoencoder, designed to work with a wide range of imaging datasets.

HyraxCNN

This CNN is designed to work with datasets that are prepared with Hyrax's HSC Data Set class.

HyraxLoopback

Simple model for testing that returns its own input.

SimCLR

SimCLR model. Implementation based on Chen et al. (2020).

Functions#

hyrax_model(cls)

Decorator that registers a model with the model registry and adds common interface functions.

Package Contents#

class HSCAutoencoder(config, data_sample=None)[source]#

Bases: torch.nn.Module

This autoencoder is designed to work with datasets that are prepared with Hyrax’s HSC Data Set class.

Initialize internal Module state, shared by both nn.Module and ScriptModule.

encoder#
decoder#
config#
forward(x)[source]#
train_batch(batch)[source]#

This function contains the logic for a single training step, i.e. the contents of the inner loop of an ML training process.

Parameters:

batch (tuple) – A tuple containing the input data for the current batch, possibly with labels that are ignored.

Returns:

Current loss value – Dictionary containing the loss value for the current batch.

Return type:

dict

validate_batch(batch)[source]#

This function contains the logic for a single validation step that processes a single batch of data, i.e. the contents of the inner loop of an ML validation process.

Parameters:

batch (tuple) – A tuple containing the input data for the current batch, possibly with labels that are ignored.

Returns:

Current loss value – Dictionary containing the loss value for the current batch.

Return type:

dict

test_batch(batch)[source]#

This function contains the logic for a single testing step that processes a single batch of data, i.e. the contents of the inner loop of an ML testing process. In this case, it is identical to validate_batch.

Parameters:

batch (tuple) – A tuple containing the input data for the current batch, possibly with labels that are ignored.

Returns:

Current loss value – Dictionary containing the loss value for the current batch.

Return type:

dict

infer_batch(batch)[source]#

This function contains the logic for a single inference step that processes a single batch of data, i.e. the contents of the inner loop of an ML inference process.

Parameters:

batch (tuple) – A tuple containing the input data for the current batch, possibly with labels that are ignored.

Returns:

Reconstructed outputs – The reconstructed outputs from the autoencoder.

Return type:

torch.Tensor
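
The batch methods above follow a common pattern: train_batch performs one optimization step and returns a loss dictionary, validate_batch and test_batch report a loss for a batch, and infer_batch returns the reconstruction. The following is a minimal sketch of that pattern, not Hyrax's implementation. The class name `TinyAutoencoder`, the layer sizes, the MSE criterion, the Adam optimizer, the `"loss"` dictionary key, and the choice to skip gradient tracking outside training are all assumptions made for illustration.

```python
import torch
from torch import nn


class TinyAutoencoder(nn.Module):
    """Hypothetical stand-in; layer sizes and the "loss" key are assumptions."""

    def __init__(self, n_features=16, latent_dim=4):
        super().__init__()
        self.encoder = nn.Linear(n_features, latent_dim)
        self.decoder = nn.Linear(latent_dim, n_features)
        self.criterion = nn.MSELoss()
        self.optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)

    def forward(self, x):
        return self.decoder(torch.relu(self.encoder(x)))

    def _loss(self, batch):
        # The batch may carry labels; only the first element is used.
        x = batch[0]
        return self.criterion(self(x), x)

    def train_batch(self, batch):
        # One optimizer step: the body of the inner training loop.
        self.optimizer.zero_grad()
        loss = self._loss(batch)
        loss.backward()
        self.optimizer.step()
        return {"loss": loss.item()}

    def validate_batch(self, batch):
        # Same loss, but no gradient tracking and no weight update.
        with torch.no_grad():
            return {"loss": self._loss(batch).item()}

    # Testing reuses the validation logic.
    test_batch = validate_batch

    def infer_batch(self, batch):
        # Return the reconstructed inputs as a tensor.
        with torch.no_grad():
            return self(batch[0])


model = TinyAutoencoder()
batch = (torch.randn(8, 16), torch.zeros(8))  # (inputs, ignored labels)
train_result = model.train_batch(batch)
reconstruction = model.infer_batch(batch)
```

Returning a dictionary rather than a bare float leaves room for a training loop to log several named quantities per step.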

class HSCDCAE(config, data_sample=None)[source]#

Bases: torch.nn.Module

This autoencoder is designed to work with datasets that are prepared with Hyrax’s HSC Data Set class.

Initialize internal Module state, shared by both nn.Module and ScriptModule.

encoder1#
encoder2#
encoder3#
encoder4#
pool#
decoder4#
decoder3#
decoder2#
decoder1#
activation#
config#
forward(x)[source]#
train_batch(batch)[source]#

This function contains the logic for a single training step, i.e. the contents of the inner loop of an ML training process.

Parameters:

batch (tuple) – A tuple containing the input data for the current batch, possibly with labels that are ignored.

Returns:

Current loss value – Dictionary containing the loss value for the current batch.

Return type:

dict

validate_batch(batch)[source]#

This function contains the logic for a single validation step that processes a single batch of data, i.e. the contents of the inner loop of an ML validation process.

Parameters:

batch (tuple) – A tuple containing the input data for the current batch, possibly with labels that are ignored.

Returns:

Current loss value – Dictionary containing the loss value for the current batch.

Return type:

dict

test_batch(batch)[source]#

This function contains the logic for a single testing step that processes a single batch of data, i.e. the contents of the inner loop of an ML testing process. In this case, it is identical to validate_batch.

Parameters:

batch (tuple) – A tuple containing the input data for the current batch, possibly with labels that are ignored.

Returns:

Current loss value – Dictionary containing the loss value for the current batch.

Return type:

dict

infer_batch(batch)[source]#

This function contains the logic for a single inference step that processes a single batch of data, i.e. the contents of the inner loop of an ML inference process.

Parameters:

batch (tuple) – A tuple containing the input data for the current batch, possibly with labels that are ignored.

Returns:

Reconstructed outputs – The reconstructed outputs from the autoencoder.

Return type:

torch.Tensor

class ImageDCAE(config, data_sample=None)[source]#

Bases: torch.nn.Module

This is an autoencoder with skip connections that should work with arbitrarily sized images with an arbitrary number of channels.

Initialize internal Module state, shared by both nn.Module and ScriptModule.

input_shape#
config#
latent_dim#
base_channel_size#
conv_output_size#
encoder1#
encoder2#
encoder3#
encoder4#
pool#
latent_encoder#
latent_decoder#
decoder4#
decoder3#
decoder2#
decoder1#
activation#
_calculate_conv_output_size()[source]#

Calculate the output size after all convolutional layers for the linear bottleneck.

encode(x)[source]#

Encode input to latent space with skip connections.

decode(latent, skip_connections, encoded_shape)[source]#

Decode from latent space to image with skip connections.

forward(x)[source]#

Forward pass - returns latent representation for anomaly detection.

reconstruct(x)[source]#

Full reconstruction for evaluation and anomaly detection.
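
The encode, decode, forward, and reconstruct methods above can be illustrated with a much smaller network. This is a sketch under stated assumptions rather than the ImageDCAE code: the class `TinySkipAutoencoder` is hypothetical, it has two encoder levels instead of four, it assumes encode returns a `(latent, skip_connections, encoded_shape)` triple to match the decode signature, and it combines skips by addition, which may differ from what ImageDCAE does.

```python
import torch
from torch import nn


class TinySkipAutoencoder(nn.Module):
    """Hypothetical two-level encoder/decoder with a linear bottleneck."""

    def __init__(self, channels=3, size=16, base=4, latent_dim=8):
        super().__init__()
        self.encoder1 = nn.Conv2d(channels, base, 3, padding=1)
        self.encoder2 = nn.Conv2d(base, 2 * base, 3, padding=1)
        self.pool = nn.MaxPool2d(2)
        self.activation = nn.ReLU()
        # Two 2x poolings shrink each spatial side by a factor of 4.
        self.conv_output_size = 2 * base * (size // 4) ** 2
        self.latent_encoder = nn.Linear(self.conv_output_size, latent_dim)
        self.latent_decoder = nn.Linear(latent_dim, self.conv_output_size)
        self.decoder2 = nn.ConvTranspose2d(2 * base, base, 2, stride=2)
        self.decoder1 = nn.ConvTranspose2d(base, channels, 2, stride=2)

    def encode(self, x):
        e1 = self.pool(self.activation(self.encoder1(x)))
        e2 = self.pool(self.activation(self.encoder2(e1)))
        latent = self.latent_encoder(e2.flatten(start_dim=1))
        return latent, [e1, e2], e2.shape

    def decode(self, latent, skip_connections, encoded_shape):
        e1, e2 = skip_connections
        x = self.latent_decoder(latent).view(encoded_shape)
        # Additive skips: reuse encoder features at matching resolution.
        x = self.activation(self.decoder2(x + e2))
        return self.decoder1(x + e1)

    def forward(self, x):
        # Forward returns only the latent representation.
        latent, _, _ = self.encode(x)
        return latent

    def reconstruct(self, x):
        return self.decode(*self.encode(x))


model = TinySkipAutoencoder()
images = torch.randn(2, 3, 16, 16)
latent = model(images)
restored = model.reconstruct(images)
```

Keeping forward limited to the latent vector means a caller that wants the reconstructed image must call reconstruct explicitly.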

train_batch(batch)[source]#

This function contains the logic for a single training step.

Parameters:

batch (tuple) – A tuple containing the input data for the current batch, possibly with labels that are ignored.

Returns:

Current loss value – Dictionary containing the loss value for the current batch.

Return type:

dict

validate_batch(batch)[source]#

This function contains the logic for a single validation step that will process a single batch of data.

Parameters:

batch (tuple) – A tuple containing the input data for the current batch, possibly with labels that are ignored.

Returns:

Current loss value – Dictionary containing the loss value for the current batch.

Return type:

dict

test_batch(batch)[source]#

This function contains the logic for a single testing step that will process a single batch of data. In this case, it is identical to validate_batch.

Parameters:

batch (tuple) – A tuple containing the input data for the current batch, possibly with labels that are ignored.

Returns:

Current loss value – Dictionary containing the loss value for the current batch.

Return type:

dict

infer_batch(batch)[source]#

This function contains the logic for a single inference step.

Parameters:

batch (tuple) – A tuple containing the input data for the current batch, possibly with labels that are ignored.

Returns:

Reconstructed images – Tensor containing the reconstructed images for the current batch.

Return type:

torch.Tensor

static prepare_inputs(data_dict)[source]#

Extract the image array from the batch dictionary.

This static method is the interface between the data pipeline and the model. Override it on the model class to reshape or select fields from the collated batch to match the inputs your model expects.

Hyrax will convert the returned array to a PyTorch tensor and move it to the appropriate device automatically.

Parameters:

data_dict (dict) – The collated batch dictionary produced by the data pipeline. Expected to contain a "data" key with an "image" field.

Returns:

image – The image array extracted from the batch.

Return type:

numpy.ndarray
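
As an illustration of overriding prepare_inputs to reshape the collated batch, the sketch below selects the image field and adds a channel axis when the images arrive without one. The class `GrayscaleModel` and the reshape rule are hypothetical; only the `"data"` key and `"image"` field come from the description above.

```python
import numpy as np


class GrayscaleModel:
    """Hypothetical model class; only the prepare_inputs hook is shown."""

    @staticmethod
    def prepare_inputs(data_dict):
        # Select the image field from the collated batch.
        image = np.asarray(data_dict["data"]["image"], dtype=np.float32)
        # Example reshape: add a channel axis to (batch, height, width) input.
        if image.ndim == 3:
            image = image[:, np.newaxis, :, :]
        return image


batch = {"data": {"image": np.zeros((4, 32, 32))}}
prepared = GrayscaleModel.prepare_inputs(batch)
```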

class HyraxAutoencoder(config, data_sample=None)[source]#

Bases: torch.nn.Module

This autoencoder is designed to work with a wide range of image datasets to allow testing.

This example model is taken from this autoencoder tutorial.

The train function has been converted into train_batch for use with pytorch-ignite.

Initialize internal Module state, shared by both nn.Module and ScriptModule.

config#
c_hid#
latent_dim#
conv_end_w#
conv_end_h#
conv2d_multi_layer(input_size, num_applications, **kwargs) -> int[source]#
conv2d_output_size(input_size, kernel_size, padding=0, stride=1, dilation=1) -> int[source]#
_init_encoder()[source]#
_eval_encoder(x)[source]#
_init_decoder()[source]#
_eval_decoder(x)[source]#
forward(batch)[source]#
train_batch(batch)[source]#

This function contains the logic for a single training step, i.e. the contents of the inner loop of an ML training process.

Parameters:

batch (tuple) – A tuple containing the input data for the current batch, possibly with labels that are ignored.

Returns:

Current loss value – Dictionary containing the loss value for the current batch.

Return type:

dict

validate_batch(batch)[source]#

This function contains the logic for a single validation step that processes a single batch of data, i.e. the contents of the inner loop of an ML validation process.

Parameters:

batch (tuple) – A tuple containing the input data for the current batch, possibly with labels that are ignored.

Returns:

Current loss value – Dictionary containing the loss value for the current batch.

Return type:

dict

test_batch(batch)[source]#

This function contains the logic for a single testing step that processes a single batch of data, i.e. the contents of the inner loop of an ML testing process. In this case, it is identical to validate_batch.

Parameters:

batch (tuple) – A tuple containing the input data for the current batch, possibly with labels that are ignored.

Returns:

Current loss value – Dictionary containing the loss value for the current batch.

Return type:

dict

infer_batch(batch)[source]#

This function contains the logic for a single inference step, i.e. the contents of the inner loop of an ML inference process.

Parameters:

batch (tuple) – A tuple containing the input data for the current batch, possibly with labels that are ignored.

Returns:

Reconstructed inputs – The reconstructed inputs from the autoencoder.

Return type:

torch.Tensor

static prepare_inputs(data_dict) -> tuple[source]#

Extract the image array from the batch dictionary.

This static method is the interface between the data pipeline and the model. Override it on the model class to reshape or select fields from the collated batch to match the inputs your model expects.

Hyrax will convert the returned array to a PyTorch tensor and move it to the appropriate device automatically.

Parameters:

data_dict (dict) – The collated batch dictionary produced by the data pipeline. Expected to contain a "data" key with an "image" field.

Returns:

image – The image array extracted from the batch.

Return type:

numpy.ndarray

_optimizer()[source]#

class HyraxAutoencoderV2(config, data_sample=None)[source]#

Bases: torch.nn.Module

This is a tweaked version of HyraxAutoencoder, designed to work with a wide range of imaging datasets.

V2 improvements:

- Configurable final layer activation
- Uses criterion and optimizer from config variables

Initialize internal Module state, shared by both nn.Module and ScriptModule.

config#
c_hid#
latent_dim#
conv_end_w#
conv_end_h#
band_reduction#
conv2d_multi_layer(input_size, num_applications, **kwargs) -> int[source]#
conv2d_output_size(input_size, kernel_size, padding=0, stride=1, dilation=1) -> int[source]#
_init_encoder()[source]#
_eval_encoder(x)[source]#
_init_decoder()[source]#
_eval_decoder(x)[source]#
forward(batch)[source]#
train_batch(batch)[source]#

This function contains the logic for a single training step, i.e. the contents of the inner loop of an ML training process.

Parameters:

batch (tuple) – A tuple containing the input data for the current batch, possibly with labels that are ignored.

Returns:

Current loss value – Dictionary containing the loss value for the current batch.

Return type:

dict

validate_batch(batch)[source]#

This function contains the logic for a single validation step that processes a single batch of data, i.e. the contents of the inner loop of an ML validation process.

Parameters:

batch (tuple) – A tuple containing the input data for the current batch, possibly with labels that are ignored.

Returns:

Current loss value – Dictionary containing the loss value for the current batch.

Return type:

dict

test_batch(batch)[source]#

This function contains the logic for a single testing step that processes a single batch of data, i.e. the contents of the inner loop of an ML testing process. In this case, it is identical to validate_batch.

Parameters:

batch (tuple) – A tuple containing the input data for the current batch, possibly with labels that are ignored.

Returns:

Current loss value – Dictionary containing the loss value for the current batch.

Return type:

dict

infer_batch(batch)[source]#

This function contains the logic for a single inference step, i.e. the contents of the inner loop of an ML inference process.

Parameters:

batch (tuple) – A tuple containing the input data for the current batch, possibly with labels that are ignored.

Returns:

Reconstructed outputs – The reconstructed outputs from the autoencoder.

Return type:

torch.Tensor

static prepare_inputs(data_dict) -> tuple[source]#

Extract the image array from the batch dictionary.

This static method is the interface between the data pipeline and the model. Override it on the model class to reshape or select fields from the collated batch to match the inputs your model expects.

Hyrax will convert the returned array to a PyTorch tensor and move it to the appropriate device automatically.

Parameters:

data_dict (dict) – The collated batch dictionary produced by the data pipeline. Expected to contain a "data" key with an "image" field.

Returns:

image – The image array extracted from the batch.

Return type:

numpy.ndarray
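
The conv2d_output_size and conv2d_multi_layer helpers listed for HyraxAutoencoder and HyraxAutoencoderV2 are documented by signature only. A plausible reading, sketched below, is that they apply the output-shape formula from PyTorch's Conv2d documentation to one spatial dimension, once or several times in a row. The function bodies here are an assumption based on that formula and on the signatures, not a copy of the Hyrax source.

```python
def conv2d_output_size(input_size, kernel_size, padding=0, stride=1, dilation=1):
    """Length of one spatial dimension after a single Conv2d layer."""
    numerator = input_size + 2 * padding - dilation * (kernel_size - 1) - 1
    # Floor division implements the floor in the Conv2d shape formula.
    return numerator // stride + 1


def conv2d_multi_layer(input_size, num_applications, **kwargs):
    """Apply the single-layer formula num_applications times in sequence."""
    size = input_size
    for _ in range(num_applications):
        size = conv2d_output_size(size, **kwargs)
    return size


# A 32-pixel side through three stride-2 convolutions (kernel 3, padding 1).
one_layer = conv2d_output_size(32, kernel_size=3, padding=1, stride=2)
three_layers = conv2d_multi_layer(32, 3, kernel_size=3, padding=1, stride=2)
```

Values such as conv_end_w and conv_end_h could be computed this way, so that the size of the linear layer after the convolutions follows from the input image size.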

class HyraxCNN(config, data_sample=None)[source]#

Bases: torch.nn.Module

This CNN is designed to work with datasets that are prepared with Hyrax’s HSC Data Set class.

Initialize internal Module state, shared by both nn.Module and ScriptModule.

config#
conv1#
pool#
conv2#
fc1#
fc2#
fc3#
conv2d_output_size(input_size, kernel_size, padding=0, stride=1, dilation=1) -> int[source]#
pool2d_output_size(input_size, kernel_size, stride, padding=0, dilation=1) -> int[source]#
forward(x)[source]#
train_batch(batch)[source]#

This function contains the logic for a single training step that processes a single batch of data, i.e. the contents of the inner loop of an ML training process.

Parameters:

batch (tuple) – A tuple containing the inputs and labels for the current batch.

Returns:

Current loss value – Dictionary containing the loss value for the current batch.

Return type:

dict

validate_batch(batch)[source]#

This function contains the logic for a single validation step that processes a single batch of data, i.e. the contents of the inner loop of an ML validation process. In this case, it is identical to test_batch.

Parameters:

batch (tuple) – A tuple containing the inputs and labels for the current batch.

Returns:

Current loss value – Dictionary containing the loss value for the current batch.

Return type:

dict

test_batch(batch)[source]#

This function contains the logic for a single testing step that processes a single batch of data, i.e. the contents of the inner loop of an ML testing process. In this case, it is identical to validate_batch.

Parameters:

batch (tuple) – A tuple containing the inputs and labels for the current batch.

Returns:

Current loss value – Dictionary containing the loss value for the current batch.

Return type:

dict

infer_batch(batch)[source]#

This function contains the logic for a single inference step that processes a single batch of data, i.e. the contents of the inner loop of an ML inference process.

Parameters:

batch (tuple) – A tuple containing the inputs and labels for the current batch.

Returns:

Model outputs – Tensor containing the model outputs for the current batch.

Return type:

torch.Tensor

static prepare_inputs(data_dict) -> tuple[source]#

Extract image and label arrays from the batch dictionary.

This static method is the interface between the data pipeline and the model. Override it on the model class to reshape or select fields from the collated batch to match the inputs your model expects.

Hyrax will convert the returned arrays to PyTorch tensors and move them to the appropriate device automatically.

Parameters:

data_dict (dict) – The collated batch dictionary produced by the data pipeline. Expected to contain a "data" key with "image" and optionally "label" fields.

Returns:

inputs – A tuple of (image, label) as float32 and int64 arrays respectively.

Return type:

tuple of numpy.ndarray
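
A minimal sketch of a prepare_inputs that returns an (image, label) pair with the dtypes described above might look like the following. It is written as a free function for brevity, and returning `None` when the batch has no `"label"` field is an assumption; the documentation only states that the label is optional.

```python
import numpy as np


def prepare_inputs(data_dict):
    """Return (image, label); label is None when the batch has no labels."""
    data = data_dict["data"]
    image = np.asarray(data["image"], dtype=np.float32)
    label = data.get("label")
    if label is not None:
        label = np.asarray(label, dtype=np.int64)
    return image, label


labelled = {"data": {"image": np.zeros((4, 3, 8, 8)), "label": [0, 1, 1, 0]}}
unlabelled = {"data": {"image": np.zeros((4, 3, 8, 8))}}
image, label = prepare_inputs(labelled)
_, missing_label = prepare_inputs(unlabelled)
```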

class HyraxLoopback(config, data_sample=None)[source]#

Bases: torch.nn.Module

Simple model for testing that returns its own input.

Initialize internal Module state, shared by both nn.Module and ScriptModule.

unused_module#
config#
load#
forward(x)[source]#

We simply return our input

train_batch(batch)[source]#

Training is a no-op

validate_batch(batch)[source]#

Validation is just a forward pass

test_batch(batch)[source]#

Testing is just a forward pass

infer_batch(batch)[source]#

Inference is just a forward pass

hyrax_model(cls)[source]#

Decorator that registers a model with the model registry and adds common interface functions.

Returns:

The class with additional interface functions.

Return type:

type
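
The general technique of a registering class decorator can be sketched in a few lines. Everything in this example is hypothetical: the `MODEL_REGISTRY` dictionary, the `register_model` name, and the `describe` helper stand in for whatever registry and interface functions hyrax_model actually uses, which this page does not list.

```python
MODEL_REGISTRY = {}  # hypothetical registry: class name -> class


def register_model(cls):
    """Register cls by name and attach a shared helper method."""
    MODEL_REGISTRY[cls.__name__] = cls

    def describe(self):
        return f"{type(self).__name__} model"

    # Only add the helper if the class has not defined its own.
    if not hasattr(cls, "describe"):
        cls.describe = describe
    return cls


@register_model
class ToyModel:
    pass


description = ToyModel().describe()
```

Because the decorator returns the same class object it received, the decorated class can still be imported and subclassed as usual.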

class SimCLR(config, shape)[source]#

Bases: torch.nn.Module

SimCLR model. Implementation based on Chen et al. (2020).

Initialize internal Module state, shared by both nn.Module and ScriptModule.

config#
shape#
backbone#
projection_head#
criterion#
forward(x)[source]#
train_batch(x)[source]#
validate_batch(x)[source]#
test_batch(x)[source]#
infer_batch(x)[source]#

Function to run inference on a batch of data.

Parameters:

x (torch.Tensor) – Input tensor of shape (batch_size, channels, height, width).

Returns:

Output tensor of shape (batch_size, projection_dimension).

Return type:

torch.Tensor
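
For context on the criterion attribute, the SimCLR paper trains the projection head with the normalized temperature-scaled cross-entropy (NT-Xent) loss. The NumPy sketch below shows that loss for two batches of projections from two augmented views. It illustrates the published loss only; whether the criterion in this class matches it, and the temperature of 0.5, are assumptions.

```python
import numpy as np


def nt_xent_loss(z_a, z_b, temperature=0.5):
    """NT-Xent loss for two batches of projections, each (batch, dim)."""
    z = np.concatenate([z_a, z_b], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit-length rows
    sim = z @ z.T / temperature  # scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)  # exclude each sample's self-similarity
    n = z_a.shape[0]
    # Row i's positive is the other view of the same sample.
    positives = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    log_denominator = np.log(np.exp(sim).sum(axis=1))
    log_prob = sim[np.arange(2 * n), positives] - log_denominator
    return float(-log_prob.mean())


views = np.eye(4)
aligned = nt_xent_loss(views, views)  # each pair of views agrees
shuffled = nt_xent_loss(views, np.roll(views, 1, axis=0))  # pairs mismatched
```

The loss is lower when the two views of each sample map to nearby projections, which is the behaviour that contrastive training rewards.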