Hyperparameter Tuning with Hyrax

This notebook demonstrates hyperparameter tuning with Hyrax using the HyraxCNN model and the CIFAR-10 dataset.

The hyperparameters tuned are:

  • Learning rate (sampled uniformly from the range 0.01 to 0.1)

  • Learning rate scheduler:

    • ExponentialLR with gamma=1.0

    • ExponentialLR with gamma=0.9

    • CosineAnnealingLR with T_max=10

Models are evaluated using validation loss at the final epoch.
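
To make the three scheduler settings above concrete, here is a minimal sketch of the learning rate each one produces per epoch. It uses the standard closed-form expressions for exponential decay and cosine annealing rather than calling PyTorch, and assumes a minimum learning rate of 0 for the cosine schedule. The function names are illustrative and are not part of Hyrax or PyTorch.

```python
import math


def exponential_lr(base_lr: float, gamma: float, epoch: int) -> float:
    """Learning rate after `epoch` steps of exponential decay."""
    return base_lr * gamma**epoch


def cosine_annealing_lr(base_lr: float, t_max: int, epoch: int, eta_min: float = 0.0) -> float:
    """Closed-form cosine annealing from base_lr to eta_min over t_max epochs."""
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max)) / 2


base_lr = 0.05  # any value in the tuned range 0.01 to 0.1
for epoch in (0, 5, 10):
    print(
        epoch,
        exponential_lr(base_lr, 1.0, epoch),  # gamma=1.0 keeps the rate constant
        exponential_lr(base_lr, 0.9, epoch),  # gamma=0.9 shrinks it by 10% per epoch
        cosine_annealing_lr(base_lr, 10, epoch),  # reaches eta_min at epoch T_max
    )
```

Note that gamma=1.0 leaves the learning rate unchanged, so that option acts as a constant-rate baseline.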

This notebook demonstrates two popular hyperparameter tuning libraries:

  • Optuna

  • Hyperopt

Set up common configurations

We’ll use the same model, data request, and optimizer for all examples.

[1]:
model_name = "HyraxCNN"

data_request = {
    "train": {
        "data": {
            "dataset_class": "HyraxCifarDataset",
            "data_location": "./data",
            "fields": ["image", "label"],
            "primary_id_field": "object_id",
            "split_fraction": 0.8,
        },
    },
    "validate": {
        "data": {
            "dataset_class": "HyraxCifarDataset",
            "data_location": "./data",
            "fields": ["image", "label"],
            "primary_id_field": "object_id",
            "split_fraction": 0.2,
        },
    },
}

optimizer = "torch.optim.Adam"
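
The "train" and "validate" entries differ only in `split_fraction`, so the same request can be built from a small helper. This is an optional sketch; `make_split` is a hypothetical helper written for this notebook, not a Hyrax function, and it produces the same dictionary as the cell above.

```python
def make_split(split_fraction: float) -> dict:
    """Build one data_request entry; only split_fraction varies between splits."""
    return {
        "data": {
            "dataset_class": "HyraxCifarDataset",
            "data_location": "./data",
            "fields": ["image", "label"],
            "primary_id_field": "object_id",
            "split_fraction": split_fraction,
        },
    }


# Same structure as the literal dictionary defined above.
data_request = {
    "train": make_split(0.8),
    "validate": make_split(0.2),
}
```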

Optuna example

Optuna documentation: https://optuna.org/

[ ]:
# %pip install optuna
[ ]:
import optuna
from hyrax import Hyrax

h = Hyrax()
h.set_config("model.name", model_name)
h.set_config("data_request", data_request)
h.set_config("optimizer.name", optimizer)


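def objective(trial):
    # Sample a learning rate uniformly from [0.01, 0.1].
    lr = trial.suggest_float("learning_rate", 0.01, 0.1)
    # Sample one (scheduler class path, scheduler kwargs) pair.
    lr_scheduler = trial.suggest_categorical(
        "lr_scheduler",
        [
            ("torch.optim.lr_scheduler.ExponentialLR", {"gamma": 1.0}),
            ("torch.optim.lr_scheduler.ExponentialLR", {"gamma": 0.9}),
            ("torch.optim.lr_scheduler.CosineAnnealingLR", {"T_max": 10}),
        ],
    )

    # The class paths contain dots, so they are wrapped in quotes inside the config key.
    h.set_config(f"'{optimizer}'.lr", lr)
    h.set_config("scheduler.name", lr_scheduler[0])
    h.set_config(f"'{lr_scheduler[0]}'", lr_scheduler[1])

    model = h.train()

    # The value returned to Optuna is the validation loss at the final epoch.
    return model.final_validation_metrics["loss"]


# create_study() minimizes the objective by default.
study = optuna.create_study()
study.optimize(objective, n_trials=6)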
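
Optuna documents categorical choices as None, bool, int, float, or str, and it emits a warning when other types such as tuples are passed, because they cannot be saved to persistent storage. The study above still runs with the default in-memory storage. One possible workaround, sketched here under that assumption, is to sample a short string key and look the scheduler up in a table. `SCHEDULERS` and `suggest_scheduler` are hypothetical names introduced for illustration.

```python
# Hypothetical lookup table: short string key -> (scheduler class path, kwargs).
SCHEDULERS = {
    "exponential_gamma_1.0": ("torch.optim.lr_scheduler.ExponentialLR", {"gamma": 1.0}),
    "exponential_gamma_0.9": ("torch.optim.lr_scheduler.ExponentialLR", {"gamma": 0.9}),
    "cosine_t_max_10": ("torch.optim.lr_scheduler.CosineAnnealingLR", {"T_max": 10}),
}


def suggest_scheduler(trial):
    """Sample a scheduler key from the trial and return (class path, kwargs)."""
    key = trial.suggest_categorical("lr_scheduler", list(SCHEDULERS))
    name, kwargs = SCHEDULERS[key]
    return name, dict(kwargs)  # copy so callers cannot mutate the table
```

Inside `objective`, the tuple-valued `suggest_categorical` call could then be replaced with `name, kwargs = suggest_scheduler(trial)`. With this change `study.best_params` would report the string key rather than the tuple.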
[8]:
print(f"Best trial: {study.best_trial.number}")
print(f"Best parameters: {study.best_params}")
print(f"Best value: {study.best_value}")

print("All trials:")
for trial in study.trials:
    print(f" - Trial {trial.number}: Value: {trial.value}, Params: {trial.params}")
Best trial: 3
Best parameters: {'learning_rate': 0.023684100732004, 'lr_scheduler': ('torch.optim.lr_scheduler.ExponentialLR', {'gamma': 0.9})}
Best value: 1.5635895729064941
All trials:
 - Trial 0: Value: 2.302615165710449, Params: {'learning_rate': 0.03147371061606545, 'lr_scheduler': ('torch.optim.lr_scheduler.CosineAnnealingLR', {'T_max': 10})}
 - Trial 1: Value: 2.3012354373931885, Params: {'learning_rate': 0.04234299623942146, 'lr_scheduler': ('torch.optim.lr_scheduler.ExponentialLR', {'gamma': 0.9})}
 - Trial 2: Value: 2.306378126144409, Params: {'learning_rate': 0.03353047133712538, 'lr_scheduler': ('torch.optim.lr_scheduler.ExponentialLR', {'gamma': 1.0})}
 - Trial 3: Value: 1.5635895729064941, Params: {'learning_rate': 0.023684100732004, 'lr_scheduler': ('torch.optim.lr_scheduler.ExponentialLR', {'gamma': 0.9})}
 - Trial 4: Value: 2.302015542984009, Params: {'learning_rate': 0.0502613280276386, 'lr_scheduler': ('torch.optim.lr_scheduler.CosineAnnealingLR', {'T_max': 10})}
 - Trial 5: Value: 2.303622245788574, Params: {'learning_rate': 0.0976624237858605, 'lr_scheduler': ('torch.optim.lr_scheduler.CosineAnnealingLR', {'T_max': 10})}
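
Most of the trials above end with a loss close to 2.30. If the reported loss is a cross-entropy using the natural logarithm (an assumption; the notebook does not state the loss function), a model that predicts all 10 CIFAR-10 classes with equal probability scores ln(10), so those trials have likely not learned anything useful. The short check below compares the values printed above against that reference; the 0.01 tolerance is an arbitrary choice.

```python
import math

NUM_CLASSES = 10  # CIFAR-10
chance_loss = math.log(NUM_CLASSES)  # cross-entropy of a uniform prediction

# Final validation losses copied from the Optuna output above.
trial_losses = {
    0: 2.302615165710449,
    1: 2.3012354373931885,
    2: 2.306378126144409,
    3: 1.5635895729064941,
    4: 2.302015542984009,
    5: 2.303622245788574,
}

# Trials whose loss is within the tolerance of the chance-level reference.
stalled = [n for n, loss in trial_losses.items() if abs(loss - chance_loss) < 0.01]
print(f"chance-level loss: {chance_loss:.4f}")
print(f"trials near chance level: {stalled}")
```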

Loss curves for the Optuna trials:

[Figure: optuna_hparam_tuning_loss]

Hyperopt example

Hyperopt documentation: https://hyperopt.github.io/hyperopt/

[5]:
# %pip install hyperopt
[ ]:
from hyperopt import fmin, tpe, hp, STATUS_OK, Trials, space_eval
from hyrax import Hyrax

h = Hyrax()
h.set_config("model.name", model_name)
h.set_config("data_request", data_request)
h.set_config("optimizer.name", optimizer)


def objective(config):
    # config holds one point sampled from search_space.
    h.set_config(f"'{optimizer}'.lr", config["learning_rate"])
    h.set_config("scheduler.name", config["lr_scheduler"][0])
    h.set_config(f"'{config['lr_scheduler'][0]}'", config["lr_scheduler"][1])

    model = h.train()
    loss = model.final_validation_metrics["loss"]

    # Hyperopt minimizes the "loss" entry of the returned dictionary.
    return {"loss": loss, "status": STATUS_OK}


search_space = {
    "learning_rate": hp.uniform("learning_rate", 0.01, 0.1),
    "lr_scheduler": hp.choice(
        "lr_scheduler",
        [
            ("torch.optim.lr_scheduler.ExponentialLR", {"gamma": 1.0}),
            ("torch.optim.lr_scheduler.ExponentialLR", {"gamma": 0.9}),
            ("torch.optim.lr_scheduler.CosineAnnealingLR", {"T_max": 10}),
        ],
    ),
}

trials = Trials()
best = fmin(objective, search_space, algo=tpe.suggest, max_evals=6, trials=trials)

# fmin reports hp.choice parameters as list indices; space_eval maps them back to values.
best_params = space_eval(search_space, best)
[17]:
print(f"Best parameters: {best_params}")
print(f"Best loss: {min(t['result']['loss'] for t in trials.trials)}")
print("All trials:")
for trial in trials.trials:
    print(f" - Trial {trial['tid']}: Loss: {trial['result']['loss']}, Params: {trial['misc']['vals']}")
Best parameters: {'learning_rate': 0.010009959405368939, 'lr_scheduler': ('torch.optim.lr_scheduler.CosineAnnealingLR', {'T_max': 10})}
Best loss: 1.2757296562194824
All trials:
 - Trial 0: Loss: 2.299847364425659, Params: {'learning_rate': [np.float64(0.08189826955133431)], 'lr_scheduler': [np.int64(1)]}
 - Trial 1: Loss: 1.3674860000610352, Params: {'learning_rate': [np.float64(0.013265866062694177)], 'lr_scheduler': [np.int64(1)]}
 - Trial 2: Loss: 2.300433397293091, Params: {'learning_rate': [np.float64(0.08483851117077196)], 'lr_scheduler': [np.int64(0)]}
 - Trial 3: Loss: 1.2757296562194824, Params: {'learning_rate': [np.float64(0.010009959405368939)], 'lr_scheduler': [np.int64(2)]}
 - Trial 4: Loss: 2.310126781463623, Params: {'learning_rate': [np.float64(0.05718402283604428)], 'lr_scheduler': [np.int64(0)]}
 - Trial 5: Loss: 2.3025949001312256, Params: {'learning_rate': [np.float64(0.08652152480094646)], 'lr_scheduler': [np.int64(2)]}
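
In the per-trial output above, Hyperopt reports each parameter as a one-element list and gives the scheduler as an index into the `hp.choice` options instead of the option itself. The sketch below decodes one such record by hand. `SCHEDULER_CHOICES` repeats the option list from `search_space`, and `decode_vals` is a hypothetical helper, not a Hyperopt function.

```python
# Same options, in the same order, as the hp.choice list in search_space.
SCHEDULER_CHOICES = [
    ("torch.optim.lr_scheduler.ExponentialLR", {"gamma": 1.0}),
    ("torch.optim.lr_scheduler.ExponentialLR", {"gamma": 0.9}),
    ("torch.optim.lr_scheduler.CosineAnnealingLR", {"T_max": 10}),
]


def decode_vals(vals: dict) -> dict:
    """Turn a trial's misc['vals'] mapping into readable parameter values."""
    # Each entry is a one-element list; unwrap it to a plain Python number.
    learning_rate = float(vals["learning_rate"][0])
    choice_index = int(vals["lr_scheduler"][0])
    return {
        "learning_rate": learning_rate,
        "lr_scheduler": SCHEDULER_CHOICES[choice_index],
    }


# Values taken from trial 1 in the output above.
print(decode_vals({"learning_rate": [0.013265866062694177], "lr_scheduler": [1]}))
```

Within the notebook, passing the unwrapped values to `space_eval(search_space, ...)`, as is already done for `best`, should give the same mapping without a hand-written table.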

Loss curves for the Hyperopt trials:

[Figure: hyperopt_hparam_tuning_loss]