The Hyrax Configuration System

hyrax makes extensive use of config variables to manage the runtime environment of training and inference runs. A hyrax_default_config.toml file (full contents listed here) is included with hyrax and contains every variable that hyrax could need to operate. To create a custom configuration file, create a .toml file and override the variables you want to change; if you’re running with a custom dataset or model, you can also add your own variables. For a practical walkthrough that starts from a minimal override file, see Configuration.

Config variables are inherited from a hierarchy of sources, similar to Python classes. First, hyrax reads the variables set in the default configuration. Next, it loads the default config of any custom hyrax packages in use (see External package setup for how to set up package-level defaults). It determines which packages to include by checking which custom classes are loaded initially and looking for their default configs. If a package doesn’t have a default config, hyrax emits a warning. Finally, it applies the variables declared in the user-defined config TOML (see the config basics notebook for how to load those through a notebook or script).

Config variables at each step can overwrite config variables from previous steps, which leads to the following priority, from highest to lowest:

- Variables from the user-defined config TOML are used.
- Default configs from custom hyrax packages are used for variables the user has not defined.
- The base default config is used for variables that the user has not defined and that don’t exist in any package.
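The layering described above can be pictured as a recursive dictionary merge. The following is a minimal sketch, not hyrax's actual implementation: the function name `merge_configs` and every table and key in the example data are hypothetical, chosen only to show how later layers override earlier ones.

```python
from copy import deepcopy


def merge_configs(base: dict, override: dict) -> dict:
    """Return a new dict in which values from `override` win over `base`.

    Nested tables (dicts) are merged key by key; any other value is replaced.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_configs(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical config contents, listed from lowest to highest priority.
hyrax_defaults = {"train": {"epochs": 10, "batch_size": 32}}
package_defaults = {"train": {"batch_size": 64}, "my_package": {"crop_size": 128}}
user_config = {"train": {"epochs": 3}}

final_config = hyrax_defaults
for layer in (package_defaults, user_config):
    final_config = merge_configs(final_config, layer)

print(final_config)
```

In this sketch the user's `epochs` value wins, the package's `batch_size` overrides the base default, and the package-only `crop_size` is carried through unchanged.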

The inheritance hierarchy of the hyrax configuration system.

hyrax passes all the configuration variables along to the relevant model and dataset classes, so they can configure the runtime through one system. This allows for extensibility and cross-compatibility within the broader “hyrax ecosystem”. From the point of view of the code, these configuration variables should be static. This makes it easier for researchers to develop code separately from the runtime environment.

A core design principle of hyrax is “code by config”, meaning that all runtime parameters should be set through configuration files rather than hard-coded values. This approach enhances flexibility, reproducibility, and ease of experimentation, as users can modify configurations without altering the underlying codebase. This also facilitates sharing and collaboration, as configurations can be easily shared and adapted for different use cases while keeping fundamental models and datasets consistent.

After running any hyrax command, a runtime_config.toml file will be written to the timestamped results directory, which contains the final configuration used for that run. This file is a combination of all the various source configs (hyrax default, package defaults, and user config) and can be used to see what variables were actually used in one place.
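One way to use the merged file is to check which values ended up differing from the base defaults. The sketch below assumes both configs have already been parsed into nested dictionaries (for example with the standard-library `tomllib` module in Python 3.11+); the helper names `flatten` and `changed_from_defaults` and the example keys are hypothetical and are not part of hyrax.

```python
def flatten(config: dict, prefix: str = "") -> dict:
    """Flatten nested tables into {"table.key": value} pairs."""
    flat = {}
    for key, value in config.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


def changed_from_defaults(runtime: dict, defaults: dict) -> dict:
    """Return the runtime values that are new or differ from the defaults."""
    flat_defaults = flatten(defaults)
    missing = object()  # sentinel so that a default of None is still compared
    return {
        key: value
        for key, value in flatten(runtime).items()
        if flat_defaults.get(key, missing) != value
    }


# Hypothetical parsed contents of the default config and a runtime_config.toml.
defaults = {"train": {"epochs": 10, "batch_size": 32}}
runtime = {"train": {"epochs": 3, "batch_size": 32}, "my_package": {"crop_size": 128}}

overrides = changed_from_defaults(runtime, defaults)
print(overrides)
```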

Typed configuration schemas

Hyrax uses Pydantic internally to validate the [data_request] configuration table, which describes datasets for training, validation, and inference (see the data requests notebook for a hands-on walkthrough). This validation helps catch configuration errors early by ensuring that required fields like primary_id_field are present and properly structured. The exact field-level expectations for datasets are documented in the Dataset class reference.

The validation happens automatically when you load a TOML configuration or use set_config(). If there are validation errors, Hyrax will log a warning but continue to use the configuration as-is for backward compatibility.
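The "warn but continue" behavior can be illustrated with a small standalone sketch. It deliberately uses hand-written checks in place of Pydantic so that it runs with the standard library alone, and it assumes a hypothetical layout in which [data_request] contains one sub-table per dataset group, each holding one table per dataset. None of the function names below come from hyrax.

```python
import logging

logger = logging.getLogger("config_validation_sketch")


def find_data_request_problems(config: dict) -> list[str]:
    """Collect every problem found instead of raising on the first one."""
    data_request = config.get("data_request")
    if not isinstance(data_request, dict):
        return ["[data_request] is missing or is not a table"]

    problems = []
    # Assumed layout: data_request.<group>.<dataset> (hypothetical).
    for group_name, group in data_request.items():
        if not isinstance(group, dict):
            problems.append(f"data_request.{group_name} is not a table")
            continue
        for dataset_name, dataset in group.items():
            path = f"data_request.{group_name}.{dataset_name}"
            if not isinstance(dataset, dict):
                problems.append(f"{path} is not a table")
            elif "primary_id_field" not in dataset:
                problems.append(f"{path} is missing primary_id_field")
    return problems


def validate_but_continue(config: dict) -> dict:
    """Log a warning per problem, then return the config unchanged."""
    for problem in find_data_request_problems(config):
        logger.warning("data_request validation: %s", problem)
    return config


good = {"data_request": {"train": {"data": {"primary_id_field": "object_id"}}}}
bad = {"data_request": {"train": {"data": {}}}}

print(find_data_request_problems(good))
print(find_data_request_problems(bad))
```

Collecting problems into a list rather than raising mirrors the behavior described above: the caller sees every issue in the log, and the configuration is still handed back as-is.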

Backward compatibility for legacy table names is provided by the schema versioning system (see Schema versioning below). For example, the legacy [model_inputs] table is automatically renamed to [data_request] on load and a DeprecationWarning is emitted.
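A rename of this kind can be sketched in a few lines. The mapping and function name below are hypothetical illustrations rather than hyrax's own code, and the handling of a config that contains both the old and the new table is an assumption made for this example.

```python
import warnings

# Hypothetical mapping of legacy table names to their current names.
LEGACY_TABLE_NAMES = {"model_inputs": "data_request"}


def rename_legacy_tables(config: dict) -> dict:
    """Return a copy of `config` with legacy top-level tables renamed."""
    renamed = dict(config)
    for old_name, new_name in LEGACY_TABLE_NAMES.items():
        if old_name in renamed:
            warnings.warn(
                f"[{old_name}] is deprecated; use [{new_name}] instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            legacy_table = renamed.pop(old_name)
            # Assumption: if both tables exist, the new name takes priority.
            renamed.setdefault(new_name, legacy_table)
    return renamed


legacy_config = {"model_inputs": {"train": {}}}
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    migrated = rename_legacy_tables(legacy_config)

print(migrated)
print([str(w.message) for w in caught])
```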

Schema versioning

When Hyrax loads a user config, it checks the config_version. If the version is lower than the current schema version, the document is passed through the migrations registered in src/hyrax/config_migrations.py before being merged against the defaults. If config_version is missing, it is assumed to be the current version and no schema migrations are applied.

Configs that declare a config_version higher than the installed Hyrax understands are refused with a RuntimeError whose message points to pip install -U hyrax.
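The three cases (older version, missing version, newer version) can be sketched as a small migration runner. This is an illustration under stated assumptions, not the contents of src/hyrax/config_migrations.py: the version numbers, the `MIGRATIONS` registry, the example migration, and the top-level placement of config_version are all hypothetical.

```python
from copy import deepcopy

CURRENT_CONFIG_VERSION = 2  # hypothetical current schema version


def _v1_to_v2(doc: dict) -> dict:
    """Hypothetical migration: rename a legacy table."""
    if "model_inputs" in doc:
        doc["data_request"] = doc.pop("model_inputs")
    return doc


# Hypothetical registry, keyed by the version each migration upgrades from.
MIGRATIONS = {1: _v1_to_v2}


def migrate(doc: dict) -> dict:
    """Bring a user config up to the current schema version."""
    # A missing version is treated as current, so no migrations run.
    version = doc.get("config_version", CURRENT_CONFIG_VERSION)
    if version > CURRENT_CONFIG_VERSION:
        raise RuntimeError(
            f"config_version {version} is newer than this installation "
            f"supports ({CURRENT_CONFIG_VERSION}); try: pip install -U hyrax"
        )
    doc = deepcopy(doc)
    while version < CURRENT_CONFIG_VERSION:
        doc = MIGRATIONS[version](doc)
        version += 1
        doc["config_version"] = version
    return doc


old_doc = {"config_version": 1, "model_inputs": {"train": {}}}
print(migrate(old_doc))
```

Running the migrations one version at a time keeps each step small and lets a config several versions behind be upgraded by chaining the registered steps in order.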