The ``Hyrax`` Configuration System
==================================

``hyrax`` makes extensive use of configuration variables to manage the runtime environment of training and inference runs. A ``hyrax_default_config.toml`` file (full contents listed :ref:`here<complete_default_config>`) ships with ``hyrax`` and contains every variable that ``hyrax`` could need to operate. To create a custom configuration file, create a ``.toml`` file and override the variables you want to change. If you’re running with a custom dataset or model, you can also add your own variables.
For a practical walkthrough that starts from a minimal override file, see :doc:`configuration`.

Config variables are inherited from a hierarchy of sources, similar to Python class inheritance. First, ``hyrax`` reads the variables set in the default configuration. Next, it loads the default config of any custom ``hyrax`` packages in use (see :doc:`/external_library_package` for how to set up package-level defaults). It determines which packages to include by checking which custom classes are loaded initially and looking for their default configs. If a package doesn’t have a default config, ``hyrax`` emits a warning. Finally, it applies the variables declared in the user-defined config TOML (see the :doc:`config basics notebook </notebooks/config_basics>` for how to load those through a notebook or script). Config variables at each step can overwrite those from previous steps, which leads to the following priority, from highest to lowest:

- Variables from the user-defined config TOML are used first.
- Default configs from custom ``hyrax`` packages are used for variables the user has not defined.
- The base default config is used for variables that the user has not defined and that don't exist in any package default.
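
The layering described above can be sketched with plain dictionaries. This is a minimal illustration, not the merge code ``hyrax`` actually uses: the ``merge_configs`` helper and every table and key name below are hypothetical.

.. code-block:: python

   from copy import deepcopy


   def merge_configs(base: dict, override: dict) -> dict:
       """Return a new dict in which values from ``override`` win over ``base``.

       Nested tables (dicts) are merged key by key; any other value is replaced.
       """
       merged = deepcopy(base)
       for key, value in override.items():
           if isinstance(value, dict) and isinstance(merged.get(key), dict):
               merged[key] = merge_configs(merged[key], value)
           else:
               merged[key] = deepcopy(value)
       return merged


   # Hypothetical layers, listed from lowest to highest priority.
   hyrax_defaults = {
       "train": {"epochs": 10, "batch_size": 32},
       "general": {"log_level": "info"},
   }
   package_defaults = {"train": {"batch_size": 64}, "my_model": {"latent_dim": 8}}
   user_config = {"train": {"epochs": 3}}

   final_config = hyrax_defaults
   for layer in (package_defaults, user_config):
       final_config = merge_configs(final_config, layer)

In this sketch the user value for ``epochs`` wins, the package value for ``batch_size`` wins over the base default, and ``log_level`` falls through from the base default unchanged.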

.. figure:: _static/hyrax_config_system.png
   :width: 100%
   :alt: The inheritance hierarchy of the hyrax configuration system.

``hyrax`` passes all configuration variables to the relevant model and dataset classes, allowing them to configure the runtime through one system. This allows for extensibility and cross-compatibility within the broader “hyrax ecosystem”. From the point of view of the code, these configuration variables should be static. This makes it easier for researchers to develop code separately from the runtime environment.

A core design principle of ``hyrax`` is "code by config", meaning that all runtime parameters should be set through configuration files rather than hard-coded values. This approach enhances flexibility, reproducibility, and ease of experimentation, as users can modify configurations without altering the underlying codebase. This also facilitates sharing and collaboration, as configurations can be easily shared and adapted for different use cases while keeping fundamental models and datasets consistent.

After running any ``hyrax`` command, a ``runtime_config.toml`` file will be written to the timestamped results directory, which contains the final configuration used for that run.
This file combines all of the source configs (``hyrax`` default, package defaults, and user config) and shows in one place which variables were actually used.
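
One way to inspect that file from a script is to parse it with the standard-library ``tomllib`` module (Python 3.11 or newer). This is a minimal sketch: ``load_runtime_config`` is a hypothetical helper, not part of ``hyrax``, and the results directory path must be supplied by the caller.

.. code-block:: python

   import tomllib
   from pathlib import Path


   def load_runtime_config(results_dir: str | Path) -> dict:
       """Parse the ``runtime_config.toml`` stored in a results directory."""
       config_path = Path(results_dir) / "runtime_config.toml"
       with config_path.open("rb") as handle:  # tomllib requires binary mode
           return tomllib.load(handle)

The returned dictionary mirrors the TOML tables, so individual values can be looked up by table and key.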

Typed configuration schemas
---------------------------

Hyrax uses Pydantic internally to validate the ``[data_request]`` configuration table,
which describes datasets for training, validation, and inference (see the
:doc:`data requests notebook </notebooks/data_requests>` for a hands-on walkthrough).
This validation helps catch configuration errors early by ensuring required fields
like ``primary_id_field`` are present and properly structured. The exact field-level
expectations for datasets are documented in :doc:`dataset_class_reference`.

The validation happens automatically when you load a TOML configuration or use
``set_config()``. If there are validation errors, Hyrax will log a warning but continue
to use the configuration as-is for backward compatibility.
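
The "warn but continue" behavior can be sketched with Pydantic as follows. The ``DatasetRequest`` model is a hypothetical, heavily simplified stand-in for the real schema, and ``validate_leniently`` is an illustrative helper rather than a Hyrax function; only the ``primary_id_field`` name comes from the text above.

.. code-block:: python

   import logging

   from pydantic import BaseModel, ValidationError

   logger = logging.getLogger(__name__)


   class DatasetRequest(BaseModel):
       """Hypothetical, simplified schema for one dataset entry."""

       dataset_class: str  # assumed field name, for illustration only
       primary_id_field: str


   def validate_leniently(entry: dict) -> dict:
       """Validate ``entry``; on failure, log a warning and return it unchanged."""
       try:
           DatasetRequest(**entry)
       except ValidationError as err:
           logger.warning("data_request failed validation: %s", err)
       return entry

Calling ``validate_leniently({"dataset_class": "MyDataset"})`` logs a warning about the missing ``primary_id_field`` and still returns the original dictionary.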

Backward compatibility for legacy table names is provided by the schema
versioning system (see :ref:`config-schema-versioning` below). For example,
the legacy ``[model_inputs]`` table is automatically renamed to
``[data_request]`` on load and a ``DeprecationWarning`` is emitted.
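
A rename of this kind could look roughly like the sketch below. ``rename_legacy_tables`` is a hypothetical helper, and the rule for the case where both tables are present is an assumption, not documented Hyrax behavior.

.. code-block:: python

   import warnings


   def rename_legacy_tables(config: dict) -> dict:
       """Return a copy of ``config`` with ``model_inputs`` renamed to ``data_request``."""
       updated = dict(config)
       if "model_inputs" in updated:
           warnings.warn(
               "[model_inputs] is deprecated; use [data_request] instead.",
               DeprecationWarning,
               stacklevel=2,
           )
           legacy = updated.pop("model_inputs")
           # Assumption: an explicit [data_request] table wins if both exist.
           updated.setdefault("data_request", legacy)
       return updated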

.. _config-schema-versioning:

Schema versioning
-----------------

When Hyrax loads a user config, it checks the ``config_version``.
If the version is lower than the current schema version, the
document is passed through the migrations registered in
``src/hyrax/config_migrations.py`` before being merged with the defaults.
If ``config_version`` is missing, it is assumed to be the current version and no
schema migrations are applied.

Configs declaring a ``config_version`` higher than the installed version of
Hyrax understands are rejected with a ``RuntimeError`` that points to
``pip install -U hyrax``.
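
The version handling described in this section can be sketched as a small migration registry. Everything here is illustrative: ``CURRENT_VERSION``, the ``MIGRATIONS`` registry, the ``migration`` decorator, and the ``old_table`` to ``new_table`` rename are hypothetical and do not reflect the contents of ``src/hyrax/config_migrations.py``.

.. code-block:: python

   from typing import Callable

   CURRENT_VERSION = 2  # hypothetical current schema version

   # Hypothetical registry: maps a version to the function that upgrades a
   # config from that version to the next one.
   MIGRATIONS: dict[int, Callable[[dict], dict]] = {}


   def migration(from_version: int):
       """Register a function that upgrades a config from ``from_version``."""

       def register(func):
           MIGRATIONS[from_version] = func
           return func

       return register


   @migration(from_version=1)
   def rename_old_table(config: dict) -> dict:
       """Example step: rename a made-up ``old_table`` to ``new_table``."""
       config = dict(config)
       if "old_table" in config:
           config["new_table"] = config.pop("old_table")
       return config


   def migrate(config: dict) -> dict:
       """Bring ``config`` up to ``CURRENT_VERSION`` or raise if it is too new."""
       # A missing config_version is treated as the current version.
       version = config.get("config_version", CURRENT_VERSION)
       if version > CURRENT_VERSION:
           raise RuntimeError(
               f"config_version {version} is newer than this Hyrax supports "
               f"({CURRENT_VERSION}); try `pip install -U hyrax`."
           )
       migrated = dict(config)
       while version < CURRENT_VERSION:
           migrated = MIGRATIONS[version](migrated)
           version += 1
           migrated["config_version"] = version
       return migrated

Registering one function per version step keeps each migration small and lets an older config be upgraded through several schema versions in sequence.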

