hyrax.config_utils
==================

.. py:module:: hyrax.config_utils


Attributes
----------

.. autoapisummary::

   hyrax.config_utils.DEFAULT_CONFIG_FILEPATH
   hyrax.config_utils.DEFAULT_USER_CONFIG_FILEPATH
   hyrax.config_utils.KEYS_WITH_EXTERNAL_LIBS
   hyrax.config_utils.KNOWN_LIBS_WITHOUT_DEFAULT_CONFIGS
   hyrax.config_utils.logger


Classes
-------

.. autoapisummary::

   hyrax.config_utils.ConfigManager


Functions
---------

.. autoapisummary::

   hyrax.config_utils.config_help
   hyrax.config_utils.parse_dotted_key
   hyrax.config_utils.find_keys
   hyrax.config_utils.create_results_dir
   hyrax.config_utils.find_most_recent_results_dir
   hyrax.config_utils.resolve_results_dir
   hyrax.config_utils.log_runtime_config


Module Contents
---------------

.. py:data:: DEFAULT_CONFIG_FILEPATH

.. py:data:: DEFAULT_USER_CONFIG_FILEPATH

.. py:data:: KEYS_WITH_EXTERNAL_LIBS
   :value: ['name', 'dataset_class']


.. py:data:: KNOWN_LIBS_WITHOUT_DEFAULT_CONFIGS
   :value: ['torch', 'umap']


.. py:data:: logger

.. py:function:: config_help(config: tomlkit.toml_document.TOMLDocument, *args)

   A simple config help function. Extracting a single item from a tomlkit table
   together with the comments that precede it is difficult, so for now this
   function generally prints the entire table for the given key.

   Supported cases:

   - If no ``args`` are given, print the whole config.
   - If ``args[0]`` is a table name, print the whole table.
   - If ``args[0]`` is not a table name, treat it as a key, search for it, and
     print each table in which it is found.

   :param config: A configuration dictionary that will be used to search for specified tables
                  and keys.
   :type config: TOMLDocument
   :param args: A variable number of string arguments that specify the table name or key
                to search for in the configuration dictionary.
   :type args: str


.. py:function:: parse_dotted_key(key: str) -> list[str]

   Parse a dotted key string, respecting quoted sections.

   Quoted sections (using single or double quotes) are treated as a single key
   component, even if they contain dots. This allows for keys like 'torch.optim.Adam'
   to be used as a single table name in TOML configuration files.

   :param key: The dotted key to parse. Examples:

               - ``"model.name"`` -> ``['model', 'name']``
               - ``"'torch.optim.Adam'.lr"`` -> ``['torch.optim.Adam', 'lr']``
               - ``'"torch.optim.Adam".lr'`` -> ``['torch.optim.Adam', 'lr']``
               - ``"optimizer.'torch.optim.Adam'.lr"`` -> ``['optimizer', 'torch.optim.Adam', 'lr']``
   :type key: str

   :returns: A list of key components
   :rtype: list[str]
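
   The quoting rule described above can be illustrated with a small standalone sketch.
   It is not the Hyrax implementation: ``parse_dotted_key_sketch`` is a hypothetical
   name, and the handling of malformed input (for example an unclosed quote) is an
   assumption.

   .. code-block:: python

      def parse_dotted_key_sketch(key: str) -> list[str]:
          """Split ``key`` on dots that fall outside single or double quotes."""
          parts: list[str] = []
          current: list[str] = []
          quote = None  # the quote character we are currently inside, if any

          for char in key:
              if quote is not None:
                  if char == quote:
                      quote = None  # closing quote: leave quoted mode
                  else:
                      current.append(char)  # dots inside quotes are kept verbatim
              elif char in ("'", '"'):
                  quote = char  # opening quote: enter quoted mode
              elif char == ".":
                  parts.append("".join(current))  # unquoted dot ends a component
                  current = []
              else:
                  current.append(char)

          parts.append("".join(current))
          return parts


      print(parse_dotted_key_sketch("optimizer.'torch.optim.Adam'.lr"))
      # ['optimizer', 'torch.optim.Adam', 'lr']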


.. py:function:: find_keys(config: dict[str, Any], key_name: str)

   Recursively find all keys in a nested dictionary that match the given key name.

   :param config: The nested dictionary to search.
   :type config: dict
   :param key_name: The name of the key to find.
   :type key_name: str

   :returns: A list of matching keys.
   :rtype: list
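
   A recursive search of this kind might look like the sketch below. The shape of the
   returned entries is not specified here, so the sketch assumes each match is reported
   as the full path of keys leading to it; ``find_keys_sketch`` is a hypothetical name.

   .. code-block:: python

      from typing import Any


      def find_keys_sketch(config: dict[str, Any], key_name: str) -> list[list[str]]:
          """Return the key path of every occurrence of ``key_name`` in ``config``."""
          matches: list[list[str]] = []

          def _walk(node: dict[str, Any], path: list[str]) -> None:
              for key, value in node.items():
                  if key == key_name:
                      matches.append(path + [key])
                  if isinstance(value, dict):
                      _walk(value, path + [key])  # descend into nested tables

          _walk(config, [])
          return matches


      example = {"model": {"name": "a"}, "data_set": {"name": "b", "nested": {"name": "c"}}}
      print(find_keys_sketch(example, "name"))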


.. py:class:: ConfigManager(runtime_config_filepath: Union[pathlib.Path, str] | None = None, default_config_filepath: Union[pathlib.Path, str] = DEFAULT_CONFIG_FILEPATH)

   A class to manage the runtime configuration for a Hyrax object. This class
   contains the logic and methods for reading, merging, and validating
   the runtime configuration.


   .. py:attribute:: _called_from_test
      :value: False


   .. py:attribute:: PATH_CONFIG_KEYS
      :value: [['data_set', 'filter_catalog'], ['general', 'data_dir']]


      Hardcoded list of config keys which we know to contain paths, and which we resolve to
      global paths during initialization in ``ConfigManager._resolve_config_paths()``.


   .. py:attribute:: PYDANTIC_VALIDATED_KEYS
      :value: ('data_request',)



   .. py:attribute:: hyrax_default_config
      :type:  tomlkit.toml_document.TOMLDocument


   .. py:attribute:: runtime_config_filepath
      :value: None



   .. py:attribute:: user_specific_config


   .. py:attribute:: config


   .. py:attribute:: original_config


   .. py:method:: _render_config(user_specific_config: tomlkit.toml_document.TOMLDocument = None, hyrax_default_config: tomlkit.toml_document.TOMLDocument = None, over_write: bool = False)
      :staticmethod:



   .. py:method:: _set_config(key: str, value: Any, over_write: bool = False)

      Set a config value at runtime. This modifies the in-memory config object.
      Once the configuration is updated, the entire config is re-rendered to
      ensure that any requested external library default configs are incorporated.

      :param key: The dotted key to set, e.g. "model.name" or "'torch.optim.Adam'.lr"
                  Quoted sections (using single or double quotes) are treated as single
                  key components, allowing for table names like 'torch.optim.Adam'.
      :type key: str
      :param value: The value to set the key to.
      :type value: Any
      :param over_write: Whether to allow overwriting existing keys in the config.
                         If True, this method will overwrite the highest level existing values in the config.
                         If False, this method will merge the new setting into the existing ones.
                         By default False.
      :type over_write: bool, optional



   .. py:method:: _validate_data_request(value: Any) -> dict
      :staticmethod:


      Validate and normalize data_request configuration into a plain dictionary.

      This method ensures that the ``data_request`` configuration (which defines
      datasets for training, validation, and inference) is properly validated against
      the DataRequestDefinition schema and converted to a dictionary format suitable
      for internal use.

      :param value: The data_request value to validate. Can be a DataRequestDefinition instance
                    or a dictionary/object that can be validated as one.
      :type value: Any

      :returns: The validated data_request as a plain dictionary with unset values excluded.
      :rtype: dict

      :raises ValidationError: If the value cannot be validated as a DataRequestDefinition.



   .. py:method:: read_runtime_config(config_filepath: Union[pathlib.Path, str] = DEFAULT_CONFIG_FILEPATH) -> tomlkit.toml_document.TOMLDocument
      :staticmethod:


      Read a single TOML file and return a ``TOMLDocument``.

      :param config_filepath: The path to the config file, by default DEFAULT_CONFIG_FILEPATH
      :type config_filepath: Union[Path, str], optional

      :returns: The contents of the toml file as a tomlkit.TOMLDocument
      :rtype: TOMLDocument



   .. py:method:: _find_external_library_default_config_paths(runtime_config: dict) -> set
      :staticmethod:


      Search for external libraries in the runtime configuration and gather the
      libpath specifications so that we can load the default configs for the libraries.

      :param runtime_config: The runtime configuration as a tomlkit.TOMLDocument.
      :type runtime_config: dict

      :returns: A set containing the default configuration paths for the external
                libraries that are requested in the user's configuration file.
      :rtype: set



   .. py:method:: merge_external_default_configs(external_default_config_paths)
      :staticmethod:


      Merge the default configurations from external libraries into the overall
      default configuration.

      :param external_default_config_paths: A set containing the default configuration paths for the external
                                            libraries that are requested in the user's configuration file.
      :type external_default_config_paths: set

      :returns: The merged overall default configuration including the external library defaults.
      :rtype: dict



   .. py:method:: merge_default_configs(hyrax_defaults, external_defaults)
      :staticmethod:


      Merge the default configurations of external libraries on top of the
      Hyrax default configuration.

      :param hyrax_defaults: The default configuration from hyrax.
      :type hyrax_defaults: dict
      :param external_defaults: The default configuration from external libraries.
      :type external_defaults: dict

      :returns: The merged overall default configuration including the external library defaults.
      :rtype: dict



   .. py:method:: merge_configs(base_config: dict, overriding_config: dict, over_write: bool = False) -> dict
      :staticmethod:


      Merge two config dictionaries with the overriding_config values overriding
      the base_config values.

      :param base_config: The base configuration with keys that may be overridden by the
                          overriding_config.
      :type base_config: dict
      :param overriding_config: The new configuration values that will override the values in base_config.
      :type overriding_config: dict
      :param over_write: If True, the overriding_config values will overwrite the base_config values.
                         If False, the overriding_config values will be merged with the base_config values.
                         By default False.
      :type over_write: bool, optional

      :returns: The merged configuration.
      :rtype: dict
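
      The difference between the two ``over_write`` modes can be sketched with plain
      dictionaries. This is only an illustration under the assumption that "merge" means
      a recursive, key-by-key merge of nested tables and "overwrite" means replacing the
      top-level value wholesale; ``merge_configs_sketch`` is a hypothetical name, not the
      Hyrax method.

      .. code-block:: python

         import copy


         def merge_configs_sketch(base_config: dict, overriding_config: dict, over_write: bool = False) -> dict:
             """Return a new dict with ``overriding_config`` applied on top of ``base_config``."""
             merged = copy.deepcopy(base_config)  # never mutate the caller's dict
             for key, value in overriding_config.items():
                 both_tables = isinstance(value, dict) and isinstance(merged.get(key), dict)
                 if both_tables and not over_write:
                     # Merge mode: recurse so sibling keys in the base table survive.
                     merged[key] = merge_configs_sketch(merged[key], value)
                 else:
                     # Overwrite mode, or a plain value: replace whatever was there.
                     merged[key] = copy.deepcopy(value)
             return merged


         base = {"model": {"name": "a", "lr": 0.1}, "general": {"dev_mode": True}}
         override = {"model": {"lr": 0.5}}
         print(merge_configs_sketch(base, override))
         print(merge_configs_sketch(base, override, over_write=True))

      Under these assumptions, the first call keeps ``model.name`` from the base config,
      while the second replaces the whole ``model`` table.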



   .. py:method:: _validate_runtime_config(runtime_config: dict, default_config: dict)
      :staticmethod:


      Recursive helper to check that all keys in runtime_config have a default
      in the merged default_config.

      The two arguments passed in must represent the same nesting level of the
      runtime config and all default config parameters respectively.

      :param runtime_config: Nested config dictionary representing the runtime config.
      :type runtime_config: dict
      :param default_config: Nested config dictionary representing the defaults
      :type default_config: dict

      :raises RuntimeError: Raised if any key that exists in the runtime config does not have a default
          defined in default_config.
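
      The level-by-level check described above might be sketched as follows. This is a
      simplified illustration, not the Hyrax code: ``validate_runtime_config_sketch`` is a
      hypothetical name and the error message wording is an assumption.

      .. code-block:: python

         def validate_runtime_config_sketch(runtime_config: dict, default_config: dict) -> None:
             """Raise RuntimeError if a runtime key has no counterpart in the defaults."""
             for key, value in runtime_config.items():
                 if key not in default_config:
                     raise RuntimeError(f"Runtime config contains key '{key}' which has no default defined.")
                 # Recurse only when both sides are tables at the same nesting level.
                 if isinstance(value, dict) and isinstance(default_config[key], dict):
                     validate_runtime_config_sketch(value, default_config[key])


         defaults = {"model": {"name": "example", "lr": 0.1}}
         validate_runtime_config_sketch({"model": {"lr": 0.5}}, defaults)  # passes silently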



   .. py:method:: _resolve_config_paths(runtime_config: dict) -> None
      :staticmethod:


      Convert all paths in a runtime config to global paths in the current environment.
      Uses the hardcoded list of paths in ``ConfigManager.PATH_CONFIG_KEYS``.

      This mutates the config dictionary passed.

      :param runtime_config: Current runtime config nested dictionary
      :type runtime_config: dict
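
      An in-place path resolution driven by ``PATH_CONFIG_KEYS`` could look like the sketch
      below. It assumes that falsy values (such as ``False`` or an empty string) are left
      untouched, that ``~`` is expanded, and that resolved paths are stored as strings;
      none of these details are confirmed here, and ``resolve_config_paths_sketch`` is a
      hypothetical name.

      .. code-block:: python

         from pathlib import Path

         # Copied from the documented value of ConfigManager.PATH_CONFIG_KEYS.
         PATH_CONFIG_KEYS = [["data_set", "filter_catalog"], ["general", "data_dir"]]


         def resolve_config_paths_sketch(runtime_config: dict) -> None:
             """Rewrite known path-valued keys in ``runtime_config`` in place."""
             for table_name, key in PATH_CONFIG_KEYS:
                 table = runtime_config.get(table_name)
                 # Assumption: skip missing tables and falsy (unset) values.
                 if isinstance(table, dict) and table.get(key):
                     table[key] = str(Path(table[key]).expanduser().resolve())


         config = {"general": {"data_dir": "./data"}, "data_set": {"filter_catalog": False}}
         resolve_config_paths_sketch(config)
         print(config["general"]["data_dir"])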



   .. py:method:: resolve_runtime_config(runtime_config_filepath: Union[pathlib.Path, str, None] = None) -> pathlib.Path
      :staticmethod:


      Resolve a user-supplied runtime config to where we will actually pull config from.

      #. If a runtime config file is specified, we will use that file.
      #. If no file is specified and there is a file named "hyrax_config.toml" in the cwd we will use it.
      #. If no file is specified and there is no file named "hyrax_config.toml" in the cwd we will
         exclusively work off the configuration defaults in the packaged "hyrax_default_config.toml" file.

      :param runtime_config_filepath: Location of the supplied config file, by default None
      :type runtime_config_filepath: Union[Path, str, None], optional

      :returns: Path to the configuration file ultimately used for config resolution. When we fall back to the
                package supplied default config file, the Path to that file is returned.
      :rtype: Path

      :raises FileNotFoundError: If a runtime config file is specified but does not exist.
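
      The three-step resolution order above can be sketched as shown here. This is an
      illustration only: ``resolve_runtime_config_sketch``, its ``cwd`` and ``default_path``
      parameters, and the ``PACKAGED_DEFAULT`` placeholder are hypothetical and exist so the
      example can run without Hyrax installed.

      .. code-block:: python

         from pathlib import Path
         from typing import Union

         # Hypothetical stand-in for the packaged "hyrax_default_config.toml" location.
         PACKAGED_DEFAULT = Path("hyrax_default_config.toml")


         def resolve_runtime_config_sketch(
             runtime_config_filepath: Union[Path, str, None] = None,
             cwd: Union[Path, str, None] = None,
             default_path: Path = PACKAGED_DEFAULT,
         ) -> Path:
             """Pick the config file to use, following the documented order."""
             # 1. An explicitly supplied file wins, but it must exist.
             if runtime_config_filepath is not None:
                 path = Path(runtime_config_filepath)
                 if not path.exists():
                     raise FileNotFoundError(f"Runtime config file not found: {path}")
                 return path

             # 2. Otherwise look for "hyrax_config.toml" in the working directory.
             base = Path(cwd) if cwd is not None else Path.cwd()
             cwd_config = base / "hyrax_config.toml"
             if cwd_config.exists():
                 return cwd_config

             # 3. Fall back to the packaged defaults.
             return Path(default_path)


         print(resolve_runtime_config_sketch())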



.. py:function:: create_results_dir(config: dict, postfix: str) -> pathlib.Path

   Creates a results directory for this run.

   ``postfix`` is the verb name of the run (e.g. infer, train).

   The directory is created within the results directory (set with the ``results_dir``
   config value) and follows the pattern ``<timestamp>-<postfix>``.

   The resulting directory is returned.

   :param config: The full runtime configuration for this run
   :type config: dict
   :param postfix: The verb name of the run.
   :type postfix: str

   :returns: The path created by this function
   :rtype: Path
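
   The ``<timestamp>-<postfix>`` naming scheme can be illustrated with the sketch below.
   It assumes that the results root is stored under ``config["general"]["results_dir"]``
   and that the timestamp uses the format ``%Y%m%d-%H%M%S``; both are guesses made for
   the example, and ``create_results_dir_sketch`` is a hypothetical name.

   .. code-block:: python

      from datetime import datetime
      from pathlib import Path


      def create_results_dir_sketch(config: dict, postfix: str) -> Path:
          """Create and return ``<results_dir>/<timestamp>-<postfix>``."""
          # Assumption: the results root lives in the "general" table.
          results_root = Path(config["general"]["results_dir"]).expanduser().resolve()
          # Assumption: a sortable, second-resolution timestamp.
          timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
          directory = results_root / f"{timestamp}-{postfix}"
          directory.mkdir(parents=True, exist_ok=False)
          return directory


      # Usage: create_results_dir_sketch({"general": {"results_dir": "./results"}}, "train")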


.. py:function:: find_most_recent_results_dir(config: dict, verb: str) -> pathlib.Path | None

   Find the most recent results directory corresponding to a particular verb.
   This is a best-effort search in the currently configured results root.

   If results directories are created within 1 second of one another, this function
   returns one of them, but which one is undefined.

   This function may return None, indicating that it could not find a directory matching
   the query verb.
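
   A best-effort search of this kind might be sketched as follows. The sketch assumes the
   results root is stored under ``config["general"]["results_dir"]`` and that directory
   names begin with a timestamp that sorts lexicographically (for example
   ``20240102-000000-infer``); ``find_most_recent_results_dir_sketch`` is a hypothetical
   name, not the Hyrax function.

   .. code-block:: python

      from pathlib import Path
      from typing import Optional


      def find_most_recent_results_dir_sketch(config: dict, verb: str) -> Optional[Path]:
          """Return the newest ``*-<verb>`` directory in the results root, or None."""
          # Assumption: the results root lives in the "general" table.
          results_root = Path(config["general"]["results_dir"]).expanduser().resolve()
          candidates = [path for path in results_root.glob(f"*-{verb}") if path.is_dir()]
          if not candidates:
              return None
          # Assumption: timestamp prefixes sort lexicographically, so the largest
          # name is the most recent directory.
          return max(candidates, key=lambda path: path.name)


      # Usage: find_most_recent_results_dir_sketch({"general": {"results_dir": "./results"}}, "infer")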


.. py:function:: resolve_results_dir(config: dict, results_dir: Union[pathlib.Path, str, None], verb: Union[str, None]) -> pathlib.Path

   Resolve the results directory path with auto-discovery support.

   This helper handles auto-discovery of the most recent results directory if not provided.
   It checks the config for an explicit inference_dir first, then falls back to finding
   the most recent directory for the given verb.

   :param config: The hyrax config dictionary
   :type config: dict
   :param results_dir: The results subdirectory to load from. If None, will attempt auto-discovery.
   :type results_dir: Union[Path, str, None]
   :param verb: The name of the verb that generated the results (for auto-discovery).
                Defaults to "infer" if None.
   :type verb: Union[str, None]

   :returns: Resolved path to results directory
   :rtype: Path

   :raises RuntimeError: If results directory cannot be found or does not exist


.. py:function:: log_runtime_config(runtime_config: dict, output_path: pathlib.Path, file_name: str = 'runtime_config.toml')

   Log a runtime configuration.

   :param runtime_config: A dictionary object containing runtime configuration values.
   :type runtime_config: dict
   :param output_path: The path to put the config file
   :type output_path: Path
   :param file_name: Optional name for the config file, defaults to "runtime_config.toml"
   :type file_name: str, optional


