hyrax.datasets.data_cache
Classes
DataCache | Per-dataset caching layer for DataProvider.
Module Contents
- class DataCache(config: dict, datasets: dict[str, hyrax.datasets.dataset_registry.HyraxDataset], augment_active: dict[str, bool])
Per-dataset caching layer for DataProvider.
Each dataset (friendly name) gets two cache maps:
- base cache: keyed by real_idx (an int), stores the result of get_<field> calls. No dataset method is called to produce the key.
- augment cache: keyed by the return value of the dataset's augment_cache_key method, stores augmented results. Only populated when the dataset opts in by returning a non-None key.
try_fetch checks the augment cache first (when applicable), then falls back to the base cache.
One config controls this functionality:
- h.config["data_set"]["use_cache"]: when True, data dicts are cached after the first access so subsequent accesses are served from memory.
Initialize the DataCache.
- Parameters:
config (dict) – The Hyrax configuration.
datasets (dict[str, HyraxDataset]) – Mapping of friendly_name to dataset instance. Used to call augment_cache_key for augmented data caching.
augment_active (dict[str, bool]) – Mapping of friendly_name to whether augmentation is active for that dataset. When True, try_fetch will check the augment cache before falling back to the base cache.
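The two-map layout and lookup order described above can be pictured with a small stand-in. This is a minimal sketch, not the Hyrax implementation: ToyTwoMapCache, its lookup method, and the (real_idx, seed) tuple used as an augment key are hypothetical names chosen for illustration.

```python
from typing import Any, Hashable, Optional

Fields = dict[str, Any]


class ToyTwoMapCache:
    """Illustrative stand-in only; NOT the real hyrax DataCache."""

    def __init__(self, friendly_names: list[str]) -> None:
        # One pair of maps per dataset friendly name.
        self.base: dict[str, dict[int, Fields]] = {n: {} for n in friendly_names}
        self.augment: dict[str, dict[Hashable, Fields]] = {n: {} for n in friendly_names}

    def lookup(
        self, name: str, real_idx: int, augment_key: Optional[Hashable] = None
    ) -> tuple[Optional[Fields], bool]:
        """Return (data, already_augmented); data is None on a miss."""
        if augment_key is not None:
            hit = self.augment[name].get(augment_key)
            if hit is not None:
                return hit, True
        # Fall back to the base map, keyed directly by the integer index.
        return self.base[name].get(real_idx), False


cache = ToyTwoMapCache(["images"])
cache.base["images"][3] = {"flux": 1.5}
cache.augment["images"][(3, 42)] = {"flux": 1.7}  # hypothetical key shape

base_hit = cache.lookup("images", 3)
augment_hit = cache.lookup("images", 3, (3, 42))
fallback = cache.lookup("images", 3, (3, 99))  # augment miss, base hit
miss = cache.lookup("images", 7)
```

Keeping the base map keyed by a plain int means base lookups never call into the dataset, while the augment map leaves the key choice to the dataset.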
- try_fetch(friendly_name: str, real_idx: int, rng_seed: numpy.int64 | None = None) → tuple[dict | None, bool]
Try to fetch cached data for a single dataset.
When augmentation is active and rng_seed is provided, this checks the augment cache first. On miss it falls back to the base cache.
- Parameters:
friendly_name (str) – The dataset friendly name.
real_idx (int) – The dataset-local index.
rng_seed (np.int64 | None) – The augmentation RNG seed, or None for non-augmented access.
- Returns:
(data, already_augmented) where data is the cached field dict or None on miss, and already_augmented indicates whether the cached data includes augmentation.
- Return type:
tuple[dict | None, bool]
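One way a caller might act on the (data, already_augmented) pair is sketched below. It rests on several assumptions: StubCache only imitates the three documented method names, resolve, load, and augment are hypothetical helpers, and plain int seeds stand in for numpy.int64. The source does not say that DataProvider follows exactly this flow.

```python
from typing import Any, Callable, Optional

Fields = dict[str, Any]


class StubCache:
    """Tiny stand-in exposing the documented method names (not hyrax code)."""

    def __init__(self) -> None:
        self._base: dict[tuple[str, int], Fields] = {}
        self._aug: dict[tuple[str, int, int], Fields] = {}

    def try_fetch(self, friendly_name: str, real_idx: int, rng_seed: Optional[int] = None):
        if rng_seed is not None:
            hit = self._aug.get((friendly_name, real_idx, rng_seed))
            if hit is not None:
                return hit, True
        return self._base.get((friendly_name, real_idx)), False

    def insert_base(self, friendly_name: str, real_idx: int, data: Fields) -> None:
        self._base[(friendly_name, real_idx)] = data

    def insert_augmented(self, friendly_name: str, real_idx: int, rng_seed: int, data: Fields) -> None:
        self._aug[(friendly_name, real_idx, rng_seed)] = data


def resolve(
    cache: StubCache,
    name: str,
    real_idx: int,
    rng_seed: Optional[int],
    load: Callable[[int], Fields],
    augment: Callable[[Fields, int], Fields],
) -> Fields:
    """Hypothetical caller: load on a miss, augment only when still needed."""
    data, already_augmented = cache.try_fetch(name, real_idx, rng_seed)
    if data is None:
        data = load(real_idx)
        cache.insert_base(name, real_idx, data)
    if rng_seed is not None and not already_augmented:
        data = augment(data, rng_seed)
        cache.insert_augmented(name, real_idx, rng_seed, data)
    return data


calls = {"load": 0, "augment": 0}


def load(idx: int) -> Fields:
    calls["load"] += 1
    return {"value": float(idx)}


def augment(data: Fields, seed: int) -> Fields:
    calls["augment"] += 1
    return {"value": data["value"] + seed}  # new dict; the base entry is untouched


cache = StubCache()
first = resolve(cache, "images", 2, 10, load, augment)    # miss: load, then augment
second = resolve(cache, "images", 2, 10, load, augment)   # served from the augment map
plain = resolve(cache, "images", 2, None, load, augment)  # served from the base map
```

The already_augmented flag is what lets the caller skip the augmentation step on an augment-cache hit while still applying it after a base-cache hit.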
- insert_base(friendly_name: str, real_idx: int, data: dict[str, Any])
Insert base (non-augmented) field data into the cache.
- Parameters:
friendly_name (str) – The dataset friendly name.
real_idx (int) – The dataset-local index (used directly as cache key).
data (dict[str, Any]) – The field data dict to cache.
- insert_augmented(friendly_name: str, real_idx: int, rng_seed: numpy.int64, data: dict[str, Any])
Insert augmented field data into the cache.
Calls augment_cache_key to determine the cache key. If the key is None, this is a no-op (the dataset opted out of caching augmented data).
- Parameters:
friendly_name (str) – The dataset friendly name.
real_idx (int) – The dataset-local index.
rng_seed (np.int64) – The augmentation RNG seed.
data (dict[str, Any]) – The augmented field data dict to cache.
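The opt-in behavior of insert_augmented can be illustrated with two toy datasets. This is a sketch under stated assumptions: the source names augment_cache_key but does not show its signature, so the (real_idx, rng_seed) arguments, both dataset classes, and insert_augmented_sketch (including its bool return value) are hypothetical.

```python
from typing import Any, Hashable, Optional

Fields = dict[str, Any]


class CachingDataset:
    """Hypothetical dataset that opts in to augment caching."""

    def augment_cache_key(self, real_idx: int, rng_seed: int) -> Optional[Hashable]:
        # Assumed signature; any hashable, non-None value opts in.
        return (real_idx, int(rng_seed))


class NonCachingDataset:
    """Hypothetical dataset that opts out by returning None."""

    def augment_cache_key(self, real_idx: int, rng_seed: int) -> Optional[Hashable]:
        return None


def insert_augmented_sketch(
    store: dict[Hashable, Fields],
    dataset: Any,
    real_idx: int,
    rng_seed: int,
    data: Fields,
) -> bool:
    """Store data under the dataset-provided key; do nothing if the key is None.

    The bool return is added here only to make the no-op visible.
    """
    key = dataset.augment_cache_key(real_idx, rng_seed)
    if key is None:
        return False
    store[key] = data
    return True


opt_in_store: dict[Hashable, Fields] = {}
opt_out_store: dict[Hashable, Fields] = {}

stored = insert_augmented_sketch(opt_in_store, CachingDataset(), 5, 123, {"x": 1})
skipped = insert_augmented_sketch(opt_out_store, NonCachingDataset(), 5, 123, {"x": 1})
```

Letting the dataset choose the key means it can decide which inputs make two augmented results interchangeable, or decline caching entirely when its augmentation is not reproducible.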