hyrax.data_sets.tensor_cache_mixin

Attributes

logger

Classes

TensorCacheMixin

Mixin class providing in-memory tensor caching functionality for datasets.

Module Contents

logger
class TensorCacheMixin

Bases: abc.ABC

Mixin class providing in-memory tensor caching functionality for datasets.

This mixin provides:
  • use_cache: Cache tensors in memory after first load

  • preload_cache: Preload all tensors in a background thread

  • Efficient tensor cache management with hit/miss tracking

  • Background preloading with parallel processing

Classes using this mixin must implement:
  • _load_tensor_for_cache(object_id: str) -> torch.Tensor

  • ids() -> Generator[str] (iterator over object IDs)

  • __len__() -> int
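The three required methods can be illustrated with a toy dataset. This is only a sketch: CacheContract is a hypothetical stand-in for the mixin's abstract interface so the example runs without hyrax installed, ToyDataset is invented for illustration, and plain Python lists stand in for torch.Tensor values.

```python
from abc import ABC, abstractmethod
from collections.abc import Generator
from typing import Optional


class CacheContract(ABC):
    """Hypothetical stand-in listing the methods the mixin expects."""

    @abstractmethod
    def _load_tensor_for_cache(self, object_id: str):
        """Load the data for one object ID."""

    @abstractmethod
    def ids(self, log_every: Optional[int] = None) -> Generator[str, None, None]:
        """Yield every object ID in the dataset."""

    @abstractmethod
    def __len__(self) -> int:
        """Return the number of objects in the dataset."""


class ToyDataset(CacheContract):
    """Illustrative dataset backed by an in-memory dict."""

    def __init__(self, records: dict):
        self._records = records

    def _load_tensor_for_cache(self, object_id: str):
        # A real dataset would read from disk and return a torch.Tensor;
        # a list of floats is used here as a stand-in.
        return list(self._records[object_id])

    def ids(self, log_every: Optional[int] = None):
        # log_every is accepted for signature compatibility and ignored here.
        yield from self._records

    def __len__(self) -> int:
        return len(self._records)


dataset = ToyDataset({"a": [1.0, 2.0], "b": [3.0]})
```

In a real subclass the base would be TensorCacheMixin itself, and _init_tensor_cache(config) would be called at the end of __init__.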

_init_tensor_cache(config)

Initialize tensor caching. Call this from __init__ after other setup.

abstractmethod _load_tensor_for_cache(object_id: str)

Load tensor for the given object_id. Must be implemented by subclasses.

Parameters:

object_id (str) – The object ID to load tensor for

Returns:

The loaded tensor

Return type:

torch.Tensor

abstractmethod ids(log_every: int | None = None) → collections.abc.Generator[str, None, None]

Iterator over all object IDs. Must be implemented by subclasses.

Parameters:

log_every (Optional[int]) – Log progress every N objects

Yields:

str – Object IDs in the dataset

_check_object_id_to_tensor_cache(object_id: str)

Check if tensor is already cached.

_populate_object_id_to_tensor_cache(object_id: str)

Load tensor and populate cache.

_object_id_to_tensor_cached(object_id: str)

Get tensor for object_id with caching support.

Parameters:

object_id (str) – The object_id requested

Returns:

The tensor for the object

Return type:

torch.Tensor
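The check-then-populate flow described by the three methods above can be sketched with a plain dict. This is an assumption-laden illustration, not the mixin's actual code: CachedLoader, its attribute names, and the hit/miss counters are hypothetical, and lists stand in for tensors.

```python
class CachedLoader:
    """Hypothetical check-then-populate cache with hit/miss counters."""

    def __init__(self, load_fn, use_cache=True):
        self._load_fn = load_fn
        self._use_cache = use_cache
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, object_id):
        # Check: return the cached value if it is already present.
        if self._use_cache and object_id in self._cache:
            self.hits += 1
            return self._cache[object_id]
        # Populate: load the value and remember it for later requests.
        self.misses += 1
        value = self._load_fn(object_id)
        if self._use_cache:
            self._cache[object_id] = value
        return value


calls = []


def slow_load(object_id):
    # Records each call so repeated loads are visible.
    calls.append(object_id)
    return [float(len(object_id))]


loader = CachedLoader(slow_load)
first = loader.get("obj-1")   # miss: calls slow_load
second = loader.get("obj-1")  # hit: served from the dict
```

The second request returns the same object as the first without invoking the loader again, which is the benefit use_cache is meant to provide.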

static _determine_numprocs_preload()

Determine number of processes for preloading.

_preload_tensor_cache()

Preload all tensors in the dataset using multiple threads.
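One way to picture background preloading is a worker thread that fills a shared dict through a thread pool. The sketch below is a guess at the general shape only: preload_all and its parameters are hypothetical, and it uses the eager Executor.map(), which submits every ID up front.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def preload_all(object_ids, load_fn, cache, max_workers=4):
    """Hypothetical helper: load every ID in parallel and store the results."""
    ids = list(object_ids)
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Executor.map preserves input order, so zip pairs IDs with results.
        for object_id, value in zip(ids, executor.map(load_fn, ids)):
            cache[object_id] = value


cache = {}
# len() stands in for an expensive tensor load.
worker = threading.Thread(
    target=preload_all, args=(["a", "bb", "ccc"], len, cache), daemon=True
)
worker.start()  # the caller could keep working here while the cache fills
worker.join()   # joined immediately only so the example is deterministic
```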

_lazy_map_executor(executor: concurrent.futures.Executor, ids: collections.abc.Iterable[str])

Lazy evaluation version of concurrent.futures.Executor.map().

This limits memory usage during preloading by keeping only a small number of tensors in memory at once.

Parameters:
  • executor (concurrent.futures.Executor) – An executor for running futures

  • ids (Iterable[str]) – An iterable of object IDs

Yields:

torch.Tensor – Tensors for the given object IDs, loaded lazily
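A lazy alternative to Executor.map() can be sketched by capping the number of in-flight futures. This is a minimal illustration of the general technique, not the method's actual implementation: lazy_map, its fn argument, and the max_pending parameter are hypothetical.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def lazy_map(executor, fn, items, max_pending=4):
    """Yield fn(item) in input order with at most max_pending futures alive."""
    iterator = iter(items)
    # Prime the queue with the first few submissions only.
    pending = deque(
        executor.submit(fn, item) for item in islice(iterator, max_pending)
    )
    while pending:
        # Wait for the oldest future so results keep their input order.
        result = pending.popleft().result()
        # Submit one replacement task, if any input remains.
        for item in islice(iterator, 1):
            pending.append(executor.submit(fn, item))
        yield result


with ThreadPoolExecutor(max_workers=2) as executor:
    squares = list(lazy_map(executor, lambda x: x * x, range(10), max_pending=3))
```

Because new work is submitted only as earlier results are consumed, the number of completed but unconsumed results is bounded by max_pending rather than by the size of the input.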

_log_duration_tensorboard(name: str, start_time: int)

Log a duration to tensorboardX if configured.

Parameters:
  • name (str) – The name of the scalar to log

  • start_time (int) – Start time in nanoseconds from time.monotonic_ns()
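The timing convention can be sketched as follows. The helper names, the RecordingWriter stand-in, and the choice to log seconds are assumptions for illustration; only add_scalar mirrors a call that exists on tensorboardX's SummaryWriter.

```python
import time


def elapsed_seconds(start_time_ns, now_ns=None):
    """Convert a time.monotonic_ns() start value into elapsed seconds."""
    if now_ns is None:
        now_ns = time.monotonic_ns()
    return (now_ns - start_time_ns) / 1e9


class RecordingWriter:
    """Hypothetical stand-in for a tensorboardX SummaryWriter."""

    def __init__(self):
        self.scalars = []

    def add_scalar(self, tag, scalar_value, global_step=None):
        self.scalars.append((tag, scalar_value))


def log_duration(writer, name, start_time_ns):
    # Do nothing when no writer is configured.
    if writer is None:
        return
    writer.add_scalar(name, elapsed_seconds(start_time_ns))


start = time.monotonic_ns()
writer = RecordingWriter()
log_duration(writer, "preload_duration_s", start)
log_duration(None, "ignored", start)  # no writer configured: silently skipped
```

A monotonic clock is used so that the measured duration cannot go negative if the system wall clock is adjusted during the run.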