hyrax.download_stats

Attributes

logger

Classes

StatRecord

Recording object that represents one or many stats measurements reported by download request threads.

DownloadStats

Subsystem for keeping statistics on downloads. Used as a context manager in worker threads.

Module Contents

logger
class StatRecord(received_at: datetime.datetime, data_start: datetime.datetime, **kwargs)

Recording object that represents one or many stats measurements reported by download request threads.

This object supports creation with a full set of measurements from one thread. It also supports combining instances of the class to create aggregate stats from several measurements.

field_defaults
received_at
data_start
N = 1
static combine(obja: StatRecord, objb: StatRecord | None) -> StatRecord

Combine two stats objects into a new stats object representing the combination of the two.

Stats objects can be the result of n processed data points, so when combining two objects with n and m data points, the new object will indicate it is composed of n+m data points.

Parameters:
  • obja (StatRecord) – The first stats record

  • objb (StatRecord, optional) – The second stats record. If not provided, a copy of obja will be returned.

Returns:

A new object with the combination of the two stats objects.

Return type:

StatRecord
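The n+m bookkeeping described above can be illustrated with a minimal stand-in. MiniRecord below is hypothetical and is not the hyrax StatRecord; it assumes a single summed field (total_bytes, invented for illustration) and shows only how the data-point count carries through a combine, including the case where the second record is None.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class MiniRecord:
    """Hypothetical stand-in for a stats record; not the hyrax class."""

    total_bytes: int  # assumed summed field, for illustration only
    N: int = 1  # number of data points this record represents

    @staticmethod
    def combine(obja: "MiniRecord", objb: Optional["MiniRecord"]) -> "MiniRecord":
        # With no second record, return a copy of the first.
        if objb is None:
            return replace(obja)
        # Otherwise sum the field and add the data-point counts (n + m).
        return MiniRecord(obja.total_bytes + objb.total_bytes, obja.N + objb.N)
```

Combining a record built from 2 data points with one built from 3 yields a record whose N is 5.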

_combine(obj: StatRecord) -> StatRecord

Combine this object with another. Private helper for combine().

_stat_update(key: str, value: int | datetime.timedelta, m: int = 1)

Update a stat within the record object. Automatically averages if the stat key ends with _avg.

Parameters:
  • key (str) – The stat key to update.

  • value (Union[int, datetime.timedelta]) – The value to update it with.

  • m (int, optional) – When averaging, the number of data points that the average being passed in represents, by default 1
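One way to read the role of m: merging an average that summarizes m data points into one that summarizes n data points requires weighting each by its count. The helper below is a sketch under that assumption; merge_avg is a hypothetical name, and the actual hyrax implementation may differ.

```python
def merge_avg(current_avg: float, n: int, new_avg: float, m: int = 1) -> float:
    """Weighted merge of two averages covering n and m data points."""
    if n + m == 0:
        # Nothing has been measured yet, so there is no average to report.
        return 0.0
    return (current_avg * n + new_avg * m) / (n + m)
```

For example, an average of 10.0 over 3 points merged with a single new value of 20.0 gives (30 + 20) / 4 = 12.5.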

static _div(num: float, denom: float, default: float = 0.0) -> float
wall_clock_dur_s() -> int

The wall clock duration, in seconds, represented by this stats object: the time from when recording began to the moment the record was reported to the stats system.

Returns:

The number of seconds

Return type:

int

total_dur_s() -> int

Total duration of time that a thread was doing data transfer. In the case of multiple threads each thread’s seconds are added.

A stats object can represent an amalgamation of measurements from multiple worker threads. total_dur_s represents the total time data was being transferred added across all threads.

Returns:

The number of seconds

Return type:

int

resp_s() -> int

The time spent receiving responses from the server, added across all threads.

Returns:

A number of seconds

Return type:

int

data_down_mb() -> float

The amount of data downloaded by all threads together.

Returns:

A floating point number of 1024-based megabytes

Return type:

float
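"1024-based megabytes" means one megabyte is 1024 * 1024 bytes. A minimal conversion sketch follows; bytes_to_mb is a hypothetical helper name and not part of the documented API.

```python
def bytes_to_mb(num_bytes: int) -> float:
    """Convert a byte count to 1024-based megabytes (1 MB = 1024 * 1024 bytes)."""
    return num_bytes / (1024 * 1024)
```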

down_rate_mb_s() -> float

The downstream data rate in megabytes per second experienced by the average thread.

data_down_mb/resp_s.

Returns:

A floating point number of 1024-based megabytes per second

Return type:

float

down_rate_mb_s_overall() -> float

The downstream data rate in megabytes per second experienced by an aggregate of threads.

data_down_mb/wall_clock_dur_s

Returns:

A floating point number of 1024-based megabytes per second

Return type:

float
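The difference between the per-thread and overall rates comes down to which duration is the denominator. The sketch below uses free functions rather than the StatRecord methods, and its zero-denominator fallback (safe_div, a hypothetical helper) is an assumption, not documented behavior.

```python
def safe_div(num: float, denom: float, default: float = 0.0) -> float:
    # Assumption: fall back to `default` when the denominator is zero.
    return num / denom if denom else default


def per_thread_rate(data_down_mb: float, resp_s: float) -> float:
    # Response time is summed across threads, so this is the rate
    # seen by the average thread.
    return safe_div(data_down_mb, resp_s)


def overall_rate(data_down_mb: float, wall_clock_dur_s: float) -> float:
    # Wall clock time is shared by all threads, so this is the
    # aggregate rate.
    return safe_div(data_down_mb, wall_clock_dur_s)
```

As an illustration with made-up numbers: four threads that together download 100 MB, each spending 10 s receiving (40 s summed) over 12 s of wall clock time, see 2.5 MB/s per thread and roughly 8.33 MB/s overall.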

req_s() -> int

The time spent sending requests to the server, added across all threads.

Returns:

A number of seconds

Return type:

int

data_up_mb() -> float

The amount of data uploaded by all threads together.

Returns:

A floating point number of 1024-based megabytes

Return type:

float

up_rate_mb_s() -> float

The upstream data rate in megabytes per second experienced by the average thread.

data_up_mb/resp_s.

Returns:

A floating point number of 1024-based megabytes per second

Return type:

float

snapshot_rate() -> float

The rate of snapshots downloaded experienced by the average thread.

(snapshots/total_dur_s) * 3600 * 24

Returns:

A floating point number of snapshot images per day

Return type:

float

snapshot_rate_overall() -> float

The rate of snapshots downloaded by all threads together.

(snapshots/wall_clock_dur_s) * 3600 * 24

Returns:

A floating point number of snapshot images per day

Return type:

float
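Both snapshot rates scale a per-second rate to a per-day rate by multiplying by 3600 * 24 = 86400 seconds. A small sketch of that arithmetic follows; snapshots_per_day is a hypothetical helper, and returning 0.0 for a zero duration is an assumption.

```python
SECONDS_PER_DAY = 3600 * 24


def snapshots_per_day(snapshots: int, duration_s: float) -> float:
    """Scale a snapshots-per-second rate up to snapshots per day."""
    if duration_s == 0:
        # Assumed fallback; avoids dividing by zero before any time has elapsed.
        return 0.0
    return (snapshots / duration_s) * SECONDS_PER_DAY
```

Passing the summed per-thread duration gives the per-thread figure, and passing the wall clock duration gives the overall figure. For instance, 100 snapshots over 50 s of wall clock time is 172800 snapshots per day, while the same 100 snapshots over 200 s of summed thread time is 43200 per day.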

log_summary(log_level: int = logging.INFO, num_threads: int = 1, prefix: str = '')

Log two lines of summary of this stats object at the given log level.

The first line is an overall summary that treats all threads that gave the stats object data as a single unit.

The second line is a per-thread summary that produces thread-averaged statistics.

If provided, prefix is prepended to the log lines emitted.

Parameters:
  • log_level (int) – The logging level to emit the log message at.

  • num_threads (int) – Number of threads actively reporting into the stats object we want to print, by default 1

  • prefix (str) – String to prefix all log messages with. “” by default

class DownloadStats(print_interval_s=60)

Subsystem for keeping statistics on downloads. Used as a context manager in worker threads.
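The context-manager pattern in worker threads can be sketched with a stand-in class. MiniStats below is hypothetical and is not DownloadStats; it borrows the attribute names lock, active_threads, and num_threads, but the way they are updated here is an assumption made for illustration.

```python
import threading


class MiniStats:
    """Hypothetical stand-in showing the context-manager pattern only."""

    def __init__(self):
        self.lock = threading.Lock()
        self.active_threads = 0  # threads currently inside the context
        self.num_threads = 0  # assumed: total threads that ever entered

    def __enter__(self):
        with self.lock:
            self.active_threads += 1
            self.num_threads += 1
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        with self.lock:
            self.active_threads -= 1
        return False  # do not suppress exceptions raised by the worker


def run_workers(stats: MiniStats, count: int) -> None:
    def worker():
        with stats:
            pass  # download work would go here

    threads = [threading.Thread(target=worker) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Each worker enters the shared object for the duration of its work, so the object always knows how many threads are currently reporting.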

window_size
lock
cumulative_stats = None
stats_window = []
active_threads = 0
num_threads = 0
print_stats = False
print_interval_s = 60
watcher_thread
__enter__()
__exit__(exc_type, exc_value, traceback)
_watcher_thread(log_level)
_print_stats(log_level)
hook(request: urllib.request.Request, request_start: datetime.datetime, response_start: datetime.datetime, response_size: int, chunk_size: int)

This hook is called on each chunk of snapshots downloaded. It is called immediately after the server has finished responding to the request, so datetime.datetime.now() is the end moment of the request.

Parameters:
  • request (urllib.request.Request) – The request object relevant to this call

  • request_start (datetime.datetime) – The moment the request was handed off to urllib.request.urlopen()

  • response_start (datetime.datetime) – The moment there were bytes from the server to process

  • response_size (int) – The size of the response from the server in bytes

  • chunk_size (int) – The number of cutout files received in this request
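Given the three moments involved (request_start, response_start, and the end moment when the hook runs), the request and response durations follow by subtraction. The sketch below is an assumption about how such timings could be derived, not the hyrax implementation; split_durations is a hypothetical name.

```python
import datetime


def split_durations(
    request_start: datetime.datetime,
    response_start: datetime.datetime,
    response_end: datetime.datetime,
) -> tuple:
    """Return (request_seconds, response_seconds) for one hook call."""
    # Time from handing off the request until the first bytes arrived.
    req_s = (response_start - request_start).total_seconds()
    # Time from the first bytes until the response finished.
    resp_s = (response_end - response_start).total_seconds()
    return req_s, resp_s
```

In a real hook, response_end would be datetime.datetime.now() captured at the moment the hook is called.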