rics.ml.time_split.settings

Global settings for the splitting logic.

Classes

auto_flex()

Configuration for the 'auto' Flex logic.

log_split_progress()

Global settings for the log_split_progress()-function.

plot()

Global settings for the plot()-function.

class auto_flex

Bases: object

Configuration for the ‘auto’ Flex logic.

This class determines how (lo, hi)-tuples are expanded when flexible bounds are enabled.

SANITY_CHECK: bool = True

If True, fall back to the original limits when the flexed limits do not pass sanity checks.

classmethod set_level(level: Literal['hour', 'day'], *, start_at: str | Timedelta | timedelta | timedelta64, round_to: str | Timedelta | timedelta | timedelta64, tolerance: str | Timedelta | timedelta | timedelta64) -> None

Set a Level used by the auto-flex logic.

Inputs are not verified until an actual split is made.

Parameters:
  • level – The level to set; ‘hour’ or ‘day’.

  • start_at – Span size at which this level starts.

  • round_to – Frequency to round the range limits to.

  • tolerance – Maximum amount by which to alter limits.

Raises:

AttributeError – For unknown level names.
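
The documented contract of set_level() can be illustrated with a small stand-in. This is a sketch only: AutoFlexSketch and its Level tuple are hypothetical names that mirror the documented signature and defaults, not the rics implementation. As documented, the inputs are stored without being verified, and unknown level names raise AttributeError.

```python
from typing import NamedTuple


class Level(NamedTuple):
    """Stand-in for the documented Level tuple."""

    start_at: str
    round_to: str
    tolerance: str


class AutoFlexSketch:
    """Hypothetical stand-in for auto_flex; not the rics implementation."""

    hour = Level(start_at="6 hours", round_to="hour", tolerance="15 min")
    day = Level(start_at="2 days", round_to="day", tolerance="3 hours")

    @classmethod
    def set_level(cls, level: str, *, start_at: str, round_to: str, tolerance: str) -> None:
        # Unknown level names are rejected, matching the documented Raises entry.
        if level not in ("hour", "day"):
            raise AttributeError(f"unknown level: {level!r}")
        # Values are stored as given; nothing is parsed or validated here.
        setattr(cls, level, Level(start_at, round_to, tolerance))


# Require a longer span and allow a larger adjustment before rounding to the hour.
AutoFlexSketch.set_level("hour", start_at="12 hours", round_to="hour", tolerance="30 min")
print(AutoFlexSketch.hour)
```

Because the settings live on the class, a change made this way applies to every later split in the same process.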

class Level(start_at: str | Timedelta | timedelta | timedelta64, round_to: str | Timedelta | timedelta | timedelta64, tolerance: str | Timedelta | timedelta | timedelta64)

Bases: NamedTuple

Level type used by auto_flex.

start_at: str | Timedelta | timedelta | timedelta64

Span size at which this level starts.

round_to: str | Timedelta | timedelta | timedelta64

Frequency to round the range limits to.

tolerance: str | Timedelta | timedelta | timedelta64

Maximum amount by which to alter limits.

hour: Level = Level(start_at='6 hours', round_to='hour', tolerance='15 min')

Conditions under which bounds are rounded to the nearest hour.

Default setting:

Round to hour if the total range is at least 6 hours, but do not move the bounds more than 15 minutes.

day: Level = Level(start_at='2 days', round_to='day', tolerance='3 hours')

Conditions under which bounds are rounded to the nearest day.

Default setting:

Round to day if the total range is at least 2 days, but do not move the bounds more than 3 hours.
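
The default hour and day levels can be read as a simple rounding rule. The sketch below uses only the standard library and is not the rics implementation: flex_bound and flex_range are hypothetical helpers. It rests on two assumptions, namely that the coarsest level whose start_at is reached takes precedence, and that the sanity check only requires the flexed range to remain non-empty.

```python
from datetime import datetime, timedelta

# Defaults mirrored from the documented levels: (start_at, round_to, tolerance).
HOUR = (timedelta(hours=6), timedelta(hours=1), timedelta(minutes=15))
DAY = (timedelta(days=2), timedelta(days=1), timedelta(hours=3))
EPOCH = datetime(1970, 1, 1)


def flex_bound(bound: datetime, round_to: timedelta, tolerance: timedelta) -> datetime:
    """Round to the nearest multiple of round_to, unless that moves bound by more than tolerance."""
    steps = (bound - EPOCH + round_to / 2) // round_to  # round half up
    rounded = EPOCH + steps * round_to
    return rounded if abs(rounded - bound) <= tolerance else bound


def flex_range(lo: datetime, hi: datetime, sanity_check: bool = True) -> tuple:
    """Expand or shrink (lo, hi) according to the first matching level."""
    span = hi - lo
    # Assumption: the coarsest level whose start_at is reached wins.
    for start_at, round_to, tolerance in (DAY, HOUR):
        if span >= start_at:
            new_lo = flex_bound(lo, round_to, tolerance)
            new_hi = flex_bound(hi, round_to, tolerance)
            # Assumed sanity check: the flexed range must still be non-empty.
            if sanity_check and new_lo >= new_hi:
                return lo, hi
            return new_lo, new_hi
    return lo, hi  # span too short for any level


# A range of roughly 18 hours: both bounds are within 15 minutes of a full hour.
print(flex_range(datetime(2022, 1, 4, 0, 10), datetime(2022, 1, 4, 17, 50)))
```

Each bound is tested against the tolerance independently, so one bound may be rounded while the other stays where it was.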

class plot

Bases: object

Global settings for the plot()-function.

THOUSANDS_SEPARATOR: str = "'"

Separator to use when printing bar_labels.

THOUSANDS_SEPARATOR_CUTOFF: int = 10000

Minimum value before bar_labels include a THOUSANDS_SEPARATOR.

ROW_UNIT: str = 'rows'

Unit to append to the count when displaying number of rows on the bars.
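
The three label settings above plausibly combine as in the sketch below. format_row_count is a hypothetical helper, not part of rics, and it assumes that the cutoff is inclusive (a count equal to THOUSANDS_SEPARATOR_CUTOFF already gets separators).

```python
# Values copied from the documented defaults.
THOUSANDS_SEPARATOR = "'"
THOUSANDS_SEPARATOR_CUTOFF = 10000
ROW_UNIT = "rows"


def format_row_count(count: int) -> str:
    """Format a row count for a bar label (hypothetical helper)."""
    if count >= THOUSANDS_SEPARATOR_CUTOFF:  # assumption: the cutoff is inclusive
        # Group digits with ',' first, then swap in the configured separator.
        number = f"{count:,}".replace(",", THOUSANDS_SEPARATOR)
    else:
        number = str(count)
    return f"{number} {ROW_UNIT}"


for n in (9999, 10000, 1234567):
    print(format_row_count(n))
```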

DATA_LABEL: str = 'Data'

Label of the blue bar.

FUTURE_DATA_LABEL: str = 'Future data'

Label of the red bar.

DEFAULT_TIME_UNIT: str = 'h'

Time unit to use by default when bar_labels=True and available=None.

REMOVED_FOLD_STYLE: Dict[str, Any] = {'alpha': 0.35, 'height': 0.6}

Keyword arguments used to distinguish filtered folds when plotting with show_removed=True.

See matplotlib.pyplot.bar() for details.

class log_split_progress

Bases: object

Global settings for the log_split_progress()-function.

FOLD_FORMAT: str = "'{start.auto}' <= [schedule: '{mid.auto}' ({mid:%A})] < '{end.auto}'"

Pretty-printed fold-key for other messages.

  • Only the start, mid, and end keys are available (see DatetimeSplitBounds). Use <key>.auto to format a timestamp as a date only when its time component is zero, and as a full datetime otherwise. The default format uses <key>.auto.

Sample output:

'2021-12-30' <= [schedule: '2022-01-04' (Tuesday)] < '2022-01-04 18:00:00'
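
One way a <key>.auto field could work with str.format is sketched below. AutoTimestamp is a hypothetical datetime subclass, not the DatetimeSplitBounds type used by rics. The two format strings are copied from AUTO_DATE_FORMAT and AUTO_DATETIME_FORMAT, and weekday names from %A depend on the active locale.

```python
from datetime import datetime

AUTO_DATE_FORMAT = "%Y-%m-%d"
AUTO_DATETIME_FORMAT = "%Y-%m-%d %H:%M:%S"
FOLD_FORMAT = "'{start.auto}' <= [schedule: '{mid.auto}' ({mid:%A})] < '{end.auto}'"


class AutoTimestamp(datetime):
    """Hypothetical datetime subclass exposing an `auto` attribute."""

    @property
    def auto(self) -> str:
        # Use the short form at midnight and the long form otherwise.
        at_midnight = self.time() == datetime.min.time()
        return self.strftime(AUTO_DATE_FORMAT if at_midnight else AUTO_DATETIME_FORMAT)


fold = FOLD_FORMAT.format(
    start=AutoTimestamp(2021, 12, 30),
    mid=AutoTimestamp(2022, 1, 4),
    end=AutoTimestamp(2022, 1, 4, 18),
)
print(fold)
```

Since {mid:%A} goes through the ordinary datetime format protocol, any strftime directive can be used in a custom FOLD_FORMAT alongside .auto.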

SECONDS_FORMATTER: str | Callable[[float], str] = 'rics.performance.format_seconds'

A callable (seconds) -> formatted_seconds.

Both seconds and formatted_seconds are available to END_MESSAGE. If a string is given, the actual callable is resolved using rics.misc.get_by_full_name().
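
Resolving a formatter given as a string can be approximated with importlib, as sketched below. resolve_formatter and whole_seconds are hypothetical names, and rics.misc.get_by_full_name() may behave differently in its details; the sketch only shows the general idea of accepting either a callable or a dotted path.

```python
from importlib import import_module
from typing import Callable, Union


def resolve_formatter(formatter: Union[str, Callable[[float], str]]) -> Callable[[float], str]:
    """Return formatter unchanged if callable, else import it from its dotted path."""
    if callable(formatter):
        return formatter
    # Split 'package.module.name' into the module path and the attribute name.
    module_name, _, attribute = formatter.rpartition(".")
    return getattr(import_module(module_name), attribute)


def whole_seconds(seconds: float) -> str:
    """Example custom formatter: whole seconds with a unit suffix."""
    return f"{seconds:.0f}s"


print(resolve_formatter(whole_seconds)(321.4))
print(resolve_formatter("builtins.str")(1.5))
```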

START_MESSAGE: str = 'Begin fold {n}/{n_splits}: {fold}.'

Message indicating that the current fold has been yielded to the user.

Has access to all keys from the previous section, as well as:

  • The fold key (see FOLD_FORMAT), and

  • The n key, which is the 1-based position of the fold in splits, and

  • The n_splits key, which is just len(splits).

Sample output:

Begin fold 5/7: '2021-12-30' <= [schedule: '2022-01-04' (Tuesday)] < '2022-01-04 18:00:00'.

END_MESSAGE: str = "Finished fold {n}/{n_splits}: [schedule: '{mid.auto}' ({mid:%A})] after {formatted_seconds}."

Message indicating that the user is done with the current fold.

Has access to all keys from the previous sections, as well as:

  • The seconds key, which is the (fractional) time the user spent in the fold, and

  • The formatted_seconds key, obtained using the SECONDS_FORMATTER.

The value of seconds is obtained using time.perf_counter().

Sample output:

Finished fold 5/7: [schedule: '2022-01-04' (Tuesday)] after 5m 21s.
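
The seconds and formatted_seconds keys could be produced roughly as follows. This is a sketch under stated assumptions: format_seconds_sketch is a hypothetical stand-in for rics.performance.format_seconds, whose real output may differ, and the template is a simplified END_MESSAGE without the mid key.

```python
import time

# Simplified template; the real END_MESSAGE also uses the mid key.
END_TEMPLATE = "Finished fold {n}/{n_splits} after {formatted_seconds}."


def format_seconds_sketch(seconds: float) -> str:
    """Hypothetical formatter producing strings such as '5m 21s'."""
    minutes, rest = divmod(int(round(seconds)), 60)
    return f"{minutes}m {rest}s" if minutes else f"{rest}s"


start = time.perf_counter()
sum(range(1000))  # placeholder for the work done inside a fold
seconds = time.perf_counter() - start  # fractional seconds spent in the fold

message = END_TEMPLATE.format(
    n=5,
    n_splits=7,
    seconds=seconds,  # available to the template, unused in this simplified one
    formatted_seconds=format_seconds_sketch(seconds),
)
print(message)
```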

AUTO_DATE_FORMAT = '%Y-%m-%d'

Short-form timestamp format_spec used by <key>.auto.

AUTO_DATETIME_FORMAT = '%Y-%m-%d %H:%M:%S'

Long-form timestamp format_spec used by <key>.auto.