rics.ml.time_split.support
Supporting functions.
These functions are used internally, but are also exposed here so that users can build their own logic on top of them, or simply try things out.
Warning
Not part of the stable API.
This module may change without notice. Stick to the top-level rics.ml.time_split module, or lock down your dependencies if you need to use the support module.
Functions
expand_limits – Derive the "real" bounds of limits.
fold_weight – Compute fold weights.
to_string – Pretty-print a fold.
- expand_limits(limits: tuple[Timestamp, Timestamp], *, flex: bool | Literal['auto'] | str | tuple[str | Timedelta | timedelta | timedelta64, str | Timedelta | timedelta | timedelta64, str | Timedelta | timedelta | timedelta64] | Iterable[tuple[str | Timedelta | timedelta | timedelta64, str | Timedelta | timedelta | timedelta64, str | Timedelta | timedelta | timedelta64]] = 'auto') -> tuple[Timestamp, Timestamp]
Derive the “real” bounds of limits.
- Parameters:
limits – A tuple (lo, hi) of timestamps.
flex – Flex arguments as described in the User guide. Also supports level-tuples [(start_at, round_to, tolerance)...]. Passing flex=[settings.auto_flex.day, settings.auto_flex.hour] is equivalent to flex='auto'.
- Returns:
Limits rounded according to the flex-argument.
- Raises:
ValueError – For invalid limits.
Examples
>>> from pandas import Timestamp
>>> from rics.ml.time_split.support import expand_limits
>>> limits = Timestamp("2019-05-11"), Timestamp("2019-05-11 22:05:30")
Basic usage.
>>> expand_limits(limits, flex="d")
(Timestamp('2019-05-11 00:00:00'), Timestamp('2019-05-12 00:00:00'))
You may specify a maximum “distance” that limits may be expanded.
>>> expand_limits(limits, flex="d<1h")
(Timestamp('2019-05-11 00:00:00'), Timestamp('2019-05-11 22:05:30'))
Limits will never be rounded in the “wrong” direction…
>>> limits = Timestamp("2019-05-11"), Timestamp("2019-05-11 11:05:30")
>>> expand_limits(limits, flex="d")
(Timestamp('2019-05-11 00:00:00'), Timestamp('2019-05-11 11:05:30'))
…even if you make the tolerance large enough.
>>> expand_limits(limits, flex="d<14h")
(Timestamp('2019-05-11 00:00:00'), Timestamp('2019-05-11 11:05:30'))
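The behaviour shown in these examples can be approximated with the standard library alone. The following is a minimal sketch, not the library's implementation: expand_limits_sketch, round_nearest and ORIGIN are hypothetical names, the rule "round to the nearest unit, then keep the rounded value only if it widens the range and stays within the tolerance" is an assumption inferred from the outputs above, and treating lo >= hi as the invalid case is likewise assumed.

```python
from __future__ import annotations

from datetime import datetime, timedelta

ORIGIN = datetime(1970, 1, 1)  # assumed rounding origin (midnight-aligned)


def round_nearest(ts: datetime, unit: timedelta) -> datetime:
    """Round ts to the nearest multiple of unit, counted from ORIGIN."""
    whole, rest = divmod(ts - ORIGIN, unit)
    floor = ORIGIN + whole * unit
    return floor + unit if rest * 2 >= unit else floor


def expand_limits_sketch(
    limits: tuple[datetime, datetime],
    unit: timedelta,
    tolerance: timedelta | None = None,
) -> tuple[datetime, datetime]:
    """Hypothetical stand-in for expand_limits; logic is assumed."""
    lo, hi = limits
    if lo >= hi:  # assumption: this is what makes limits invalid
        raise ValueError(f"invalid limits: {lo} >= {hi}")

    new_lo, new_hi = round_nearest(lo, unit), round_nearest(hi, unit)

    # Never round in the "wrong" direction: lo may only move back,
    # hi may only move forward.
    if new_lo > lo:
        new_lo = lo
    if new_hi < hi:
        new_hi = hi

    # Reject expansions that move a bound further than the tolerance.
    if tolerance is not None:
        if lo - new_lo > tolerance:
            new_lo = lo
        if new_hi - hi > tolerance:
            new_hi = hi
    return new_lo, new_hi
```

Calling expand_limits_sketch(limits, timedelta(days=1)) plays the role of flex="d", and passing tolerance=timedelta(hours=1) plays the role of flex="d<1h". Parsing of flex strings and the 'auto' levels is deliberately left out.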
- fold_weight(splits: list[DatetimeSplitBounds], *, unit: str | Literal['rows', 'hours', 'days'] = 'hours', available: Iterable[str | Timestamp | datetime | date | datetime64] | None = None) -> list[DatetimeSplitCounts]
Compute fold weights.
- Parameters:
splits – List of DatetimeSplitBounds.
unit – Time unit of the returned count, or 'rows' (requires available data).
available – Available data. Required when unit='rows'.
- Returns:
A list of tuples [(n_data_units, n_future_data_units), ...].
- Raises:
ValueError – If unit='rows' and available=None.
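To make the return value concrete, here is a minimal standard-library sketch of how such weights could be computed. It is not the library's code: fold_weight_sketch and UNITS are hypothetical names, folds are assumed to be plain (start, mid, end) tuples of datetime objects, data is assumed to cover [start, mid) and future data [mid, end), and time-based counts are assumed to be whole units.

```python
from __future__ import annotations

from datetime import datetime, timedelta
from typing import Iterable

# Only two time units are supported in this sketch.
UNITS = {"hours": timedelta(hours=1), "days": timedelta(days=1)}


def fold_weight_sketch(
    splits: list[tuple[datetime, datetime, datetime]],
    *,
    unit: str = "hours",
    available: Iterable[datetime] | None = None,
) -> list[tuple[int, int]]:
    """Hypothetical stand-in for fold_weight; logic is assumed."""
    if unit == "rows":
        if available is None:
            raise ValueError("unit='rows' requires available data")
        rows = list(available)  # materialize so it can be scanned per fold
        return [
            (
                sum(start <= ts < mid for ts in rows),  # rows in the data range
                sum(mid <= ts < end for ts in rows),  # rows in the future range
            )
            for start, mid, end in splits
        ]

    step = UNITS[unit]
    # Whole units in the data range and in the future data range.
    return [((mid - start) // step, (end - mid) // step) for start, mid, end in splits]
```

For a single fold running from 2022-01-01 to 2022-01-04 with mid at 2022-01-03, this yields (48, 24) for unit="hours" and (2, 1) for unit="days".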
- to_string(bounds: str | Timestamp | datetime | date | datetime64 | DatetimeSplitBounds | tuple[str | Timestamp | datetime | date | datetime64, str | Timestamp | datetime | date | datetime64, str | Timestamp | datetime | date | datetime64], mid: str | Timestamp | datetime | date | datetime64 | None = None, end: str | Timestamp | datetime | date | datetime64 | None = None, /, *, format: str | None = None) -> str
Pretty-print a fold.
('2021-12-30' <= [schedule: '2022-01-04' (Tuesday)] < '2022-01-04 18:00:00')
- Parameters:
bounds – A fold tuple (start, mid, end), or just start (followed by mid and end).
mid – Datetime-like. Must be None when bounds is a tuple.
end – Datetime-like. Must be None when bounds is a tuple.
format – A custom format to use. Uses FOLD_FORMAT if None, but note that only the start, mid and end keys are available to this function.
- Returns:
Formatted bounds string.
- Raises:
TypeError – If an incorrect number of timestamps is given.
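The two calling conventions (one fold tuple, or three separate timestamps) and the start, mid and end format keys can be illustrated with a small sketch. This is an assumption-laden illustration, not the library's code: to_string_sketch and SKETCH_FORMAT are hypothetical names, inputs are assumed to be datetime objects, and SKETCH_FORMAT is a made-up format chosen to resemble the sample output above; the real FOLD_FORMAT is not shown in this section.

```python
from __future__ import annotations

from datetime import datetime

# Hypothetical format string; NOT the library's FOLD_FORMAT.
SKETCH_FORMAT = (
    "('{start:%Y-%m-%d}' <= [schedule: '{mid:%Y-%m-%d}' ({mid:%A})]"
    " < '{end:%Y-%m-%d %H:%M:%S}')"
)


def to_string_sketch(bounds, mid=None, end=None, /, *, format=None) -> str:
    """Hypothetical stand-in for to_string; logic is assumed."""
    if isinstance(bounds, tuple):
        # A fold tuple must be complete, and mid/end must not be repeated.
        if len(bounds) != 3 or mid is not None or end is not None:
            raise TypeError("pass either one (start, mid, end) tuple or three timestamps")
        start, mid, end = bounds
    else:
        if mid is None or end is None:
            raise TypeError("pass either one (start, mid, end) tuple or three timestamps")
        start = bounds

    # Only the start, mid and end keys are made available to the format.
    return (format or SKETCH_FORMAT).format(start=start, mid=mid, end=end)
```

Both to_string_sketch(fold) and to_string_sketch(*fold) produce the same string for a three-element fold tuple. The weekday name comes from the %A directive, so it depends on the active locale.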