rics.ml.time_split.support
Supporting functions.
These functions are used internally, but are also exposed here so that users may build their own logic on top of them, or just to test things out.
Warning
Not part of the stable API.
This module may change without notice. Stick to the top-level rics.ml.time_split-module, or lock down your
dependencies if you need to use the support module.
Functions
expand_limits | Derive the "real" bounds of limits.
fold_weight | Compute fold weights.
to_string | Pretty-print a fold.
- expand_limits(limits: Tuple[Timestamp, Timestamp], *, flex: bool | Literal['auto'] | str | Tuple[str | Timedelta | timedelta | timedelta64, str | Timedelta | timedelta | timedelta64, str | Timedelta | timedelta | timedelta64] | Iterable[Tuple[str | Timedelta | timedelta | timedelta64, str | Timedelta | timedelta | timedelta64, str | Timedelta | timedelta | timedelta64]]) -> Tuple[Timestamp, Timestamp]
Derive the “real” bounds of limits.
Flex options.
Type | Description
True or 'auto' | Auto-flex using auto_flex-settings.
False | Do nothing; return limits unchanged.
str | A string round_to or round_to<tolerance, where round_to is the desired frequency of the limits and tolerance is the maximum amount by which to change the input limits.
list[tuple] | Passing tuples (start_at, round_to, tolerance) will use the largest tuple such that start_at >= limits[1] - limits[0]. Other parameters are interpreted as above.
tuple | Like list[tuple], but with just one level.
Note
Passing flex=[auto_flex.day, auto_flex.hour] is equivalent to flex='auto'.
- Parameters:
limits – A tuple (lo, hi) of timestamps.
flex – See the table above.
- Returns:
Limits rounded according to the flex-argument.
- Raises:
ValueError – For invalid limits.
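The string form of flex described in the table combines a rounding frequency and an optional tolerance, separated by `<`. The snippet below is a minimal sketch of how such a string could be taken apart; `split_flex` is a hypothetical helper written for illustration and is not part of rics.

```python
from typing import Optional, Tuple


def split_flex(flex: str) -> Tuple[str, Optional[str]]:
    """Split 'round_to' or 'round_to<tolerance' into its two parts.

    Hypothetical helper for illustration; not the rics implementation.
    """
    round_to, separator, tolerance = flex.partition("<")
    # Without a '<' there is no tolerance, i.e. no cap on the expansion.
    return round_to.strip(), (tolerance.strip() if separator else None)


print(split_flex("d"))
print(split_flex("d<1h"))
```

The two parts are frequency strings; converting them to timedeltas is left to the caller in this sketch.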
Examples
>>> from pandas import Timestamp
>>> limits = Timestamp("2019-05-11"), Timestamp("2019-05-11 22:05:30")
Basic usage.
>>> expand_limits(limits, flex="d")
(Timestamp('2019-05-11 00:00:00'), Timestamp('2019-05-12 00:00:00'))
You may specify a maximum “distance” that limits may be expanded.
>>> expand_limits(limits, flex="d<1h")
(Timestamp('2019-05-11 00:00:00'), Timestamp('2019-05-11 22:05:30'))
Limits will never be rounded in the “wrong” direction...
>>> limits = Timestamp("2019-05-11"), Timestamp("2019-05-11 11:05:30")
>>> expand_limits(limits, flex="d")
(Timestamp('2019-05-11 00:00:00'), Timestamp('2019-05-11 11:05:30'))
…even if you make the tolerance large enough.
>>> expand_limits(limits, flex="d<14h")
(Timestamp('2019-05-11 00:00:00'), Timestamp('2019-05-11 11:05:30'))
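The behaviour in the examples above (round to the nearest multiple of the frequency, never shrink the range, respect the tolerance) can be approximated with the standard library alone. This is a sketch under stated assumptions, not the rics implementation: `expand_limits_sketch` is a hypothetical name, it takes `timedelta` objects instead of frequency strings, and treating the tolerance as inclusive is a guess.

```python
from datetime import datetime, timedelta


def expand_limits_sketch(lo, hi, round_to, tolerance=None):
    """Illustrative stand-in only; NOT the rics implementation."""

    def nearest(ts):
        # Distance past the previous multiple of round_to, counted from datetime.min.
        offset = (ts - datetime.min) % round_to
        floor = ts - offset
        return floor if offset < round_to / 2 else floor + round_to

    def within(old, new):
        # Assumption: a change exactly equal to the tolerance is allowed.
        return tolerance is None or abs(new - old) <= tolerance

    new_lo, new_hi = nearest(lo), nearest(hi)
    # Only accept candidates that widen the range: lo may move down, hi may move up.
    if not (new_lo <= lo and within(lo, new_lo)):
        new_lo = lo
    if not (new_hi >= hi and within(hi, new_hi)):
        new_hi = hi
    return new_lo, new_hi


day = timedelta(days=1)
limits = datetime(2019, 5, 11), datetime(2019, 5, 11, 22, 5, 30)
print(expand_limits_sketch(*limits, day))
print(expand_limits_sketch(*limits, day, tolerance=timedelta(hours=1)))
```

In the second call the nearest day boundary is almost two hours away from the upper limit, which exceeds the one-hour tolerance, so the limits come back unchanged.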
- fold_weight(splits: List[DatetimeSplitBounds], *, unit: str | Literal['rows', 'hours', 'days'] = 'hours', available: Iterable[str | Timestamp | datetime | date | datetime64] = None) -> List[DatetimeSplitCounts]
Compute fold weights.
- Parameters:
splits – List of DatetimeSplitBounds.
unit – Time unit of the returned count, or 'rows' (requires available data).
available – Available data. Required when unit='rows'.
- Returns:
A list of tuples [(n_data_units, n_future_data_units), ...].
- Raises:
ValueError – If unit='rows' and available=None.
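To make the return value concrete, here is a rough standard-library sketch of the counting described above. It is not the rics implementation: `fold_weight_sketch` is a hypothetical name, folds are assumed to be plain `(start, mid, end)` tuples of `datetime` objects, only 'hours', 'days' and 'rows' are handled, time units are floored to whole numbers, and rows are counted on half-open intervals `[start, mid)` and `[mid, end)`.

```python
from datetime import datetime

SECONDS_PER_UNIT = {"hours": 3600, "days": 86400}  # assumption: only these time units


def fold_weight_sketch(splits, *, unit="hours", available=None):
    """Illustrative stand-in only; NOT the rics implementation."""
    if unit == "rows":
        if available is None:
            raise ValueError("available data is required when unit='rows'")
        rows = list(available)
        # Count rows before and from the mid point of each fold.
        return [
            (sum(start <= ts < mid for ts in rows), sum(mid <= ts < end for ts in rows))
            for start, mid, end in splits
        ]
    seconds = SECONDS_PER_UNIT[unit]
    # Whole units of data (start to mid) and future data (mid to end).
    return [
        (int((mid - start).total_seconds() // seconds), int((end - mid).total_seconds() // seconds))
        for start, mid, end in splits
    ]


fold = (datetime(2021, 12, 30), datetime(2022, 1, 4), datetime(2022, 1, 4, 18))
print(fold_weight_sketch([fold]))  # five days of data, 18 hours of future data
```

Counting rows instead of time is useful when the data is unevenly spread, since two folds of equal duration may hold very different amounts of data.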
- to_string(bounds: str | Timestamp | datetime | date | datetime64 | DatetimeSplitBounds | Tuple[str | Timestamp | datetime | date | datetime64, str | Timestamp | datetime | date | datetime64, str | Timestamp | datetime | date | datetime64], mid: str | Timestamp | datetime | date | datetime64 | None = None, end: str | Timestamp | datetime | date | datetime64 | None = None, /, *, format: str = None) -> str
Pretty-print a fold.
Sample output.
('2021-12-30' <= [schedule: '2022-01-04' (Tuesday)] < '2022-01-04 18:00:00')
- Parameters:
bounds – A fold tuple (start, mid, end), or just start (followed by mid and end).
mid – Datetime-like. Must be None when bounds is a tuple.
end – Datetime-like. Must be None when bounds is a tuple.
format – A custom format to use. Use FOLD_FORMAT if None, but note that only the start, mid and end keys are available to this function.
- Returns:
Formatted bounds string.
- Raises:
TypeError – If an incorrect number of timestamps is given.
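The two calling conventions (one fold tuple, or three separate timestamps) and the start, mid and end format keys can be illustrated with a small standard-library sketch. Everything here is an assumption made for illustration: `to_string_sketch` and `SKETCH_FORMAT` are hypothetical names, the format string is hand-written to reproduce the sample output and is not the real FOLD_FORMAT, and only `datetime` inputs are handled.

```python
from datetime import datetime

# Hypothetical format; the real FOLD_FORMAT is not shown in this document.
SKETCH_FORMAT = (
    "('{start:%Y-%m-%d}' <= [schedule: '{mid:%Y-%m-%d}' ({mid:%A})]"
    " < '{end:%Y-%m-%d %H:%M:%S}')"
)


def to_string_sketch(bounds, mid=None, end=None, /, *, format=SKETCH_FORMAT):
    """Illustrative stand-in only; NOT the rics implementation."""
    if isinstance(bounds, tuple):
        # A fold tuple already carries all three timestamps.
        if mid is not None or end is not None or len(bounds) != 3:
            raise TypeError("expected a single (start, mid, end) tuple")
        start, mid, end = bounds
    else:
        if mid is None or end is None:
            raise TypeError("expected exactly three timestamps")
        start = bounds
    # Only the start, mid and end keys are available to the format string.
    return format.format(start=start, mid=mid, end=end)


fold = (datetime(2021, 12, 30), datetime(2022, 1, 4), datetime(2022, 1, 4, 18))
print(to_string_sketch(fold))
print(to_string_sketch(*fold, format="{start:%d %b} to {end:%d %b}"))
```

Because the keys are `datetime` objects, a custom format can use any strftime directive in the format spec, as in the second call.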