rics.ml.time_split.integration.split_data
Base implementations for splitting generic data types.
Users may add splitting support for any data type by implementing suitable as_available and select functions.
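Module Attributes
DataT | Type of data to split.
DataAsAvailableFn | A callable (data: DataT) -> DatetimeIterable.
DataSelectFn | A callable (data: DataT, left_inclusive: datetime, end_exclusive: datetime) -> DataT.
Functions
split_data | Base implementation for splitting integrated data types.
Classes
DatetimeSplit | Time-based split of a generic data type.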
- class DataT
Type of data to split.
alias of TypeVar('DataT')
- DataAsAvailableFn
A callable (data: DataT) -> DatetimeIterable.
alias of Callable[[DataT], Iterable[str | Timestamp | datetime | date | datetime64]]
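As an illustration of the as_available signature above, here is a minimal sketch for a made-up data type. The `Records` alias and the `records_as_available` function are hypothetical names chosen for this example and are not part of the library; the only thing taken from the documentation is the shape `(data: DataT) -> DatetimeIterable`.

```python
from datetime import datetime
from typing import Iterable

# Hypothetical data type for illustration: a list of (timestamp, value) records.
Records = list[tuple[datetime, float]]


def records_as_available(data: Records) -> Iterable[datetime]:
    """Return the timestamp of every record, in input order."""
    return [timestamp for timestamp, _ in data]


records: Records = [
    (datetime(2022, 1, 1), 1.0),
    (datetime(2022, 1, 2), 2.0),
    (datetime(2022, 1, 3), 3.0),
]
available = list(records_as_available(records))
```

A function with this shape could presumably be passed as the as_available keyword argument of split_data; how the library consumes the returned timestamps is not shown here.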
- DataSelectFn
A callable (data: DataT, left_inclusive: datetime, end_exclusive: datetime) -> DataT.
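A select function with this signature might look like the sketch below. It assumes, based only on the parameter names, that the range is half-open: left_inclusive is kept and end_exclusive is dropped. The `Records` alias and `select_records` are hypothetical names used for illustration.

```python
from datetime import datetime

# Hypothetical data type for illustration: a list of (timestamp, value) records.
Records = list[tuple[datetime, float]]


def select_records(
    data: Records, left_inclusive: datetime, end_exclusive: datetime
) -> Records:
    """Keep records whose timestamp t satisfies left_inclusive <= t < end_exclusive."""
    return [
        (timestamp, value)
        for timestamp, value in data
        if left_inclusive <= timestamp < end_exclusive
    ]


records: Records = [(datetime(2022, 1, day), float(day)) for day in range(1, 5)]
selected = select_records(records, datetime(2022, 1, 2), datetime(2022, 1, 4))
```

Here `selected` holds the records for 2 and 3 January; the record for 4 January falls on the exclusive end and is left out.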
- class DatetimeSplit(data: DataT, future_data: DataT, bounds: DatetimeSplitBounds)[source]
Bases: NamedTuple, Generic[DataT]
Time-based split of a generic data type.
- bounds: DatetimeSplitBounds
The underlying bounds that produced this split.
- split_data(data: DataT, *, log_progress: str | bool | dict[str, Any] | Logger | LoggerAdapter = False, as_available: Callable[[DataT], Iterable[str | Timestamp | datetime | date | datetime64]], select: Callable[[DataT, datetime, datetime], DataT], **kwargs: Unpack[DatetimeIndexSplitterKwargs]) -> Iterable[DatetimeSplit[DataT]][source]
Base implementation for splitting integrated data types.
The required as_available and select callables perform the actual integration.
- Parameters:
data – The data to split.
log_progress – Controls logging of fold progress. See log_split_progress() for details.
as_available – A callable (data: DataT) -> DatetimeIterable.
select – A callable (data: DataT, left_inclusive: datetime, end_exclusive: datetime) -> DataT.
**kwargs – Keyword arguments for the split() function.
- Yields:
Tuples (data, future_data, bounds).
See also
To get started with your own integration, copy split_pandas() or split_polars() and use it as the baseline (click [source] on the linked function).
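To make the division of labour between the two callables concrete, the sketch below is a simplified stand-in, not the library's implementation. It assumes that each fold has start, mid and end timestamps, that data covers [start, mid) and that future_data covers [mid, end). `Bounds`, `toy_split_data`, `schedule`, `before` and `after` are hypothetical names, and the rule for skipping folds is a toy rule chosen for illustration.

```python
from datetime import datetime, timedelta
from typing import Callable, Iterable, Iterator, NamedTuple, TypeVar

DataT = TypeVar("DataT")


class Bounds(NamedTuple):
    """Hypothetical stand-in for DatetimeSplitBounds."""

    start: datetime
    mid: datetime
    end: datetime


def toy_split_data(
    data: DataT,
    schedule: Iterable[datetime],
    *,
    before: timedelta,
    after: timedelta,
    as_available: Callable[[DataT], Iterable[datetime]],
    select: Callable[[DataT, datetime, datetime], DataT],
) -> Iterator[tuple[DataT, DataT, Bounds]]:
    """Yield (data, future_data, bounds) tuples for each usable fold."""
    available = sorted(as_available(data))
    lo, hi = available[0], available[-1]
    for mid in schedule:
        bounds = Bounds(mid - before, mid, mid + after)
        # Toy rule: skip folds that extend beyond the available data range.
        if bounds.start < lo or bounds.end > hi:
            continue
        yield (
            select(data, bounds.start, bounds.mid),  # data: [start, mid)
            select(data, bounds.mid, bounds.end),  # future_data: [mid, end)
            bounds,
        )


# The data type here is simply a list of timestamps, 1 to 10 January 2022.
days = [datetime(2022, 1, day) for day in range(1, 11)]
folds = list(
    toy_split_data(
        days,
        [datetime(2022, 1, 3), datetime(2022, 1, 5), datetime(2022, 1, 12)],
        before=timedelta(days=2),
        after=timedelta(days=1),
        as_available=lambda data: data,
        select=lambda data, left, end: [t for t in data if left <= t < end],
    )
)
```

The third scheduled time falls outside the available range, so only two folds are produced. The generic logic never inspects the data itself; everything specific to the data type lives in the two callables, which is what allows an integration to be written by supplying only as_available and select.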