  • Stars: 263
  • Rank: 154,738 (Top 4 %)
  • Language: Python
  • License: MIT License
  • Created about 6 years ago
  • Updated over 2 years ago

Repository Details

Scikit-learn style cross-validation classes for time series data

timeseriescv

This package implements two cross-validation algorithms for evaluating machine learning models on time series datasets in which each sample is tagged with a prediction time and an evaluation time.

Installation

timeseriescv can be installed using pip:

$ pip install timeseriescv

Content

The package currently contains two main cross-validation classes:

  • PurgedWalkForwardCV: Walk-forward cross-validation with purging.
  • CombPurgedKFoldCV: Combinatorial cross-validation with purging and embargoing.
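
To make the terms concrete, here is a minimal sketch of what purging and embargoing mean for a single test block. It is not the package's implementation: the helper name purge_and_embargo and its parameters are hypothetical, and it assumes that a training sample must be dropped when its prediction-to-evaluation interval overlaps the test window, or when it is predicted within the embargo period after that window.

```python
import numpy as np
import pandas as pd


def purge_and_embargo(pred_times, eval_times, test_idx, embargo=pd.Timedelta(0)):
    """Hypothetical helper: positional train indices kept after purging and embargoing."""
    # The test window runs from the first test prediction to the last test evaluation.
    test_start = pred_times.iloc[test_idx].min()
    test_end = eval_times.iloc[test_idx].max()

    positions = np.arange(len(pred_times))
    is_test = np.isin(positions, test_idx)

    # Purging: a sample before the test window is kept only if it is fully
    # evaluated before the window opens.
    before = (eval_times < test_start).to_numpy()
    # Embargoing: a sample after the test window is kept only if it is predicted
    # later than the window end plus the embargo period.
    after = (pred_times > test_end + embargo).to_numpy()

    return positions[~is_test & (before | after)]


# Ten daily samples, each evaluated two days after its prediction time.
index = pd.RangeIndex(10)
pred_times = pd.Series(pd.date_range("2020-01-01", periods=10, freq="D"), index=index)
eval_times = pred_times + pd.Timedelta(days=2)

test_idx = np.array([4, 5])
train_idx = purge_and_embargo(pred_times, eval_times, test_idx, embargo=pd.Timedelta(days=1))
print(train_idx)
```

In this toy setup, samples 2 and 3 are purged because they are evaluated after the test window opens, and samples 6 to 8 are dropped because they are predicted before the test window plus the one-day embargo has elapsed.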

Remarks concerning the API

The API follows the scikit-learn API as closely as possible. As in the scikit-learn cross-validation classes, the split method is a generator that yields pairs of NumPy arrays containing the positional indices of the samples in the train and validation sets, respectively. The main differences from the scikit-learn API are:

  • The split method takes as arguments not only the predictor values X, but also the prediction times pred_times and the evaluation times eval_times of each sample.
  • To stay as close to the scikit-learn API as possible, these are passed as separate parameters. To ensure that they are properly aligned, X, pred_times and eval_times must be pandas DataFrames/Series sharing the same index.
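
The interface described above can be sketched with a small stand-in class. SketchWalkForwardCV is a hypothetical name and is not part of timeseriescv. The sketch only mirrors the documented shape of split (X, pred_times and eval_times sharing one index, with pairs of positional index arrays yielded) and assumes the samples are already sorted by prediction time. For the real constructor arguments and behaviour, refer to the docstrings of the package's classes.

```python
import numpy as np
import pandas as pd


class SketchWalkForwardCV:
    """Hypothetical walk-forward splitter with purging, for illustration only."""

    def __init__(self, n_splits=5):
        self.n_splits = n_splits

    def split(self, X, y=None, pred_times=None, eval_times=None):
        # The three inputs must share one index so that rows stay aligned.
        if not (X.index.equals(pred_times.index) and X.index.equals(eval_times.index)):
            raise ValueError("X, pred_times and eval_times must share the same index")

        positions = np.arange(len(X))
        folds = np.array_split(positions, self.n_splits)

        # Walk forward: each fold after the first is validated on in turn,
        # using only earlier samples for training.
        for k in range(1, self.n_splits):
            val_idx = folds[k]
            val_start = pred_times.iloc[val_idx].min()
            earlier = positions[: val_idx[0]]
            # Purging: drop earlier samples not yet evaluated when validation starts.
            keep = (eval_times.iloc[earlier] < val_start).to_numpy()
            yield earlier[keep], val_idx


# Twelve daily samples, each evaluated one day after its prediction time.
index = pd.date_range("2021-01-01", periods=12, freq="D")
X = pd.DataFrame({"feature": np.arange(12.0)}, index=index)
pred_times = pd.Series(index, index=index)
eval_times = pred_times + pd.Timedelta(days=1)

cv = SketchWalkForwardCV(n_splits=4)
splits = list(cv.split(X, pred_times=pred_times, eval_times=eval_times))
for train_idx, val_idx in splits:
    print(train_idx, val_idx)
```

Because the yielded arrays are positional, they would be used with X.iloc[train_idx] and X.iloc[val_idx] rather than with label-based indexing.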

Check the docstrings of the cross-validation classes for more information.