TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.


Auto Model Specification with ML Techniques for Time Series

How to automatically select the best trend, seasonal, and autoregressive representations for time series using the Python library scalecast

Michael Keith
Published in TDS Archive
6 min read · Oct 4, 2022
Photo by Jake Hills on Unsplash

Several methods exist for finding the best model specification for a time series, depending on the model being employed. For ARIMA models, a popular method is to monitor information criteria while searching through different AR, I, and MA orders. This has proven to be an effective technique, and popular libraries in R and Python offer auto-ARIMA models for users to experiment with. Similar methods can be used for other classical statistical time series methods, such as Holt-Winters Exponential Smoothing and TBATS.
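To make the information-criterion search concrete, here is a minimal sketch restricted to pure AR orders (no differencing or MA terms), fit by ordinary least squares with NumPy. The helper names `fit_ar_aic` and `select_ar_order`, the simulated series, and the Gaussian form of the AIC are assumptions made for illustration. Real auto-ARIMA implementations search more dimensions and typically fit by maximum likelihood.

```python
import numpy as np


def fit_ar_aic(y, p, max_p):
    """Fit an AR(p) model with intercept by OLS and return its AIC.

    Targets start at index max_p so that every candidate order is
    scored on exactly the same observations.
    """
    target = y[max_p:]
    n = len(target)
    # Column k holds the series lagged by k steps, aligned with target.
    cols = [np.ones(n)] + [y[max_p - k: len(y) - k] for k in range(1, p + 1)]
    X = np.column_stack(cols)
    coefs, *_ = np.linalg.lstsq(X, target, rcond=None)
    rss = float(np.sum((target - X @ coefs) ** 2))
    n_params = p + 1  # p lag coefficients plus the intercept
    # Gaussian AIC up to an additive constant.
    return n * np.log(rss / n) + 2 * n_params


def select_ar_order(y, max_p=6):
    """Return the order with the lowest AIC and the AIC of every candidate."""
    aics = {p: fit_ar_aic(y, p, max_p) for p in range(max_p + 1)}
    best_p = min(aics, key=aics.get)
    return best_p, aics


# Hypothetical data: a simulated AR(2) process.
rng = np.random.default_rng(0)
n_obs = 400
noise = rng.normal(size=n_obs)
y = np.zeros(n_obs)
for t in range(2, n_obs):
    y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + noise[t]

best_p, aics = select_ar_order(y)
print("selected AR order:", best_p)
```

Scoring every order on a common sample matters here: information criteria are only comparable across models fit to the same observations.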

For machine learning models, specification can be more complicated. Apart from complex deep learning models (such as N-BEATS, N-HiTS, and a few others), few automated pure ML methods consistently outperform the classical models (Makridakis et al., 2020).

The Python library scalecast offers a function called auto_Xvar_select() that can be used to automatically select the best trend, seasonality, and look-back representations (or lags) for any given series using models from scikit-learn.

pip install --upgrade scalecast

The function works by first searching, separately, for the ideal representation of the series’ trend, then its seasonality, then its look-back. “Ideal” in this context means minimizing some out-of-sample error (or maximizing R²) with a selected model (multiple linear regression, or MLR, by default). Once each of these has been found separately, the function searches for the ideal combination of those representations, with the option to consider irregular cycles and other regressors as the user sees fit.
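The two-stage search described above can be sketched in plain NumPy. This is not scalecast's implementation: the candidate lists, the helper names (`holdout_rmse`, `auto_select`, and the feature builders), the one-step-ahead holdout evaluation, and the synthetic series are all assumptions chosen to keep the example short, and ordinary least squares stands in for the default MLR estimator.

```python
import itertools
import numpy as np


def trend_features(n, kind):
    t = np.arange(n, dtype=float) / n  # scaled time index
    return {"none": [], "linear": [t], "quadratic": [t, t ** 2]}[kind]


def seasonal_features(n, kind, period):
    t = np.arange(n)
    if kind == "sincos":
        angle = 2 * np.pi * t / period
        return [np.sin(angle), np.cos(angle)]
    if kind == "dummies":
        # period - 1 dummies, because the model already has an intercept
        return [(t % period == s).astype(float) for s in range(1, period)]
    return []


def lag_features(y, n_lags):
    cols = []
    for k in range(1, n_lags + 1):
        col = np.full(len(y), np.nan)
        col[k:] = y[:-k]  # value observed k steps earlier
        cols.append(col)
    return cols


def holdout_rmse(y, trend, season, n_lags, period=12, max_lags=6, test_len=24):
    """Fit OLS on the training rows and return one-step-ahead RMSE on the holdout."""
    n = len(y)
    cols = ([np.ones(n)] + trend_features(n, trend)
            + seasonal_features(n, season, period) + lag_features(y, n_lags))
    # Drop the first max_lags rows so all candidates share one sample
    # and no NaN lag values remain.
    X = np.column_stack(cols)[max_lags:]
    target = y[max_lags:]
    coefs, *_ = np.linalg.lstsq(X[:-test_len], target[:-test_len], rcond=None)
    errors = target[-test_len:] - X[-test_len:] @ coefs
    return float(np.sqrt(np.mean(errors ** 2)))


def auto_select(y, max_lags=6):
    # Stage 1: search each component on its own.
    best_trend = min(["none", "linear", "quadratic"],
                     key=lambda tr: holdout_rmse(y, tr, "none", 0))
    best_season = min(["none", "sincos", "dummies"],
                      key=lambda se: holdout_rmse(y, "none", se, 0))
    best_lags = min(range(max_lags + 1),
                    key=lambda la: holdout_rmse(y, "none", "none", la))
    # Stage 2: try every on/off combination of the stage-1 winners.
    combos = itertools.product(
        dict.fromkeys(["none", best_trend]),
        dict.fromkeys(["none", best_season]),
        dict.fromkeys([0, best_lags]),
    )
    scored = [(holdout_rmse(y, tr, se, la), tr, se, la) for tr, se, la in combos]
    rmse, tr, se, la = min(scored, key=lambda item: item[0])
    return {"trend": tr, "season": se, "lags": la, "rmse": rmse}


# Hypothetical monthly-style series: linear trend + yearly cycle + noise.
rng = np.random.default_rng(1)
t = np.arange(240)
y = 0.05 * t + 2 * np.sin(2 * np.pi * t / 12) + rng.normal(scale=0.5, size=240)

result = auto_select(y)
print(result)
```

Because the second stage always includes the combination with no trend, no seasonality, and no lags, the selected specification can never score worse on the holdout than an intercept-only model.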

Image by author — how scalecast automatically selects regressors for forecasting models using the auto_Xvar_select() function

When applied to the 100,000 series from the M4 competition, the function returns results whose accuracy varies with the series’ frequency. For the hourly frequency group, an OWA of under 0.6 using each of the KNN, LightGBM…



Written by Michael Keith

Data Scientist and Python developer. Check out scalecast: https://github.com/mikekeith52/scalecast
