
Introducing Modeltime Ensemble: Time Series Forecast Stacking

Written by Matt Dancho on October 13, 2020


I’m SUPER EXCITED to introduce modeltime.ensemble, the time series ensemble extension to modeltime. This tutorial (view original article) introduces our new R package, Modeltime Ensemble, which makes it easy to perform stacked forecasts that improve forecast accuracy. If you like what you see, I have an Advanced Time Series Course where you’ll become the time series expert for your organization by learning modeltime, modeltime.ensemble, and timetk.

Forecasting and Time Series Software Announcements

Articles on the modeltime and timetk forecasting and time series ecosystem.


Like these articles?

👉 Register to stay in the know 👈 on new cutting-edge R software like modeltime.


Three months ago I released modeltime, a new R package that speeds up forecasting experimentation and model selection with Machine Learning (e.g. XGBoost, GLMNET, Prophet, Prophet Boost, ARIMA, and ARIMA Boost).

Fast-forward to now. I’m thrilled to announce the first extension to Modeltime: modeltime.ensemble.

Modeltime Ensemble is a cutting-edge package that integrates three competition-winning time series ensembling techniques:

  1. Super Learners (Meta-Learners): Use modeltime_fit_resamples() and ensemble_model_spec() to create super learners (models that learn from the predictions of sub-models).

  2. Weighted Ensembles: Use ensemble_weighted() to create weighted ensembles.

  3. Average Ensembles: Use ensemble_average() to build simple mean and median ensembles.

High-Performance Forecasting Stacks

Using modeltime.ensemble, you can build high-performance forecasting stacks. Here’s a Multi-Level Stack, which won the Kaggle Grupo Bimbo Inventory Demand Forecasting Competition (I teach this technique in my High-Performance Time Series Forecasting Course).

The Multi-Level Stacked Ensemble that won the Kaggle Grupo Bimbo Inventory Demand Challenge

Today, I’ll cover forecasting Product Sales with Average and Weighted Ensembles, which are fast to implement and can perform well (though super learners tend to perform better).

Weighted Stacking with Modeltime Ensemble

Ensemble Key Concepts:

The idea is that we have multiple sub-models (Level 1) that make predictions. We can then take these predictions and combine them using a simple average (mean), a median, or a weighted average:

  • Simple Average: Weights all models equally. Takes the mean prediction for each timestamp. Use ensemble_average(type = "mean").
  • Median Average: No weighting. Takes the middle (median) prediction for each timestamp. Use ensemble_average(type = "median").
  • Weighted Average: The user defines the weights (loadings). Applies a weighted average at each timestamp. Use ensemble_weighted(loadings = c(1, 2, 3, 4)). See the short sketch after this list for how loadings translate into weights.
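
To make the loadings concrete, here is a tiny base-R sketch (with made-up predictions) of what a weighted average does at a single timestamp. I believe ensemble_weighted() scales the loadings to proportions by default (via its scale_loadings argument), which is what the sketch assumes.

# Hypothetical sub-model predictions for a single timestamp
preds    <- c(1050, 1010, 980, 1100)
loadings <- c(1, 2, 3, 4)            # user-defined loadings

# Scale the loadings to proportions, then take the weighted average
weights <- loadings / sum(loadings)  # 0.1, 0.2, 0.3, 0.4
sum(weights * preds)                 # weighted ensemble prediction: 1041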

More Advanced Ensembles:

The average and weighted ensembles are the simplest approaches to ensembling. One technique that Modeltime Ensemble has built in is Super Learners. We won’t cover these in this tutorial, but I teach them in my High-Performance Time Series Course. 💪

Install modeltime.ensemble.

install.packages("modeltime.ensemble")

Load the following libraries.

# Time Series Modeling and Machine Learning
library(tidymodels)
library(modeltime)
library(modeltime.ensemble)
# Time Series and Data Wrangling
library(timetk)
library(tidyverse)

Our business objective is to forecast the next 12 weeks of Product Sales given 2 years of sales history.

We’ll start with the walmart_sales_weekly time series data set, which includes Walmart product transactions from several stores and is a small sample of the dataset from the Kaggle Walmart Recruiting – Store Sales Forecasting competition. We’ll simplify the data set to a univariate time series with columns “Date” and “Weekly_Sales” from Store 1 and Department 1.

store_1_1_tbl <- walmart_sales_weekly %>%
    filter(id == "1_1") %>%
    select(Date, Weekly_Sales)
store_1_1_tbl
## # A tibble: 143 x 2
##    Date       Weekly_Sales
##    <date>            <dbl>
##  1 2010-02-05       24924.
##  2 2010-02-12       46039.
##  3 2010-02-19       41596.
##  4 2010-02-26       19404.
##  5 2010-03-05       21828.
##  6 2010-03-12       21043.
##  7 2010-03-19       22137.
##  8 2010-03-26       26229.
##  9 2010-04-02       57258.
## 10 2010-04-09       42961.
## # … with 133 more rows

Next, visualize the dataset with the plot_time_series() function. Toggle .interactive = TRUE to get an interactive plotly plot; FALSE returns a static ggplot2 plot.

store_1_1_tbl %>%
    plot_time_series(Date, Weekly_Sales, .smooth_period = "3 months", .interactive = FALSE)

(Plot: weekly sales over time for Store 1, Department 1, with a 3-month smoother)

Let’s do a quick seasonality evaluation to home in on important features using plot_seasonal_diagnostics().

store_1_1_tbl %>%
    plot_seasonal_diagnostics(
        Date, Weekly_Sales,
        .feature_set = c("week", "month.lbl"),
        .interactive = FALSE
    )

(Plot: seasonal diagnostics showing weekly and monthly sales distributions)

We can see that certain weeks and months of the year have higher sales. These anomalies are likely due to events. The Kaggle competition informed competitors that the Super Bowl, Labor Day, Thanksgiving, and Christmas were special holidays. Week number and month may be good features for approximating these events. Let’s come back to this when we preprocess our data.
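
If you wanted to encode those holidays explicitly rather than approximating them with week number and month, one option (not used in this tutorial) is recipes::step_holiday(), which adds indicator columns derived from a date feature. The holiday names below come from timeDate::listHolidays() and are my own guess at a reasonable set, so treat this as an optional sketch rather than part of the recipe we build next.

# Optional sketch: explicit holiday indicator columns
recipe(Weekly_Sales ~ Date, store_1_1_tbl) %>%
    step_holiday(
        Date,
        holidays = c("USLaborDay", "USThanksgivingDay", "ChristmasDay")
    )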

Given the objective to forecast 12 weeks of product sales, we use time_series_split() to make a train/test set consisting of 12 weeks of test data (hold-out) and the rest for training.

  • Setting assess = "12 weeks" tells the function to use the last 12 weeks of data as the testing set.
  • Setting cumulative = TRUE tells the sampling to use all of the prior data as the training set.
splits <- store_1_1_tbl %>%
    time_series_split(assess = "12 weeks", cumulative = TRUE)

Next, visualize the train/test split.

  • tk_time_series_cv_plan(): Converts the splits object to a data frame.
  • plot_time_series_cv_plan(): Plots the time series sampling data using the “date” and “value” columns.
splits %>%
    tk_time_series_cv_plan() %>%
    plot_time_series_cv_plan(Date, Weekly_Sales, .interactive = FALSE)

(Plot: train/test split, with the final 12 weeks held out as the testing set)

We’ll make a number of calendar features using recipes. Most of the heavy lifting is done by timetk::step_timeseries_signature(), which generates a series of common time series features. We remove the ones that won’t help. After dummying, we have 74 total columns, 72 of which are engineered calendar features.

recipe_spec <- recipe(Weekly_Sales ~ Date, store_1_1_tbl) %>%
    step_timeseries_signature(Date) %>%
    step_rm(matches("(iso$)|(xts$)|(day)|(hour)|(min)|(sec)|(am.pm)")) %>%
    step_mutate(Date_week = factor(Date_week, ordered = TRUE)) %>%
    step_dummy(all_nominal()) %>%
    step_normalize(contains("index.num"), Date_year)
recipe_spec %>% prep() %>% juice()
## # A tibble: 143 x 74
##    Date       Weekly_Sales Date_index.num Date_year Date_half Date_quarter
##    <date>            <dbl>          <dbl>     <dbl>     <int>        <int>
##  1 2010-02-05       24924.          -1.71     -1.21         1            1
##  2 2010-02-12       46039.          -1.69     -1.21         1            1
##  3 2010-02-19       41596.          -1.67     -1.21         1            1
##  4 2010-02-26       19404.          -1.64     -1.21         1            1
##  5 2010-03-05       21828.          -1.62     -1.21         1            1
##  6 2010-03-12       21043.          -1.59     -1.21         1            1
##  7 2010-03-19       22137.          -1.57     -1.21         1            1
##  8 2010-03-26       26229.          -1.54     -1.21         1            1
##  9 2010-04-02       57258.          -1.52     -1.21         1            2
## 10 2010-04-09       42961.          -1.50     -1.21         1            2
## # … with 133 more rows, and 68 more variables: Date_month <int>,
## #   Date_mweek <int>, Date_week2 <int>, Date_week3 <int>, Date_week4 <int>,
## #   Date_month.lbl_01 <dbl>, Date_month.lbl_02 <dbl>, Date_month.lbl_03 <dbl>,
## #   Date_month.lbl_04 <dbl>, Date_month.lbl_05 <dbl>, Date_month.lbl_06 <dbl>,
## #   Date_month.lbl_07 <dbl>, Date_month.lbl_08 <dbl>, Date_month.lbl_09 <dbl>,
## #   Date_month.lbl_10 <dbl>, Date_month.lbl_11 <dbl>, Date_week_01 <dbl>,
## #   Date_week_02 <dbl>, Date_week_03 <dbl>, Date_week_04 <dbl>,
## #   Date_week_05 <dbl>, Date_week_06 <dbl>, Date_week_07 <dbl>,
## #   Date_week_08 <dbl>, Date_week_09 <dbl>, Date_week_10 <dbl>,
## #   Date_week_11 <dbl>, Date_week_12 <dbl>, Date_week_13 <dbl>,
## #   Date_week_14 <dbl>, Date_week_15 <dbl>, Date_week_16 <dbl>,
## #   Date_week_17 <dbl>, Date_week_18 <dbl>, Date_week_19 <dbl>,
## #   Date_week_20 <dbl>, Date_week_21 <dbl>, Date_week_22 <dbl>,
## #   Date_week_23 <dbl>, Date_week_24 <dbl>, Date_week_25 <dbl>,
## #   Date_week_26 <dbl>, Date_week_27 <dbl>, Date_week_28 <dbl>,
## #   Date_week_29 <dbl>, Date_week_30 <dbl>, Date_week_31 <dbl>,
## #   Date_week_32 <dbl>, Date_week_33 <dbl>, Date_week_34 <dbl>,
## #   Date_week_35 <dbl>, Date_week_36 <dbl>, Date_week_37 <dbl>,
## #   Date_week_38 <dbl>, Date_week_39 <dbl>, Date_week_40 <dbl>,
## #   Date_week_41 <dbl>, Date_week_42 <dbl>, Date_week_43 <dbl>,
## #   Date_week_44 <dbl>, Date_week_45 <dbl>, Date_week_46 <dbl>,
## #   Date_week_47 <dbl>, Date_week_48 <dbl>, Date_week_49 <dbl>,
## #   Date_week_50 <dbl>, Date_week_51 <dbl>, Date_week_52 <dbl>

Now for the fun part! Let’s make some models using functions from modeltime and parsnip.

Auto ARIMA

Here’s the basic Auto ARIMA Model.

  • Model Spec: arima_reg() <– This sets up your general model algorithm and key parameters.
  • Set Engine: set_engine("auto_arima") <– This selects the specific package-function to use, and you can add any function-level arguments here.
  • Fit Model: fit(Weekly_Sales ~ Date, training(splits)) <– All Modeltime Models require a date column as a regressor.
model_fit_arima <- arima_reg(seasonal_period = 52) %>%
    set_engine("auto_arima") %>%
    fit(Weekly_Sales ~ Date, training(splits))
model_fit_arima
## parsnip model object
## 
## Fit time:  206ms 
## Series: outcome 
## ARIMA(0,0,1)(0,1,0)[52] 
## 
## Coefficients:
##          ma1
##       0.6704
## s.e.  0.0767
## 
## sigma^2 estimated as 60063672:  log likelihood=-819.37
## AIC=1642.74   AICc=1642.9   BIC=1647.48

Elastic Net

Making an Elastic Net model is easy. Just set up your model spec using linear_reg() and set_engine("glmnet"). Note that we have not fitted the model yet (as we did in previous steps).

model_spec_glmnet <- linear_reg(penalty = 0.01, mixture = 0.5) %>%
    set_engine("glmnet")

Next, make a fitted workflow:

  • Start with a workflow()
  • Add a Model Spec: add_model(model_spec_glmnet)
  • Add Preprocessing: add_recipe(recipe_spec %>% step_rm(Date)) <– Note that I’m removing the “Date” column since Machine Learning algorithms don’t typically know how to deal with date or date-time features.
  • Fit the Workflow: fit(training(splits))
wflw_fit_glmnet <- workflow() %>%
    add_model(model_spec_glmnet) %>%
    add_recipe(recipe_spec %>% step_rm(Date)) %>%
    fit(training(splits))

XGBoost

We can fit an XGBoost model using the same process as the Elastic Net.

model_spec_xgboost <- boost_tree() %>%
    set_engine("xgboost")
set.seed(123)
wflw_fit_xgboost <- workflow() %>%
    add_model(model_spec_xgboost) %>%
    add_recipe(recipe_spec %>% step_rm(Date)) %>%
    fit(training(splits))

NNETAR

We can use an NNETAR model. Note that add_recipe() uses the full recipe (with the Date column) because this is a Modeltime Model.

model_spec_nnetar <- nnetar_reg(
        seasonal_period = 52,
        non_seasonal_ar = 4,
        seasonal_ar     = 1
    ) %>%
    set_engine("nnetar")
set.seed(123)
wflw_fit_nnetar <- workflow() %>%
    add_model(model_spec_nnetar) %>%
    add_recipe(recipe_spec) %>%
    fit(training(splits))

Prophet w/ Regressors

We’ll build a Prophet Model with Regressors. This uses the Facebook Prophet forecasting algorithm and adds all 72 features as regressors to the model. Note – because this is a Modeltime Model, we need a date feature in the recipe.

model_spec_prophet <- prophet_reg(
      seasonality_yearly = TRUE
    ) %>%
    set_engine("prophet") 
wflw_fit_prophet <- workflow() %>%
    add_model(model_spec_prophet) %>%
    add_recipe(recipe_spec) %>%
    fit(training(splits))

Let’s check our progress so far. We have 5 models. We’ll put them into a Modeltime Table to organize them using modeltime_table().

submodels_tbl <- modeltime_table(
    model_fit_arima,
    wflw_fit_glmnet,
    wflw_fit_xgboost,
    wflw_fit_nnetar,
    wflw_fit_prophet
)
submodels_tbl
## # Modeltime Table
## # A tibble: 5 x 3
##   .model_id .model     .model_desc            
##       <int> <list>     <chr>                  
## 1         1 <fit[+]>   ARIMA(0,0,1)(0,1,0)[52]
## 2         2 <workflow> GLMNET                 
## 3         3 <workflow> XGBOOST                
## 4         4 <workflow> NNAR(4,1,10)[52]       
## 5         5 <workflow> PROPHET W/ REGRESSORS

We can get the accuracy on the hold-out set using modeltime_accuracy() and table_modeltime_accuracy(). The best model is the Prophet with Regressors with an MAE of 1031.

submodels_tbl %>% 
    modeltime_accuracy(testing(splits)) %>%
    table_modeltime_accuracy(.interactive = FALSE)

.model_id  .model_desc              .type  mae      mape   mase  smape  rmse     rsq
1          ARIMA(0,0,1)(0,1,0)[52]  Test   1359.99   6.77  1.02   6.93  1721.47  0.95
2          GLMNET                   Test   1222.38   6.47  0.91   6.73  1349.88  0.98
3          XGBOOST                  Test   1089.56   5.22  0.82   5.20  1266.62  0.96
4          NNAR(4,1,10)[52]         Test   2529.92  11.68  1.89  10.73  3507.55  0.93
5          PROPHET W/ REGRESSORS    Test   1031.53   5.13  0.77   5.22  1226.80  0.98

And, we can visualize the forecasts with modeltime_forecast() and plot_modeltime_forecast().

submodels_tbl %>%
    modeltime_forecast(
        new_data    = testing(splits),
        actual_data = store_1_1_tbl
    ) %>%
    plot_modeltime_forecast(.interactive = FALSE)

(Plot: sub-model forecasts versus actuals over the 12-week test window)

We’ll make Average, Median, and Weighted Ensembles. If you are interested in making Super Learners (meta-learner models that leverage sub-model predictions), I teach this in my new High-Performance Time Series course; a rough sketch of that workflow follows.
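
As noted earlier, super learners are built with modeltime_fit_resamples() and ensemble_model_spec(). Here is a rough, untested sketch of how those pieces fit together. The resampling settings and the glmnet meta-learner are illustrative assumptions, and modeltime_fit_resamples() is provided by the companion modeltime.resample package, so check your installed versions.

# Sketch only – not run in this tutorial
library(modeltime.resample)

# 1. Create time series cross-validation resamples from the training data
resamples_tscv <- time_series_cv(
    data       = training(splits),
    assess     = "12 weeks",
    initial    = "1 year",
    skip       = "4 weeks",
    cumulative = TRUE
)

# 2. Generate out-of-sample predictions for each sub-model on each resample
submodel_predictions <- submodels_tbl %>%
    modeltime_fit_resamples(
        resamples = resamples_tscv,
        control   = control_resamples(verbose = TRUE)
    )

# 3. Train a meta-learner on the sub-model predictions
ensemble_fit_super <- submodel_predictions %>%
    ensemble_model_spec(
        model_spec = linear_reg(penalty = 0.01, mixture = 0.5) %>%
            set_engine("glmnet"),
        control    = control_grid(verbose = TRUE)
    )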

I’ve made it super simple to build an ensemble from a Modeltime Table. Here’s how to use ensemble_average().

  • Start with your Modeltime Table of Sub-Models
  • Pipe into ensemble_average(type = "mean")

You now have a fitted average ensemble.

# Simple Average Ensemble
ensemble_fit_avg <- submodels_tbl %>%
    ensemble_average(type = "mean")
ensemble_fit_avg
## ── Modeltime Ensemble ───────────────────────────────────────────
## Ensemble of 5 Models (MEAN)
## 
## # Modeltime Table
## # A tibble: 5 x 3
##   .model_id .model     .model_desc            
##       <int> <list>     <chr>                  
## 1         1 <fit[+]>   ARIMA(0,0,1)(0,1,0)[52]
## 2         2 <workflow> GLMNET                 
## 3         3 <workflow> XGBOOST                
## 4         4 <workflow> NNAR(4,1,10)[52]       
## 5         5 <workflow> PROPHET W/ REGRESSORS

We can make median and weighted ensembles just as easily. Note – for the weighted ensemble, I’m giving higher loadings to the better-performing models.

# Simple Median Ensemble
ensemble_fit_med <- submodels_tbl %>%
    ensemble_average("median")
# Higher Loading on Better Models (Test RMSE)
ensemble_fit_wt <- submodels_tbl %>%
    ensemble_weighted(loadings = c(2, 4, 6, 1, 6))

We need to organize our ensembles in Modeltime Tables before we can assess performance. Just use modeltime_table() to organize the ensembles, just like we did for the sub-models.

ensemble_models_tbl <- modeltime_table(
    ensemble_fit_avg,
    ensemble_fit_med,
    ensemble_fit_wt
)
ensemble_models_tbl
## # Modeltime Table
## # A tibble: 3 x 3
##   .model_id .model         .model_desc                  
##       <int> <list>         <chr>                        
## 1         1 <ensemble [5]> ENSEMBLE (MEAN): 5 MODELS    
## 2         2 <ensemble [5]> ENSEMBLE (MEDIAN): 5 MODELS  
## 3         3 <ensemble [5]> ENSEMBLE (WEIGHTED): 5 MODELS

Let’s check out the Accuracy Table using modeltime_accuracy() and table_modeltime_accuracy().

  • By MAE, Ensemble Model ID 1 has an MAE of 1000, a 3% improvement over our best submodel (MAE 1031); a quick check of that figure follows the accuracy table below.
  • By RMSE, Ensemble Model ID 3 has an RMSE of 1228, which is on par with our best submodel.
ensemble_models_tbl %>%
    modeltime_accuracy(testing(splits)) %>%
    table_modeltime_accuracy(.interactive = FALSE)

.model_id  .model_desc                    .type  mae      mape  mase  smape  rmse     rsq
1          ENSEMBLE (MEAN): 5 MODELS      Test   1000.01  4.63  0.75  4.58   1408.68  0.97
2          ENSEMBLE (MEDIAN): 5 MODELS    Test   1146.60  5.68  0.86  5.77   1310.30  0.98
3          ENSEMBLE (WEIGHTED): 5 MODELS  Test   1056.59  5.15  0.79  5.20   1228.45  0.98
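
As a quick sanity check of the 3% figure, using the MAE values reported in the two accuracy tables:

# Percent improvement of the mean ensemble over the best submodel (by MAE)
mae_best_submodel <- 1031.53  # Prophet w/ Regressors
mae_mean_ensemble <- 1000.01  # Ensemble (mean)
(mae_best_submodel - mae_mean_ensemble) / mae_best_submodel * 100  # roughly 3%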

And finally, we can visualize the performance of the ensembles.

ensemble_models_tbl %>%
    modeltime_forecast(
        new_data    = testing(splits),
        actual_data = store_1_1_tbl
    ) %>%
    plot_modeltime_forecast(.interactive = FALSE)

(Plot: ensemble forecasts versus actuals over the 12-week test window)

The modeltime.ensemble package is much more feature-rich than what we’ve covered here (I couldn’t possibly cover everything in this post). 😀

Here’s what I didn’t cowl:

  • Super Learners: We can use resample predictions from our sub-models as inputs to a meta-learner. This can result in significantly better accuracy (a 5% improvement is what we achieve in my Time Series Course).

  • Multi-Level Modeling: This is the strategy that won the Grupo Bimbo Inventory Demand Forecasting Challenge, where multiple layers of ensembles are used.

  • Refitting Sub-Models and Meta-Learners: Refitting is a special process that’s needed prior to forecasting future data. Refitting requires careful attention to control the sub-model and meta-learner retraining process. A minimal sketch follows this list.
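
To give a flavor of the refitting step, here is a minimal sketch that refits the ensembles from this tutorial on the full dataset with modeltime_refit() and then forecasts the next 12 weeks with modeltime_forecast(). Treat it as a sketch; the careful retraining control mentioned above, especially for meta-learners, is out of scope here.

# Sketch: refit the ensembles on the full dataset, then forecast 12 weeks ahead
refit_tbl <- ensemble_models_tbl %>%
    modeltime_refit(data = store_1_1_tbl)

refit_tbl %>%
    modeltime_forecast(
        h           = "12 weeks",
        actual_data = store_1_1_tbl
    ) %>%
    plot_modeltime_forecast(.interactive = FALSE)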

I teach each of these techniques and strategies so that you become the time series expert for your organization. Here’s how. 👇

Advanced Time Series Course
Become the time series domain expert in your organization.

Make sure you’re notified when my new Advanced Time Series Forecasting in R course comes out. You’ll learn timetk and modeltime, plus the most powerful time series forecasting techniques available. Become the time series domain expert in your organization.

👉 Advanced Time Series Course.

You will study:

  • Time Series Preprocessing, Noise Reduction, & Anomaly Detection
  • Feature Engineering using lagged variables & external regressors
  • Hyperparameter Tuning
  • Time Series Cross-Validation
  • Ensembling Multiple Machine Learning & Univariate Modeling Techniques (Competition Winner)
  • NEW – Deep Learning with GluonTS (Competition Winner)
  • and more.

Unlock the High-Performance Time Series Course

More information on the modeltime ecosystem can be found in the software documentation for Modeltime, Modeltime Ensemble, and Timetk.

Make a comment in the chat below. 👇

And, if you plan on using modeltime.ensemble for your business, it’s a no-brainer – take my Time Series Course.
