WSEAS Transactions on Business and Economics
Print ISSN: 1109-9526, E-ISSN: 2224-2899
Volume 21, 2024
Multiple Time Series Modeling of Autoregressive Distributed Lags with Forward Variable Selection for Prediction
Authors: , , , , ,
Abstract: Conventional time series methods rely on the modeling process and statistical tests to find the best model, whereas machine learning methods choose the model with the highest performance on the testing data. This research proposes a mixed approach to developing the ARDL (Autoregressive Distributed Lags) model for predicting the price of cayenne pepper. Multiple time series are formed into a matrix of input-output pairs with lag numbers of 3, 5, and 7. The dataset is used in its original form and also normalized with the min-max and Z-score transformations. For each combination of lag number and dataset, the ARDL predictor variables are selected by forward selection with a majority vote of four criteria: Mallows' Cp, AIC (Akaike Information Criterion), BIC (Bayesian Information Criterion), and adjusted $$R^2$$. Each ARDL model is evaluated on the testing data with the performance metrics RMSE (Root Mean Square Error), MAE (Mean Absolute Error), and $$R^2$$. In all scenarios, AIC and adjusted $$R^2$$ are always part of the majority vote that determines the optimal predictor variables of the ARDL models. The selected predictor variables differ across lag numbers but are the same across dataset scenarios for a given lag number. The previous day's cayenne pepper price is the predictor variable that contributes most in all nine resulting ARDL models. The lag-3 ARDL model with the original dataset performs best on the RMSE and MAE metrics, while the lag-3 ARDL model with the Z-score dataset performs best on the $$R^2$$ metric.
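The selection procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`make_lagged`, `criteria`, `forward_select`), the greedy step rule (add the variable that lowers the residual sum of squares most), the full-model variance estimate used for Mallows' Cp, and the tie rule for the vote (prefer the smaller model) are all assumptions made here for illustration.

```python
import numpy as np

def make_lagged(series, target_col, n_lags):
    """Form input-output pairs: each row of X holds lags 1..n_lags of every
    series, and y is the target series at the current time step."""
    series = np.asarray(series, dtype=float)
    T = series.shape[0]
    X = np.hstack([series[n_lags - lag:T - lag] for lag in range(1, n_lags + 1)])
    y = series[n_lags:, target_col]
    return X, y

def criteria(X, y, cols, sigma2_full):
    """Return (Cp, AIC, BIC, adjusted R^2) for an OLS fit on the given columns."""
    n = len(y)
    A = np.column_stack([np.ones(n), X[:, cols]])   # intercept + predictors
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    tss = float(np.sum((y - y.mean()) ** 2))
    p = A.shape[1]                                   # number of coefficients
    cp = rss / sigma2_full - n + 2 * p
    aic = n * np.log(rss / n) + 2 * p
    bic = n * np.log(rss / n) + p * np.log(n)
    adj_r2 = 1 - (rss / (n - p)) / (tss / (n - 1))
    return cp, aic, bic, adj_r2

def forward_select(X, y):
    """Greedy forward selection, then a majority vote of the four criteria
    over the model size (assumed tie rule: the smaller size wins)."""
    n, m = X.shape
    A_full = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A_full, y, rcond=None)
    sigma2 = float(np.sum((y - A_full @ beta) ** 2)) / (n - m - 1)

    chosen, remaining, path = [], list(range(m)), []
    while remaining:
        # Among models of equal size, the lowest AIC is also the lowest RSS.
        best = min(remaining, key=lambda j: criteria(X, y, chosen + [j], sigma2)[1])
        chosen.append(best)
        remaining.remove(best)
        path.append((list(chosen), criteria(X, y, chosen, sigma2)))

    scores = np.array([c for _, c in path])
    votes = [int(np.argmin(scores[:, 0])),   # Cp: lower is better
             int(np.argmin(scores[:, 1])),   # AIC: lower is better
             int(np.argmin(scores[:, 2])),   # BIC: lower is better
             int(np.argmax(scores[:, 3]))]   # adjusted R^2: higher is better
    steps, counts = np.unique(votes, return_counts=True)
    winner = int(steps[np.argmax(counts)])   # first maximum = smallest size
    return path[winner][0], votes
```

Column `j` of `X` corresponds to lag `j // k + 1` of series `j % k`, where `k` is the number of series. Normalization (min-max or Z-score) would be applied to the data before calling `make_lagged`; in this sketch it is left to the caller.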
Keywords: ARDL model, time series, autoregressive, forward selection, machine learning, normalization method, prediction
Pages: 1012-1026
DOI: 10.37394/23207.2024.21.84