The wisdom of crowds in forecasting at high-frequency for multiple time horizons: A case study of the Brazilian retail sales.

Author: Lopes, Gustavo
  1. Introduction

    According to the Brazilian Society of Retail and Consumption (SBVC), "retail" is any economic activity in which there is a sale of a good or service to the final consumer. In the composition of the Gross Domestic Product (GDP), retail is related to household consumption, which, in 2020, corresponded to 62.9% of Brazilian GDP, according to data from the system of national accounts released by the Brazilian Institute of Geography and Statistics (IBGE).

    The payments market is one of the fastest growing sectors in Brazilian retail. According to the balance sheet released by the Brazilian Association of Credit Card and Services Companies (ABECS), the total value of card transactions was R$ 2.65 trillion in 2021, up 33.1% from the previous year and reaching 54.6% of household consumption in the third quarter. The indexes analyzed in this study are composed of the total monetary value of card transactions captured by one Brazilian acquiring company, separated into total retail, durable goods, non-durable goods and services.

    The nature of the data allowed these indexes to be evaluated daily. In this way, events that are representative of the retail calendar, such as holidays, can be analyzed in isolation from ordinary sales days. This seasonal dynamic is known as the calendar effect and generates an abrupt increase in the volatility of daily sales indexes.
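
    As a rough illustration of how a calendar effect might be encoded for a daily model, the sketch below derives a few indicator variables from a date. The feature names, the `calendar_features` helper and the two-holiday set are assumptions made for this example; this section does not specify how the study builds its calendar variables.

```python
from datetime import date, timedelta

# Illustrative subset of Brazilian national holidays (assumption: a real
# application would load the full official calendar for each year).
HOLIDAYS = {date(2021, 12, 25), date(2022, 1, 1)}

def calendar_features(day, holidays=HOLIDAYS):
    """Return simple calendar indicators for a single day (hypothetical helper)."""
    return {
        "weekday": day.weekday(),            # 0 = Monday, 6 = Sunday
        "is_weekend": day.weekday() >= 5,    # Saturday or Sunday
        "is_holiday": day in holidays,
        # Sales often shift to the day before a holiday, so flag it as well.
        "is_holiday_eve": (day + timedelta(days=1)) in holidays,
    }

features = calendar_features(date(2021, 12, 24))
```

    Indicators of this kind can be passed as exogenous regressors to models that accept them, so that holiday spikes are not absorbed into the ordinary weekly pattern.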

    The study evaluated the accuracy of both traditional time series prediction models--Naive, Holt-Winters, TBATS and SARIMA--and popular machine learning and artificial neural network algorithms--Multilayer Perceptron (MLP), Long Short-Term Memory (LSTM), eXtreme Gradient Boosting (XGBoost) and Facebook's Prophet model--in a forecasting setting with multiple time horizons. All of these models, with Naive as the only exception, depend on hyperparameters, which directly influence their predictions.

    For hyperparameter selection, a simple holdout approach was used, in which the in-sample period is subdivided into two smaller sections, one for estimating the hyperparameters and another for validating them. Although authors like Arlot and Celisse (2010), Bergmeir and Benitez (2012), Bergmeir et al. (2014) and Pinto and Marcal (2021) reported good accuracy when forecasting time series with a cross-validation method, the consistency of these results is still an open discussion in the academic literature. A cross-validatory method proposed by Burman et al. (1994) pointed out that h-block cross-validation was asymptotically inconsistent with results found by Shao (1993) that this model selection method was recommended when compared to a simple leave-one-out cross-validation. Therefore, Racine (2000) proposed a modified formulation dubbed "hv-block" cross-validation, implying that the method was asymptotically consistent. However, a recent note by Zheng (2019) corrects a mistake in Racine's paper, leaving the consistency of "hv-block" cross-validation an open question. Finally, it is worth noting that, in the Brazilian retail sales forecasting literature, authors like Pasquotto (2011), Angelo et al. (2011) and Bessa (2018) used at least one of the Akaike or Schwarz information criteria for hyperparameter selection in traditional time series models such as SARIMA.
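
    The holdout idea described above can be sketched as follows. This is a minimal example under stated assumptions, not the study's procedure: simple exponential smoothing stands in for the actual models, the grid of `alpha` values and the 70/30 split are arbitrary, and `ses_forecast` and `holdout_select` are hypothetical helper names.

```python
import numpy as np

def ses_forecast(history, alpha):
    """One-step-ahead forecast from simple exponential smoothing."""
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return float(level)

def holdout_select(series, alphas, train_frac=0.7):
    """Pick the alpha with the lowest one-step RMSE on the validation block.

    The first `train_frac` of the in-sample data is used only for fitting;
    each candidate is scored on the remaining block. Ties keep the first
    candidate in `alphas`.
    """
    split = int(len(series) * train_frac)
    best_alpha, best_rmse = None, np.inf
    for alpha in alphas:
        errors = [
            series[t] - ses_forecast(series[:t], alpha)
            for t in range(split, len(series))
        ]
        rmse = float(np.sqrt(np.mean(np.square(errors))))
        if rmse < best_rmse:
            best_alpha, best_rmse = alpha, rmse
    return best_alpha, best_rmse

# Synthetic random walk standing in for an in-sample sales index (assumption).
rng = np.random.default_rng(0)
series = 100 + np.cumsum(rng.normal(0, 1, 200))
alphas = [0.1, 0.3, 0.5, 0.7, 0.9]
alpha, rmse = holdout_select(series, alphas)
```

    Unlike blocked cross-validation, this scheme scores each candidate on a single validation window, which keeps it cheap and respects the time ordering of the data.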

    The forecasts were constructed for four prediction horizons: the following day (D+1), seven consecutive days ahead (D+7), thirty consecutive days ahead (D+30) and ninety consecutive days ahead (D+90). The prediction error metric, Root Mean Squared Error (RMSE), was calculated iteratively over the out-of-sample period in each of these scenarios, re-estimating the models after each prediction. This process was built to simulate a real situation, in which an economic agent updates their forecasts daily.
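
    A minimal sketch of this rolling evaluation is given below. It is an assumption-laden illustration: a weekly seasonal naive forecast stands in for the study's models, "re-estimation" reduces to rebuilding the forecast from the data available at each origin, and `seasonal_naive` and `rolling_rmse` are hypothetical names.

```python
import numpy as np

HORIZONS = (1, 7, 30, 90)  # D+1, D+7, D+30 and D+90

def seasonal_naive(history, horizon, period=7):
    """Forecast `horizon` days ahead with the latest value from the same weekday."""
    cycles = -(-horizon // period)  # ceil(horizon / period)
    # Negative offset of the most recent observation that lies a whole
    # number of periods before the target date.
    return float(history[horizon - period * cycles - 1])

def rolling_rmse(series, first_origin, horizons=HORIZONS):
    """Roll the forecast origin one day at a time and score each horizon."""
    sq_errors = {h: [] for h in horizons}
    for origin in range(first_origin, len(series)):
        history = series[:origin]           # data known at the forecast date
        for h in horizons:
            target = origin - 1 + h         # index of the day being forecast
            if target < len(series):
                pred = seasonal_naive(history, h)
                sq_errors[h].append((series[target] - pred) ** 2)
    return {h: float(np.sqrt(np.mean(e))) for h, e in sq_errors.items() if e}

# Synthetic daily index with a weekly pattern plus noise (assumption).
rng = np.random.default_rng(1)
days = np.arange(400)
series = 100 + 10 * np.sin(2 * np.pi * days / 7) + rng.normal(0, 1, days.size)
scores = rolling_rmse(series, first_origin=200)
```

    Each horizon gets its own RMSE because errors are pooled only across forecast origins, never across horizons, which is what allows results to be compared between D+1, D+7, D+30 and D+90.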

    An open discussion in the time series forecasting literature concerns the conditions under which neural networks and other machine learning algorithms are better predictors than more traditional models. In the 1990s, with the advancement of the literature on neural network research, Gorr (1994) and Hill et al. (1994) suggested future studies on the conditions under which an artificial neural network could outperform a traditional model. In Brazil, authors such as Pasquotto (2011), Angelo et al. (2011) and Bessa (2018) conducted forecasting studies with neural network models for retail sectors, and not all of them achieved significant accuracy results. The recent doctoral thesis by Bessa (2018) includes the XGBoost algorithm, whose accuracy outperformed other models for the Brazilian fashion retail industry. Her work also reported accuracy gains from applying ensemble methods to the model comparison. This paper, therefore, evaluated her recommendation to use ensemble methods in another setting: by analyzing the Brazilian retail sector as a whole and by forecasting at multiple time horizons.

    Therefore, the objectives of this case study are: (1) to evaluate the existence of a single model capable of predicting daily indexes with greater accuracy, comparing the traditional models with artificial neural networks and machine learning algorithms; (2) to identify whether the results are consistent for the four sales indexes; (3) to identify whether the results are consistent across prediction horizons; (4) to assess the impact on accuracy of modeling the calendar effect; (5) to identify whether an ensemble method generates better forecasting results; and (6) to identify whether it is possible to create a consistent forecasting strategy for Brazilian retail sales data.

    An interesting way to frame the main conclusion of this study is to compare it, philosophically, to James Surowiecki's famous "Wisdom of Crowds" theory, which states that the collective opinion of a group is better than that of a single individual: for all indexes and prediction horizons, an ensemble blending method--utilizing all available models with calendar variables--either outperforms the single best model or is not statistically different from it. Therefore, a dominant strategy for improving accuracy in the challenge of forecasting Brazilian high-frequency retail data at multiple time horizons could be found: (1) apply seasonality treatment to the time series by analyzing holidays and calendar effects; (2) generate predictions for the indexes and their log transformations using traditional models, artificial neural networks and machine learning algorithms; and, finally, (3) ensemble all of them by taking either the mean or the median prediction to generate a single forecast.
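
    Step (3) of this strategy, blending by mean or median, can be sketched as follows. The model names and forecast values are invented for illustration, and `blend` is a hypothetical helper rather than the study's implementation.

```python
import numpy as np

def blend(predictions, method="median"):
    """Combine per-model forecasts into a single forecast per date.

    `predictions` maps a model name to a sequence of forecasts, all of the
    same length and aligned on the same dates.
    """
    stacked = np.asarray(list(predictions.values()), dtype=float)
    if method == "mean":
        return stacked.mean(axis=0)
    if method == "median":
        return np.median(stacked, axis=0)
    raise ValueError(f"unknown method: {method}")

# Hypothetical forecasts for three consecutive days (illustrative numbers).
forecasts = {
    "sarima": [100.0, 110.0, 120.0],
    "xgboost": [104.0, 108.0, 150.0],
    "lstm": [102.0, 115.0, 123.0],
}
mean_forecast = blend(forecasts, "mean")      # [102., 111., 131.]
median_forecast = blend(forecasts, "median")  # [102., 110., 123.]
```

    On the third day the median ignores the outlying 150.0 while the mean is pulled toward it, which is the usual trade-off between the two rules. Forecasts produced on a log-transformed series would presumably need to be mapped back to the original scale before blending.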

    The study is divided into five sections. The next one consists of a brief review of both the retail sales forecasting literature and the history of time series forecasting. Section 3 details the methodology and, in Section 4, the forecasting results are presented. Finally, Section 5 summarizes the conclusions for each objective of this article.

  2. Related literature and motivation

    Retail sales forecasting is widely studied in the academic literature, both in business administration, for supply chain optimization and strategic planning purposes, and in macroeconomics, which studies aggregate sales as part of a country's economic growth dynamics.

    Authors like Geurts and Kelly (1986), Frank et al. (2003) and Au et al. (2008) studied specific sectors in retail, describing the results achieved when comparing different forecasting models. For instance, Geurts and Kelly (1986) found that an exponential smoothing model could generate better forecasting results than a SARIMA model. However, Frank et al. (2003) and Au et al. (2008) achieved better results for the fashion industry using artificial neural networks (ANNs).

    Analyzing the forecasting problem for the aggregate retail sales of an economy, Alon et al. (2001) found interesting results using artificial neural networks when compared to traditional ARIMA and exponential smoothing models. The authors attributed the results to the fact that neural networks were able to capture non-linear patterns in the time series, thus giving better accuracy. Afterwards, authors like Chu and Zhang (2003) and Aye et al. (2015) also compared the performance of ANNs with traditional models. Chu and Zhang (2003) achieved better MAPE and RMSE results using ANNs on deseasonalized data, and the best result of Aye et al. (2015) was obtained by combining different forecasting models. Interestingly, though, Aye et al. (2015) also studied forecasting at different prediction horizons, obtaining similar results in all of them.

    Using Brazilian retail sales data, Almeida and Passari (2006) compared forecasting models for monthly SKU sales data from a large retail corporation. The authors achieved better forecasting results in RMSE and MAPE using ANNs, and they attributed these results to the fact that ANN models were capable of capturing interactions between products, simulating the cross elasticity of demand. Bessa (2018) also obtained interesting results by using the LSTM neural network model on weekly sales data from a specific retailer in the fashion industry. Her results showed an accuracy increase of up to 54.32% when comparing the LSTM performance with the forecasting method used by the retailer.

    However, not all authors in Brazil could achieve similar results. For instance, Pasquotto (2011) studied monthly aggregated sales from the pharmaceutical, fertilizer and air traffic sectors using Elman networks, one specific neural network architecture, and this model was not able to outperform a simple SARIMA model. Angelo et al. (2011) also studied forecasting models using aggregated retail sales from Brazil's Monthly Survey of Trade (PMC) and could not achieve significant forecasting gains using neural networks.

    Some noteworthy papers about forecasting and model hyperparameter selection literature also include the studies of Swanson and White (1997), Hippert et al. (2005)...
