The M5 competition follows the previous four M competitions, whose purpose is to learn from empirical evidence how to improve forecasting performance and advance the theory and practice of forecasting. M5 focused on a retail sales forecasting application with the objective to produce the most accurate point forecasts for 42,840 time series that represent the hierarchical unit sales of the largest retail company in the world, Walmart, as well as to provide the most accurate estimates of the uncertainty of these forecasts. Hence, the competition consisted of two parallel challenges, namely the Accuracy and Uncertainty forecasting competitions. M5 extended the results of the previous M competitions by: (a) significantly expanding the number of participating methods, especially those in the category of machine learning; (b) evaluating the performance of the uncertainty distribution along with point forecast accuracy; (c) including exogenous/explanatory variables in addition to the time series data; (d) using grouped, correlated time series; and (e) focusing on series that display intermittency. This paper describes the background, organization, and implementations of the competition, and it presents the data used and their characteristics. Consequently, it serves as introductory material to the results of the two forecasting challenges to facilitate their understanding.

In this study, we present the results of the M5 “Accuracy” competition, which was the first of two parallel challenges in the latest M competition with the aim of advancing the theory and practice of forecasting. The main objective in the M5 “Accuracy” competition was to accurately predict 42,840 time series representing the hierarchical unit sales for the largest retail company in the world by revenue, Walmart. The competition required the submission of 30,490 point forecasts for the lowest cross-sectional aggregation level of the data, which could then be summed up accordingly to estimate forecasts for the remaining upward levels. We provide details of the implementation of the M5 “Accuracy” challenge, as well as the results and best performing methods, and summarize the major findings and conclusions. Finally, we discuss the implications of these findings and suggest directions for future research.
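
The summing of bottom-level forecasts to higher aggregation levels mentioned above can be sketched as follows. This is only an illustration of the bottom-up idea, not the competition's code: the function name `aggregate_forecasts`, the `(item_id, store_id)` tuple keys, and the sample numbers are all assumptions made for the example.

```python
def aggregate_forecasts(bottom_forecasts, level_of):
    """Sum bottom-level forecasts into groups.

    bottom_forecasts: dict mapping a series id to a list of forecasts
                      (one value per horizon step).
    level_of:         function mapping a series id to its group key
                      at the target aggregation level.
    """
    totals = {}
    for series_id, forecast in bottom_forecasts.items():
        group = level_of(series_id)
        if group not in totals:
            totals[group] = list(forecast)
        else:
            # Element-wise sum across the forecast horizon.
            totals[group] = [a + b for a, b in zip(totals[group], forecast)]
    return totals


# Hypothetical product-store forecasts for a two-day horizon.
bottom = {
    ("ITEM_A", "STORE_1"): [1.0, 2.0],
    ("ITEM_B", "STORE_1"): [3.0, 4.0],
    ("ITEM_A", "STORE_2"): [5.0, 6.0],
}

by_store = aggregate_forecasts(bottom, lambda sid: sid[1])
overall = aggregate_forecasts(bottom, lambda sid: "total")
```

Because every upper level is computed from the same bottom-level numbers, the resulting forecasts are coherent across the hierarchy by construction.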

This paper describes the M5 “Uncertainty” competition, the second of two parallel challenges of the latest M competition, aiming to advance the theory and practice of forecasting. The particular objective of the M5 “Uncertainty” competition was to accurately forecast the uncertainty distributions of the realized values of 42,840 time series that represent the hierarchical unit sales of the largest retail company in the world by revenue, Walmart. To do so, the competition required the prediction of nine different quantiles (0.005, 0.025, 0.165, 0.250, 0.500, 0.750, 0.835, 0.975, and 0.995), that can sufficiently describe the complete distributions of future sales. The paper provides details on the implementation and execution of the M5 “Uncertainty” competition, presents its results and the top-performing methods, and summarizes its major findings and conclusions. Finally, it discusses the implications of its findings and suggests directions for future research.
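
Quantile forecasts such as the nine listed above are commonly scored with the pinball (quantile) loss. The sketch below shows the plain, unscaled loss only; it is an assumption that this is a useful illustration here, and it is not the competition's official metric, which additionally scales and weights the series. The function names and the sample numbers are hypothetical.

```python
# The nine quantile levels named in the abstract.
M5_QUANTILES = [0.005, 0.025, 0.165, 0.250, 0.500, 0.750, 0.835, 0.975, 0.995]


def pinball_loss(actual, forecast, q):
    """Pinball loss of a single quantile forecast at level q (0 < q < 1)."""
    diff = actual - forecast
    # Under-forecasts are penalized by q, over-forecasts by (1 - q).
    return q * diff if diff >= 0 else (q - 1) * diff


def mean_pinball_loss(actuals, quantile_forecasts):
    """Average the loss over all observations and quantile levels.

    quantile_forecasts maps each level q to a list of forecasts
    aligned with actuals.
    """
    total = 0.0
    count = 0
    for q, forecasts in quantile_forecasts.items():
        for y, f in zip(actuals, forecasts):
            total += pinball_loss(y, f, q)
            count += 1
    return total / count


# Hypothetical example: the same constant forecast at every level.
actuals = [10.0, 12.0, 9.0]
forecasts = {q: [10.0, 10.0, 10.0] for q in M5_QUANTILES}
score = mean_pinball_loss(actuals, forecasts)
```

The asymmetry is the point of the loss: at a high level such as 0.975, forecasting too low costs far more than forecasting too high, which pushes the forecast toward the upper tail.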

We review the results of six forecasting competitions based on the online data science platform Kaggle, which have been largely overlooked by the forecasting community. In contrast to the M competitions, the competitions reviewed in this study feature daily and weekly time series with exogenous variables, business hierarchy information, or both. Furthermore, the Kaggle data sets all exhibit higher entropy than the M3 and M4 competitions, and they are intermittent. In this review, we confirm the conclusion of the M4 competition that ensemble models using cross-learning tend to outperform local time series models and that gradient boosted decision trees and neural networks are strong forecast methods. Moreover, we present insights regarding the use of external information and validation strategies, and discuss the impacts of data characteristics on the choice of statistics or machine learning methods. Based on these insights, we construct nine ex-ante hypotheses for the outcome of the M5 competition to allow empirical validation of our findings.

We participated in the M4 competition for time series forecasting and here describe our methods for forecasting daily time series. We used an ensemble of five statistical forecasting methods and a method that we refer to as the correlator. Our retrospective analysis using the ground truth values published by the M4 organisers after the competition demonstrates that the correlator was responsible for most of our gains over the naïve constant forecasting method. We identify data leakage as one reason for its success, due partly to test data selected from different time intervals, and partly to quality issues with the original time series. We suggest that future forecasting competitions should provide actual dates for the time series so that some of these leakages could be avoided by participants.

The M5 Forecasting Competition, the fifth in the series of forecasting competitions organized by Professor Spyros Makridakis and the Makridakis Open Forecasting Center at the University of Nicosia, was an extremely successful event. This competition focused on both the accuracy and uncertainty of forecasts and leveraged actual historical sales data provided by Walmart. This has led to the M5 being a unique competition that closely parallels the difficulties and challenges associated with industrial applications of forecasting. Like its precursor, the M4, the M5 competition produced many interesting ideas that will continue to push forecasting in new directions. In this article, we discuss four topics from the practitioner's view of applying the competition and its results to the actual problems we face. First, we examine the data provided and how it relates to common difficulties practitioners must overcome. Second, we review the relevance of the accuracy and uncertainty metrics associated with the competition. Third, we discuss the leading solutions and their implications for forecasting at a company like Walmart. Finally, we close with thoughts about a future M6 competition and further enhancements that can be explored.

The M4 competition identified innovative forecasting methods, advancing the theory and practice of forecasting. One of the most promising innovations of M4 was the utilization of cross-learning approaches that allow models to learn from multiple series how to accurately predict individual ones. In this paper, we investigate the potential of cross-learning by developing various neural network models that adopt such an approach, and we compare their accuracy to that of traditional models that are trained in a series-by-series fashion. Our empirical evaluation, which is based on the M4 monthly data, confirms that cross-learning is a promising alternative to traditional forecasting, at least when appropriate strategies for extracting information from large, diverse time series data sets are considered. Ways of combining traditional with cross-learning methods are also examined in order to initiate further research in the field.

We present our solution for the M5 Uncertainty competition. Our solution ranked sixth out of 909 submissions across all hierarchical levels and ranked first for prediction at the finest level of granularity (product-store sales, i.e. SKUs). The model combines a multi-stage state-space model and Monte Carlo simulations to generate the forecasting scenarios (trajectories). Observed sales are modelled with negative binomial distributions to represent discrete over-dispersed sales. Seasonal factors are handcrafted and modelled with linear coefficients that are calculated at the store-department level.

Probabilistic forecasts are necessary for robust decisions in the face of uncertainty. The M5 Uncertainty competition required participating teams to forecast nine quantiles for unit sales of various products at various aggregation levels and for different time horizons. This paper evaluates the forecasting performance of the quantile forecasts at different aggregation levels and at different quantile levels. We contrast this with some theoretical predictions, and discuss potential implications and promising future research directions for the practice of probabilistic forecasting.

The M5 competition uncertainty track aims for probabilistic forecasting of sales of thousands of Walmart retail goods. We show that the M5 competition data face strong overdispersion and sporadic demand, especially zero demand. We discuss modeling issues concerning adequate probabilistic forecasting of such count data processes. Unfortunately, the majority of popular prediction methods used in the M5 competition (e.g. lightgbm and xgboost GBMs) fail to address the data characteristics, due to the considered objective functions. Distributional forecasting provides a suitable modeling approach to overcome those problems. The GAMLSS framework allows for flexible probabilistic forecasting using low-dimensional distributions. We illustrate how the GAMLSS approach can be applied to M5 competition data by modeling the location and scale parameters of various distributions, e.g. the negative binomial distribution. Finally, we discuss software packages for distributional modeling and their drawbacks, like the R package gamlss with its package extensions, and (deep) distributional forecasting libraries such as TensorFlow Probability.

Forecasters typically evaluate the performances of new forecasting methods by exploiting data from past forecasting competitions. Over the years, numerous studies have based their conclusions on such datasets, with mis-performing methods being unlikely to receive any further attention. However, it has been reported that these datasets might not be indicative, as they display many limitations. Since forecasting research is driven somewhat by data from forecasting competitions, it becomes vital to determine whether they are indeed representative of the reality or whether forecasters tend to over-fit their methods on a random sample of series. This paper uses the data from M4 as proportionate to the real world and compares its properties with those of past datasets commonly used in the literature as benchmarks in order to provide evidence on that question. The results show that many popular benchmarks of the past may indeed deviate from reality, and ways forward are discussed in response.

This article introduces the winning method of the M5 Accuracy competition. The method simply averages the results of multiple base forecasting models constructed via partial pooling of multi-level data. All base models, which adopt either direct or recursive multi-step forecasting, are trained with the machine learning technique LightGBM on three different levels of data pools. In the competition, this simple averaging of multiple direct and recursive forecasting models, called DRFAM, exploited the complementary effects of direct and recursive multi-step forecasting of multi-level product sales to improve accuracy and robustness.

This paper reports the results of the NN3 competition, which is a replication of the M3 competition with an extension of the competition towards neural network (NN) and computational intelligence (CI) methods, in order to assess what progress has been made in the 10 years since the M3 competition. Two masked subsets of the M3 monthly industry data, containing 111 and 11 empirical time series respectively, were chosen, controlling for multiple data conditions of time series length (short/long), data patterns (seasonal/non-seasonal) and forecasting horizons (short/medium/long). The relative forecasting accuracy was assessed using the metrics from the M3, together with later extensions of scaled measures, and non-parametric statistical tests. The NN3 competition attracted 59 submissions from NN, CI and statistics, making it the largest CI competition on time series data. Its main findings include: (a) only one NN outperformed the damped trend using the sMAPE, but more contenders outperformed the AutomatANN of the M3; (b) ensembles of CI approaches performed very well, better than combinations of statistical methods; (c) a novel, complex statistical method outperformed all statistical and CI benchmarks; and (d) for the most difficult subset of short and seasonal series, a methodology employing echo state neural networks outperformed all others. The NN3 results highlight the ability of NN to handle complex data, including short and seasonal time series, beyond prior expectations, and thus identify multiple avenues for future research.

This paper explores the issues associated with adapting forecasting techniques used by manufacturers to produce accurate forecasts for retail sales. A case study is presented that is developed using a retail situation because retailers often view their sales forecasting problems as being very different from a manufacturer's problems. Sales volumes are dramatically impacted by competitor promotional actions, discounts, store promotions and weather. In addition, consumption holidays like Christmas, Easter, and Mother's Day, as well as back-to-school shopping, have a large impact on sales. The findings in this paper indicate that forecasting retail sales can be accomplished with a high degree of accuracy.

The scientific method consists of making hypotheses or predictions and then carrying out experiments to test them once the actual results have become available, in order to learn from both successes and mistakes. This approach was followed in the M4 competition with positive results and has been repeated in the M5, with its organizers submitting their ten predictions/hypotheses about its expected results five days before its launch. The present paper presents these predictions/hypotheses and evaluates their realization according to the actual findings of the competition. The results indicate that well-established practices, like combining forecasts, exploiting explanatory variables, and capturing seasonality and special days, remain critical for enhancing forecasting performance, re-confirming also that relatively new approaches, like cross-learning algorithms and machine learning methods, display great potential. Yet, we show that simple, local statistical methods may still be competitive for forecasting high granularity data and estimating the tails of the uncertainty distribution, thus motivating future research in the field of retail sales forecasting.

This brief note describes two of the forecasting methods used in the M3 Competition, Robust Trend and ARARMA. The origins of these methods are very different. Robust Trend was introduced to model the special features of some telecommunications time series. It was subsequently found to be competitive with Holt’s linear model for the more varied set of time series used in the M1 Competition. The ARARMA methodology was proposed by Parzen as a general time series modelling procedure, and can be thought of as an alternative to the ARIMA methodology of Box and Jenkins. This method was used in the M1 Competition and achieved the lowest mean absolute percentage error for longer forecasting horizons. These methods will be described in more detail and some comments on their use in the M3 Competition conclude this note.

This paper presents a new univariate forecasting method. The method is based on the concept of modifying the local curvature of the time-series through a coefficient ‘Theta’ (the Greek letter θ), that is applied directly to the second differences of the data. The resulting series that are created maintain the mean and the slope of the original data but not their curvatures. These new time series are named Theta-lines. Their primary qualitative characteristic is the improvement of the approximation of the long-term behavior of the data or the augmentation of the short-term features, depending on the value of the Theta coefficient. The proposed method decomposes the original time series into two or more different Theta-lines. These are extrapolated separately and the subsequent forecasts are combined. The simple combination of two Theta-lines, the Theta=0 (straight line) and Theta=2 (double local curves) was adopted in order to produce forecasts for the 3003 series of the M3 competition. The method performed well, particularly for monthly series and for microeconomic data.
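
The construction of Theta-lines described above can be sketched as follows. The sketch assumes the commonly used closed form in which a Theta-line equals θ times the data plus (1 − θ) times the least-squares linear trend; this form scales the second differences by θ while keeping the mean and slope, but it is a restatement for illustration, not code from the paper. The function names and the sample series are hypothetical, and the extrapolation and combination steps are left out.

```python
def linear_trend(y):
    """Fitted values of the least-squares line through (t, y[t]), t = 0..n-1."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return [intercept + slope * t for t in range(n)]


def theta_line(y, theta):
    """Theta-line: theta * data + (1 - theta) * linear trend (assumed form)."""
    trend = linear_trend(y)
    return [theta * v + (1 - theta) * tr for v, tr in zip(y, trend)]


def second_differences(y):
    """Second differences y[t] - 2*y[t-1] + y[t-2], a measure of local curvature."""
    return [y[t] - 2 * y[t - 1] + y[t - 2] for t in range(2, len(y))]


# Hypothetical series.
series = [3.0, 5.0, 4.0, 8.0, 7.0, 10.0]
line_0 = theta_line(series, 0)  # straight line: no curvature
line_2 = theta_line(series, 2)  # doubled local curvature
```

Under this form, averaging the Theta=0 and Theta=2 lines point by point recovers the original series, which is why the two lines can be read as a decomposition of the data into a long-term and a short-term component.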

Forecasting customer flow is key for retailers in making daily operational decisions, but small retailers often lack the resources to obtain such forecasts. Rather than forecasting stores’ total customer flows, this research utilizes emerging third-party mobile payment data to provide participating stores with a value-added service by forecasting their share of daily customer flows. These customer transactions using mobile payments can then be utilized further to derive retailers’ total customer flows indirectly, thereby overcoming the constraints that small retailers face. We propose a third-party mobile-payment-platform centered daily mobile payments forecasting solution based on an extension of the newly-developed Gradient Boosting Regression Tree (GBRT) method which can generate multi-step forecasts for many stores concurrently. Using empirical forecasting experiments with thousands of time series, we show that GBRT, together with a strategy for multi-period-ahead forecasting, provides more accurate forecasts than established benchmarks. Pooling data from the platform across stores leads to benefits relative to analyzing the data individually, thus demonstrating the value of this machine learning application.

This paper presents our 13th place solution to the M5 Forecasting - Uncertainty challenge and compares it against GoodsForecast’s second-place solution. This challenge aims to estimate the median and eight other quantiles of various product sales in Walmart. Both solutions handle the predictions of median and other quantiles separately. Our solution hybridizes LightGBM and DeepAR in various ways for median and quantile estimation, based on the aggregation levels of the sales. Similarly, GoodsForecast’s solution also utilized a hybrid approach, i.e., LightGBM for point estimation and a Histogram algorithm for quantile estimation. In this paper, the differences between the two solutions and their results are highlighted. Despite our solution only taking 13th place in the challenge with the competition metric, it achieves the lowest average rank based on the multiple comparisons with the best (MCB) test which implies the most accurate forecasts in the majority of the series. It also indicates better performance at the product-store aggregation level which comprises 30,490 (71.2% of all) series compared to most teams.

