首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 655 毫秒
1.
Forecasting monthly and quarterly time series using STL decomposition   总被引:1,自引:0,他引:1  
This paper is a re-examination of the benefits and limitations of decomposition and combination techniques in the area of forecasting, and also a contribution to the field, offering a new forecasting method. The new method is based on the disaggregation of time series components through the STL decomposition procedure, the extrapolation of linear combinations of the disaggregated sub-series, and the reaggregation of the extrapolations to obtain estimates for the global series. Applying the forecasting method to data from the NN3 and M1 Competition series, the results suggest that it can perform well relative to four other standard statistical techniques from the literature, namely the ARIMA, Theta, Holt-Winters’ and Holt’s Damped Trend methods. The relative advantages of the new method are then investigated further relative to a simple combination of the four statistical methods and a Classical Decomposition forecasting method. The strength of the method lies in its ability to predict long lead times with relatively high levels of accuracy, and to perform consistently well for a wide range of time series, irrespective of the characteristics, underlying structure and level of noise of the data.  相似文献   

2.
Combination methods have performed well in time series forecast competitions. This study proposes a simple but general methodology for combining time series forecast methods. Weights are calculated using a cross-validation scheme that assigns greater weights to methods with more accurate in-sample predictions. The methodology was used to combine forecasts from the Theta, exponential smoothing, and ARIMA models, and placed fifth in the M4 Competition for both point and interval forecasting.  相似文献   

3.
We participated in the M4 competition for time series forecasting and here describe our methods for forecasting daily time series. We used an ensemble of five statistical forecasting methods and a method that we refer to as the correlator. Our retrospective analysis using the ground truth values published by the M4 organisers after the competition demonstrates that the correlator was responsible for most of our gains over the naïve constant forecasting method. We identify data leakage as one reason for its success, due partly to test data selected from different time intervals, and partly to quality issues with the original time series. We suggest that future forecasting competitions should provide actual dates for the time series so that some of these leakages could be avoided by participants.  相似文献   

4.
The M4 Competition: 100,000 time series and 61 forecasting methods   总被引:1,自引:0,他引:1  
The M4 Competition follows on from the three previous M competitions, the purpose of which was to learn from empirical evidence both how to improve the forecasting accuracy and how such learning could be used to advance the theory and practice of forecasting. The aim of M4 was to replicate and extend the three previous competitions by: (a) significantly increasing the number of series, (b) expanding the number of forecasting methods, and (c) including prediction intervals in the evaluation process as well as point forecasts. This paper covers all aspects of M4 in detail, including its organization and running, the presentation of its results, the top-performing methods overall and by categories, its major findings and their implications, and the computational requirements of the various methods. Finally, it summarizes its main conclusions and states the expectation that its series will become a testing ground for the evaluation of new methods and the improvement of the practice of forecasting, while also suggesting some ways forward for the field.  相似文献   

5.
We examine the effect of damping X-12-ARIMA's estimated seasonal variation on the accuracy of its seasonal adjustments of time series. Two methods for damping seasonals are proposed. In a simulation experiment, we generated time series data for each of 90 distinct experimental conditions that, in aggregate, characterize the variety of monthly series in the M3-competition. X-12-ARIMA consistently overestimated the actual seasonal variation by an amount consistent with statistical theory. Damping seasonals reduced X-12-ARIMA's estimation error by as much as 79% and under no conditions was estimation error increased beyond a trivial amount. Improvement depended primarily on the degree to which random variation in a series dominated seasonal variation. When the multiplicative X-12-ARIMA model did not match the data-generating model, overestimation was less for trend series than for series with no trend; otherwise the presence of trend had no discernible effect. One of the proposed methods was somewhat more accurate and robust, but more complex, than the other. In an analysis of real data—the 1428 monthly series of the M3-competition-damping X-12-ARIMA seasonals prior to forecasting (1) reduced the average forecasting MAPE by 4.9–1.4% and (2) improved forecasting accuracy for 59–65% of the series, depending on the forecasting horizon. This research suggests that damping X-12-ARIMA seasonals leads to more accurate seasonal adjustments of time series, thus providing a more reliable basis for policy-making, forecasting, and the evaluation of forecasting methods by researchers.  相似文献   

6.
The well-developed ETS (ExponenTial Smoothing, or Error, Trend, Seasonality) method incorporates a family of exponential smoothing models in state space representation and is widely used for automatic forecasting. The existing ETS method uses information criteria for model selection by choosing an optimal model with the smallest information criterion among all models fitted to a given time series. The ETS method under such a model selection scheme suffers from computational complexity when applied to large-scale time series data. To tackle this issue, we propose an efficient approach to ETS model selection by training classifiers on simulated data to predict appropriate model component forms for a given time series. We provide a simulation study to show the model selection ability of the proposed approach on simulated data. We evaluate our approach on the widely used M4 forecasting competition dataset in terms of both point forecasts and prediction intervals. To demonstrate the practical value of our method, we showcase the performance improvements from our approach on a monthly hospital dataset.  相似文献   

7.
This paper reports the results of the NN3 competition, which is a replication of the M3 competition with an extension of the competition towards neural network (NN) and computational intelligence (CI) methods, in order to assess what progress has been made in the 10 years since the M3 competition. Two masked subsets of the M3 monthly industry data, containing 111 and 11 empirical time series respectively, were chosen, controlling for multiple data conditions of time series length (short/long), data patterns (seasonal/non-seasonal) and forecasting horizons (short/medium/long). The relative forecasting accuracy was assessed using the metrics from the M3, together with later extensions of scaled measures, and non-parametric statistical tests. The NN3 competition attracted 59 submissions from NN, CI and statistics, making it the largest CI competition on time series data. Its main findings include: (a) only one NN outperformed the damped trend using the sMAPE, but more contenders outperformed the AutomatANN of the M3; (b) ensembles of CI approaches performed very well, better than combinations of statistical methods; (c) a novel, complex statistical method outperformed all statistical and CI benchmarks; and (d) for the most difficult subset of short and seasonal series, a methodology employing echo state neural networks outperformed all others. The NN3 results highlight the ability of NN to handle complex data, including short and seasonal time series, beyond prior expectations, and thus identify multiple avenues for future research.  相似文献   

8.
This work describes an award winning approach for solving the NN3 Forecasting Competition problem, focusing on the sound experimental validation of its main innovative feature. The NN3 forecasting task consisted of predicting 18 future values of 111 short monthly time series. The main feature of the approach was the use of the median for combining the forecasts of an ensemble of 15 MLPs to predict each time series. Experimental comparison to a single MLP shows that the ensemble increases the performance accuracy for multiple-step ahead forecasting. This system performed well on the withheld data, having finished as the second best solution of the competition with an SMAPE of 16.17%.  相似文献   

9.
The main objective of the M5 competition, which focused on forecasting the hierarchical unit sales of Walmart, was to evaluate the accuracy and uncertainty of forecasting methods in the field to identify best practices and highlight their practical implications. However, can the findings of the M5 competition be generalized and exploited by retail firms to better support their decisions and operation? This depends on the extent to which M5 data is sufficiently similar to unit sales data of retailers operating in different regions selling different product types and considering different marketing strategies. To answer this question, we analyze the characteristics of the M5 time series and compare them with those of two grocery retailers, namely Corporación Favorita and a major Greek supermarket chain, using feature spaces. Our results suggest only minor discrepancies between the examined data sets, supporting the representativeness of the M5 data.  相似文献   

10.
Forecasters typically evaluate the performances of new forecasting methods by exploiting data from past forecasting competitions. Over the years, numerous studies have based their conclusions on such datasets, with mis-performing methods being unlikely to receive any further attention. However, it has been reported that these datasets might not be indicative, as they display many limitations. Since forecasting research is driven somewhat by data from forecasting competitions, it becomes vital to determine whether they are indeed representative of the reality or whether forecasters tend to over-fit their methods on a random sample of series. This paper uses the data from M4 as proportionate to the real world and compares its properties with those of past datasets commonly used in the literature as benchmarks in order to provide evidence on that question. The results show that many popular benchmarks of the past may indeed deviate from reality, and ways forward are discussed in response.  相似文献   

11.
Short-term forecasting of crime   总被引:2,自引:0,他引:2  
The major question investigated is whether it is possible to accurately forecast selected crimes 1 month ahead in small areas, such as police precincts. In a case study of Pittsburgh, PA, we contrast the forecast accuracy of univariate time series models with naïve methods commonly used by police. A major result, expected for the small-scale data of this problem, is that average crime count by precinct is the major determinant of forecast accuracy. A fixed-effects regression model of absolute percent forecast error shows that such counts need to be on the order of 30 or more to achieve accuracy of 20% absolute forecast error or less. A second major result is that practically any model-based forecasting approach is vastly more accurate than current police practices. Holt exponential smoothing with monthly seasonality estimated using city-wide data is the most accurate forecast model for precinct-level crime series.  相似文献   

12.
The M4 competition is the continuation of three previous competitions started more than 45 years ago whose purpose was to learn how to improve forecasting accuracy, and how such learning can be applied to advance the theory and practice of forecasting. The purpose of M4 was to replicate the results of the previous ones and extend them into three directions: First significantly increase the number of series, second include Machine Learning (ML) forecasting methods, and third evaluate both point forecasts and prediction intervals. The five major findings of the M4 Competitions are: 1. Out Of the 17 most accurate methods, 12 were “combinations” of mostly statistical approaches. 2. The biggest surprise was a “hybrid” approach that utilized both statistical and ML features. This method’s average sMAPE was close to 10% more accurate than the combination benchmark used to compare the submitted methods. 3. The second most accurate method was a combination of seven statistical methods and one ML one, with the weights for the averaging being calculated by a ML algorithm that was trained to minimize the forecasting. 4. The two most accurate methods also achieved an amazing success in specifying the 95% prediction intervals correctly. 5. The six pure ML methods performed poorly, with none of them being more accurate than the combination benchmark and only one being more accurate than Naïve2. This paper presents some initial results of M4, its major findings and a logical conclusion. Finally, it outlines what the authors consider to be the way forward for the field of forecasting.  相似文献   

13.
Several researchers (Armstrong, 2001; Clemen, 1989; Makridakis and Winkler, 1983) have shown empirically that combination-based forecasting methods are very effective in real world settings. This paper discusses a combination-based forecasting approach that was used successfully in the M4 competition. The proposed approach was evaluated on a set of 100K time series across multiple domain areas with varied frequencies. The point forecasts submitted finished fourth based on the overall weighted average (OWA) error measure and second based on the symmetric mean absolute percent error (sMAPE).  相似文献   

14.
The M5 competition follows the previous four M competitions, whose purpose is to learn from empirical evidence how to improve forecasting performance and advance the theory and practice of forecasting. M5 focused on a retail sales forecasting application with the objective to produce the most accurate point forecasts for 42,840 time series that represent the hierarchical unit sales of the largest retail company in the world, Walmart, as well as to provide the most accurate estimates of the uncertainty of these forecasts. Hence, the competition consisted of two parallel challenges, namely the Accuracy and Uncertainty forecasting competitions. M5 extended the results of the previous M competitions by: (a) significantly expanding the number of participating methods, especially those in the category of machine learning; (b) evaluating the performance of the uncertainty distribution along with point forecast accuracy; (c) including exogenous/explanatory variables in addition to the time series data; (d) using grouped, correlated time series; and (e) focusing on series that display intermittency. This paper describes the background, organization, and implementations of the competition, and it presents the data used and their characteristics. Consequently, it serves as introductory material to the results of the two forecasting challenges to facilitate their understanding.  相似文献   

15.
This paper considers two problems of interpreting forecasting competition error statistics. The first problem is concerned with the importance of linking the error measure (loss function) used in evaluating a forecasting model with the loss function used in estimating the model. It is argued that because the variety of uses of any single forecast, such matching is impractical. Secondly, there is little evidence that matching would have any impact on comparative forecast performance, however measured. As a consequence the results of forecasting competitions are not affected by this problem. The second problem is concerned with the interpreting performance, when evaluated through M(ean) S(quare) E(rror). The authors show that in the Makridakis Competition, good MSE performance is solely due to performance on a small number of the 1001 series, and arises because of the effects of scale. They conclude that comparisons of forecasting accuracy based on MSE are subject to major problems of interpretation.  相似文献   

16.
We review the results of six forecasting competitions based on the online data science platform Kaggle, which have been largely overlooked by the forecasting community. In contrast to the M competitions, the competitions reviewed in this study feature daily and weekly time series with exogenous variables, business hierarchy information, or both. Furthermore, the Kaggle data sets all exhibit higher entropy than the M3 and M4 competitions, and they are intermittent.In this review, we confirm the conclusion of the M4 competition that ensemble models using cross-learning tend to outperform local time series models and that gradient boosted decision trees and neural networks are strong forecast methods. Moreover, we present insights regarding the use of external information and validation strategies, and discuss the impacts of data characteristics on the choice of statistics or machine learning methods. Based on these insights, we construct nine ex-ante hypotheses for the outcome of the M5 competition to allow empirical validation of our findings.  相似文献   

17.
The M4 competition identified innovative forecasting methods, advancing the theory and practice of forecasting. One of the most promising innovations of M4 was the utilization of cross-learning approaches that allow models to learn from multiple series how to accurately predict individual ones. In this paper, we investigate the potential of cross-learning by developing various neural network models that adopt such an approach, and we compare their accuracy to that of traditional models that are trained in a series-by-series fashion. Our empirical evaluation, which is based on the M4 monthly data, confirms that cross-learning is a promising alternative to traditional forecasting, at least when appropriate strategies for extracting information from large, diverse time series data sets are considered. Ways of combining traditional with cross-learning methods are also examined in order to initiate further research in the field.  相似文献   

18.
This paper introduces a novel meta-learning algorithm for time series forecast model performance prediction. We model the forecast error as a function of time series features calculated from historical time series with an efficient Bayesian multivariate surface regression approach. The minimum predicted forecast error is then used to identify an individual model or a combination of models to produce the final forecasts. It is well known that the performance of most meta-learning models depends on the representativeness of the reference dataset used for training. In such circumstances, we augment the reference dataset with a feature-based time series simulation approach, namely GRATIS, to generate a rich and representative time series collection. The proposed framework is tested using the M4 competition data and is compared against commonly used forecasting approaches. Our approach provides comparable performance to other model selection and combination approaches but at a lower computational cost and a higher degree of interpretability, which is important for supporting decisions. We also provide useful insights regarding which forecasting models are expected to work better for particular types of time series, the intrinsic mechanisms of the meta-learners, and how the forecasting performance is affected by various factors.  相似文献   

19.
This article has three objectives: (a) to describe the method of automatic ARIMA modeling (AAM), with and without intervention analysis, that has been used in the analysis; (b) to comment on the results; and (c) to comment on the M3 Competition in general. Starting with a computer program for fitting an ARIMA model and a methodology for building univariate ARIMA models, an expert system has been built, while trying to avoid the pitfalls of most existing software packages. A software package called Time Series Expert TSE-AX is used to build a univariate ARIMA model with or without an intervention analysis. The characteristics of TSE-AX are summarized and, more especially, its automatic ARIMA modeling method. The motivation to take part in the M3-Competition is also outlined. The methodology is described mainly in three technical appendices: (Appendix A) choice of differences and of a transformation, use of intervention analysis; ( Appendix B) available specification procedures; ( Appendix C) adequacy, model checking and new specification. The problems raised by outliers are discussed, in particular how close they are from the forecast origin. Several series that are difficult to deal with from that point of view are mentioned and one of them is shown. In the last section, we comment on contextual information, the idea of an e−M Competition, prediction intervals and the possible use of other forecasting methods within Time Series Expert.  相似文献   

20.
Rule-based forecasting (RBF) uses rules to combine forecasts from simple extrapolation methods. Weights for combining the rules use statistical and domain-based features of time series. RBF was originally developed, tested, and validated only on annual data. For the M3-Competition, three major modifications were made to RBF. First, due to the absence of much in the way of domain knowledge, we prepared the forecasts under the assumption that no domain knowledge was available. This removes what we believe is one of RBF’s primary advantages. We had to re-calibrate some of the rules relating to causal forces to allow for this lack of domain knowledge. Second, automatic identification procedures were used for six time-series features that had previously been identified using judgment. This was done to reduce cost and improve reliability. Third, we simplified the rule-base by removing one method from the four that were used in the original implementation. Although this resulted in some loss in accuracy, it reduced the number of rules in the rule-base from 99 to 64. This version of RBF still benefits from the use of prior findings on extrapolation, so we expected that it would be substantially more accurate than the random walk and somewhat more accurate than equal weights combining. Because most of the previous work on RBF was done using annual data, we especially expected it to perform well with annual data.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号