Similar literature
1.
We review the results of six forecasting competitions based on the online data science platform Kaggle, which have been largely overlooked by the forecasting community. In contrast to the M competitions, the competitions reviewed in this study feature daily and weekly time series with exogenous variables, business hierarchy information, or both. Furthermore, the Kaggle data sets all exhibit higher entropy than the M3 and M4 competitions, and they are intermittent. In this review, we confirm the conclusion of the M4 competition that ensemble models using cross-learning tend to outperform local time series models and that gradient boosted decision trees and neural networks are strong forecast methods. Moreover, we present insights regarding the use of external information and validation strategies, and discuss the impacts of data characteristics on the choice of statistical or machine learning methods. Based on these insights, we construct nine ex-ante hypotheses for the outcome of the M5 competition to allow empirical validation of our findings.

2.
Short-term forecasting of crime (total citations: 2; self-citations: 0; citations by others: 2)
The major question investigated is whether it is possible to accurately forecast selected crimes 1 month ahead in small areas, such as police precincts. In a case study of Pittsburgh, PA, we contrast the forecast accuracy of univariate time series models with naïve methods commonly used by police. A major result, expected for the small-scale data of this problem, is that average crime count by precinct is the major determinant of forecast accuracy. A fixed-effects regression model of absolute percent forecast error shows that such counts need to be on the order of 30 or more to achieve accuracy of 20% absolute forecast error or less. A second major result is that practically any model-based forecasting approach is vastly more accurate than current police practices. Holt exponential smoothing with monthly seasonality estimated using city-wide data is the most accurate forecast model for precinct-level crime series.

3.
We test the predictive accuracy of forecasts of the number of COVID-19 fatalities produced by several forecasting teams and collected by the United States Centers for Disease Control and Prevention for the epidemic in the United States. We find three main results. First, at the short horizon (1 week ahead) no forecasting team outperforms a simple time-series benchmark. Second, at longer horizons (3 and 4 weeks ahead) forecasters are more successful and sometimes outperform the benchmark. Third, one of the best performing forecasts is the Ensemble forecast, which combines all available predictions using uniform weights. In view of these results, collecting a wide range of forecasts and combining them in an ensemble forecast may be a superior approach for health authorities, rather than relying on a small number of forecasts.

4.
The M4 Competition: 100,000 time series and 61 forecasting methods (total citations: 1; self-citations: 0; citations by others: 1)
The M4 Competition follows on from the three previous M competitions, the purpose of which was to learn from empirical evidence both how to improve the forecasting accuracy and how such learning could be used to advance the theory and practice of forecasting. The aim of M4 was to replicate and extend the three previous competitions by: (a) significantly increasing the number of series, (b) expanding the number of forecasting methods, and (c) including prediction intervals in the evaluation process as well as point forecasts. This paper covers all aspects of M4 in detail, including its organization and running, the presentation of its results, the top-performing methods overall and by categories, its major findings and their implications, and the computational requirements of the various methods. Finally, it summarizes its main conclusions and states the expectation that its series will become a testing ground for the evaluation of new methods and the improvement of the practice of forecasting, while also suggesting some ways forward for the field.

5.
6.
Macroeconomic forecasting using structural factor analysis (total citations: 1; self-citations: 0; citations by others: 1)
The use of a small number of underlying factors to summarize the information from a much larger set of information variables is one of the new frontiers in forecasting. In prior work, the estimated factors have not usually had a structural interpretation and the factors have not been chosen on a theoretical basis. In this paper we propose several variants of a general structural factor forecasting model, and use these to forecast certain key macroeconomic variables. We make the choice of factors more structurally meaningful by estimating factors from subsets of information variables, where these variables can be assigned to subsets on the basis of economic theory. We compare the forecasting performance of the structural factor forecasting model with that of a univariate AR model, a standard VAR model, and some non-structural factor forecasting models. The results suggest that our structural factor forecasting model performs significantly better in forecasting real activity variables, especially at short horizons.

7.
Several researchers (Armstrong, 2001; Clemen, 1989; Makridakis and Winkler, 1983) have shown empirically that combination-based forecasting methods are very effective in real world settings. This paper discusses a combination-based forecasting approach that was used successfully in the M4 competition. The proposed approach was evaluated on a set of 100K time series across multiple domain areas with varied frequencies. The point forecasts submitted finished fourth based on the overall weighted average (OWA) error measure and second based on the symmetric mean absolute percent error (sMAPE).
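For readers unfamiliar with the sMAPE measure mentioned in this abstract, the following is a minimal sketch of how it is commonly computed, as 100 times the mean of 2|y - f| / (|y| + |f|) over the forecast horizon. The function name `smape` and the handling of a 0/0 term are assumptions made for illustration and are not taken from the paper.

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent (0 to 200)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, f in zip(actual, forecast):
        denom = abs(y) + abs(f)
        # Assumption: a term where both values are zero counts as zero error.
        total += 0.0 if denom == 0 else 2.0 * abs(y - f) / denom
    return 100.0 * total / len(actual)
```

A perfect forecast scores 0, and a forecast of 0 for any non-zero actual scores the maximum of 200.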

8.
9.
Can machine-learning algorithms help central banks understand the current state of the economy? Our results say yes! We contribute to the emerging literature on forecasting macroeconomic variables using machine-learning algorithms by testing the nowcast performance of common algorithms in a full ‘real-time’ setting, that is, with real-time vintages of New Zealand GDP growth (our target variable) and real-time vintages of around 600 predictors. Our results show that machine-learning algorithms are able to significantly improve over a simple autoregressive benchmark and a dynamic factor model. We also show that machine-learning algorithms have the potential to add value to, and in one case improve on, the official forecasts of the Reserve Bank of New Zealand.

10.
In this study, we present the results of the M5 “Accuracy” competition, which was the first of two parallel challenges in the latest M competition with the aim of advancing the theory and practice of forecasting. The main objective in the M5 “Accuracy” competition was to accurately predict 42,840 time series representing the hierarchical unit sales for the largest retail company in the world by revenue, Walmart. The competition required the submission of 30,490 point forecasts for the lowest cross-sectional aggregation level of the data, which could then be summed up accordingly to estimate forecasts for the remaining upward levels. We provide details of the implementation of the M5 “Accuracy” challenge, as well as the results and best performing methods, and summarize the major findings and conclusions. Finally, we discuss the implications of these findings and suggest directions for future research.

11.
Forecast combination is a well-established and well-tested approach for improving the forecasting accuracy. One beneficial strategy is to use constituent forecasts that have diverse information. In this paper we consider the idea of diversity being accomplished by using different time aggregations. For example, we could create a yearly time series from a monthly time series and produce forecasts for both, then combine the forecasts. These forecasts would each be tracking the dynamics of different time scales, and would therefore add diverse types of information. A comparison of several forecast combination methods, performed in the context of this setup, shows that this is indeed a beneficial strategy and generally provides a forecasting performance that is better than the performances of the individual forecasts that are combined. As a case study, we consider the problem of forecasting monthly tourism numbers for inbound tourism to Egypt. Specifically, we consider 33 individual source countries, as well as the aggregate. The novel combination strategy also produces a generally improved forecasting accuracy.

12.
This work describes an award-winning approach for solving the NN3 Forecasting Competition problem, focusing on the sound experimental validation of its main innovative feature. The NN3 forecasting task consisted of predicting 18 future values of 111 short monthly time series. The main feature of the approach was the use of the median for combining the forecasts of an ensemble of 15 MLPs to predict each time series. Experimental comparison to a single MLP shows that the ensemble increases the performance accuracy for multiple-step-ahead forecasting. This system performed well on the withheld data, having finished as the second best solution of the competition with an SMAPE of 16.17%.
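The median combination described in this abstract can be sketched as follows. This is an illustration only: the function name `median_ensemble` and the input layout (one list of forecasts per ensemble member) are assumptions, and the member forecasts are plain numbers standing in for the outputs of the 15 MLPs.

```python
from statistics import median


def median_ensemble(member_forecasts):
    """Combine forecasts step by step using the median across members.

    member_forecasts: list of equal-length lists, one list per model
    (hypothetical layout chosen for this sketch).
    """
    if not member_forecasts:
        raise ValueError("at least one member forecast is required")
    horizon = len(member_forecasts[0])
    if any(len(m) != horizon for m in member_forecasts):
        raise ValueError("all members must forecast the same horizon")
    # zip(*...) groups the forecasts of all members for each future step.
    return [median(step) for step in zip(*member_forecasts)]
```

With members `[[10, 11], [12, 13], [100, -50]]`, the third member's extreme values do not move the combined forecast, which is one reason the median is often preferred to the mean when some ensemble members are unstable.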

13.
Researchers from various scientific disciplines have attempted to forecast the spread of coronavirus disease 2019 (COVID-19). The proposed epidemic prediction methods range from basic curve fitting methods and traffic interaction models to machine-learning approaches. If we combine all these approaches, we obtain the Network Inference-based Prediction Algorithm (NIPA). In this paper, we analyse a diverse set of COVID-19 forecast algorithms, including several modifications of NIPA. Among the algorithms that we evaluated, the original NIPA performed best at forecasting the spread of COVID-19 in Hubei, China and in the Netherlands. In particular, we show that network-based forecasting is superior to any other forecasting algorithm.

14.
In this work we introduce the forecasting model with which we participated in the NN5 forecasting competition (the forecasting of 111 time series representing daily cash withdrawal amounts at ATM machines). The main idea of this model is to utilize the concept of forecast combination, which has proven to be an effective methodology in the forecasting literature. In the proposed system we attempted to follow a principled approach, and make use of some of the guidelines and concepts that are known in the forecasting literature to lead to superior performance. For example, we considered various previous comparison studies and time series competitions as guidance in determining which individual forecasting models to test (for possible inclusion in the forecast combination system). The final model ended up consisting of neural networks, Gaussian process regression, and linear models, combined by simple average. We also paid extra attention to the seasonality aspect, decomposing the seasonality into weekly (which is the strongest one), day of the month, and month of the year seasonality.

15.
This article introduces the winning method at the M5 Accuracy competition. The presented method simply averages the results of multiple base forecasting models that have been constructed via partial pooling of multi-level data. All base forecasting models, which adopt direct or recursive multi-step forecasting, are trained with the machine learning technique LightGBM on three different levels of data pools. At the competition, the simple averaging of the multiple direct and recursive forecasting models, called DRFAM, exploited the complementary effects between direct and recursive multi-step forecasting of the multi-level product sales to improve accuracy and robustness.
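The difference between direct and recursive multi-step forecasting, and the simple averaging of the two, can be sketched as below. This is not the authors' implementation: a one-lag least-squares line stands in for LightGBM, no data pooling is performed, and all function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a * x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    return a, mean_y - a * mean_x


def recursive_forecast(series, horizon):
    """Fit one one-step-ahead model and feed its output back in."""
    a, b = fit_line(series[:-1], series[1:])
    forecasts, last = [], series[-1]
    for _ in range(horizon):
        last = a * last + b
        forecasts.append(last)
    return forecasts


def direct_forecast(series, horizon):
    """Fit a separate model for each step h, predicting y[t+h] from y[t]."""
    forecasts = []
    for h in range(1, horizon + 1):
        a, b = fit_line(series[:-h], series[h:])
        forecasts.append(a * series[-1] + b)
    return forecasts


def averaged_forecast(series, horizon):
    """Simple average of the recursive and direct forecasts."""
    if len(series) < horizon + 2:
        raise ValueError("series too short for the requested horizon")
    rec = recursive_forecast(series, horizon)
    dir_ = direct_forecast(series, horizon)
    return [(r + d) / 2 for r, d in zip(rec, dir_)]
```

Recursive forecasting reuses one model but can accumulate its own errors over the horizon, while direct forecasting avoids that at the cost of one model per step; averaging the two is a way to hedge between these behaviours.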

16.
This paper describes the M5 “Uncertainty” competition, the second of two parallel challenges of the latest M competition, aiming to advance the theory and practice of forecasting. The particular objective of the M5 “Uncertainty” competition was to accurately forecast the uncertainty distributions of the realized values of 42,840 time series that represent the hierarchical unit sales of the largest retail company in the world by revenue, Walmart. To do so, the competition required the prediction of nine different quantiles (0.005, 0.025, 0.165, 0.250, 0.500, 0.750, 0.835, 0.975, and 0.995), which can sufficiently describe the complete distributions of future sales. The paper provides details on the implementation and execution of the M5 “Uncertainty” competition, presents its results and the top-performing methods, and summarizes its major findings and conclusions. Finally, it discusses the implications of its findings and suggests directions for future research.
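The abstract does not state how the quantile forecasts were scored. As an illustration of how forecasts for the nine quantile levels could be evaluated, the sketch below uses the pinball (quantile) loss, a standard scoring rule for quantile forecasts. The function names and the dictionary layout are assumptions made for this example; only the quantile levels come from the abstract.

```python
# Quantile levels listed in the abstract.
M5_QUANTILES = [0.005, 0.025, 0.165, 0.250, 0.500, 0.750, 0.835, 0.975, 0.995]


def pinball_loss(actual, predicted_quantile, tau):
    """Pinball loss of one quantile forecast at level tau (0 < tau < 1)."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    diff = actual - predicted_quantile
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return tau * diff if diff >= 0 else (tau - 1.0) * diff


def mean_pinball_loss(actual, quantile_forecasts):
    """Average pinball loss over a dict mapping tau -> predicted quantile."""
    losses = [pinball_loss(actual, q, tau) for tau, q in quantile_forecasts.items()]
    return sum(losses) / len(losses)
```

For a high level such as 0.975, a forecast that falls below the realized value is penalized far more heavily than one that falls above it, which pushes the forecast toward the upper tail of the distribution.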

17.
The increasing penetration of intermittent renewable energy in power systems brings operational challenges. One way of supporting them is by enhancing the predictability of renewables through accurate forecasting. Convolutional Neural Networks (Convnets) provide a successful technique for processing space-structured multi-dimensional data. In our work, we propose the U-Convolutional model to predict hourly wind speeds for a single location using spatio-temporal data with multiple explanatory variables as an input. The U-Convolutional model is composed of a U-Net part, which synthesizes input information, and a Convnet part, which maps the synthesized data into a single-site wind prediction. We compare our approach with advanced Convnets, a fully connected neural network, and univariate models. We use time series from the Climate Forecast System Reanalysis as datasets and select temperature and u- and v-components of wind as explanatory variables. The proposed models are evaluated at multiple locations (totaling 181 target series) and multiple forecasting horizons. The results indicate that our proposal is promising for spatio-temporal wind speed prediction, with results that show competitive performance on both time horizons for all datasets.

18.
The main objective of the M5 competition, which focused on forecasting the hierarchical unit sales of Walmart, was to evaluate the accuracy and uncertainty of forecasting methods in the field to identify best practices and highlight their practical implications. However, can the findings of the M5 competition be generalized and exploited by retail firms to better support their decisions and operation? This depends on the extent to which M5 data is sufficiently similar to unit sales data of retailers operating in different regions selling different product types and considering different marketing strategies. To answer this question, we analyze the characteristics of the M5 time series and compare them with those of two grocery retailers, namely Corporación Favorita and a major Greek supermarket chain, using feature spaces. Our results suggest only minor discrepancies between the examined data sets, supporting the representativeness of the M5 data.

19.
Cooperation between different data owners may lead to an improvement in forecast quality, for instance by benefiting from spatiotemporal dependencies in geographically distributed time series. Due to business competitive factors and personal data protection concerns, however, said data owners might be unwilling to share their data. Interest in collaborative privacy-preserving forecasting is thus increasing. This paper analyzes the state-of-the-art and unveils several shortcomings of existing methods in guaranteeing data privacy when employing vector autoregressive models. The methods are divided into three groups: data transformation, secure multi-party computations, and decomposition methods. The analysis shows that state-of-the-art techniques have limitations in preserving data privacy, such as (i) the necessary trade-off between privacy and forecasting accuracy, empirically evaluated through simulations and real-world experiments based on solar data; and (ii) iterative model fitting processes, which reveal data after a number of iterations.

20.
We participated in the M4 competition for time series forecasting and here describe our methods for forecasting daily time series. We used an ensemble of five statistical forecasting methods and a method that we refer to as the correlator. Our retrospective analysis using the ground truth values published by the M4 organisers after the competition demonstrates that the correlator was responsible for most of our gains over the naïve constant forecasting method. We identify data leakage as one reason for its success, due partly to test data selected from different time intervals, and partly to quality issues with the original time series. We suggest that future forecasting competitions should provide actual dates for the time series so that some of these leakages could be avoided by participants.

