1.
《International Journal of Forecasting》2022,38(4):1346-1364
In this study, we present the results of the M5 “Accuracy” competition, which was the first of two parallel challenges in the latest M competition with the aim of advancing the theory and practice of forecasting. The main objective in the M5 “Accuracy” competition was to accurately predict 42,840 time series representing the hierarchical unit sales for the largest retail company in the world by revenue, Walmart. The competition required the submission of 30,490 point forecasts for the lowest cross-sectional aggregation level of the data, which could then be summed up accordingly to estimate forecasts for the remaining upward levels. We provide details of the implementation of the M5 “Accuracy” challenge, as well as the results and best performing methods, and summarize the major findings and conclusions. Finally, we discuss the implications of these findings and suggest directions for future research.
2.
《International Journal of Forecasting》2022,38(4):1555-1561
Machine learning (ML) methods are gaining popularity in the forecasting field, as they have shown strong empirical performance in the recent M4 and M5 competitions, as well as in several Kaggle competitions. However, understanding why and how these methods work well for forecasting is still at a very early stage, partly due to their complexity. In this paper, I present a framework for regression-based ML that provides researchers with a common language and abstraction to aid in their study. To demonstrate the utility of the framework, I show how it can be used to map and compare ML methods used in the M5 Uncertainty competition. I then describe how the framework can be used together with ablation testing to systematically study their performance. Lastly, I use the framework to provide an overview of the solution space in regression-based ML forecasting, identifying areas for further research.
3.
Periklis Gogas Theophilos Papadimitriou Anna Agrapetidou 《International Journal of Forecasting》2018,34(3):440-455
This paper presents a forecasting model of bank failures based on machine learning. The proposed methodology defines a linear decision boundary that separates the solvent banks from those that failed. This setup generates a novel alternative stress-testing tool. Our sample of 1443 U.S. banks includes all 481 banks that failed during the period 2007–2013. The set of explanatory variables is selected using a two-step feature selection procedure. The selected variables were then fed to a support vector machines forecasting model, through a training–testing learning process. The model exhibits a 99.22% overall forecasting accuracy and outperforms the well-established Ohlson’s score.
4.
《International Journal of Forecasting》2022,38(4):1519-1525
The M5 forecasting competition has provided strong empirical evidence that machine learning methods can outperform statistical methods: in essence, complex methods can be more accurate than simple ones. Regardless, this result challenges the flagship empirical result that led the forecasting discipline for the last four decades: keep methods sophisticatedly simple. Nevertheless, this was a first, and we can argue that this will not happen again. There has been a different winner in each forecasting competition. This inevitably raises the question: can a method win more than once (and should it be expected to)? Furthermore, we argue for the need to elaborate on the perks of competing methods, and what makes them winners?
5.
This article considers nine different predictive techniques, including state-of-the-art machine learning methods for forecasting corporate bond yield spreads with other input variables. We examine each method’s out-of-sample forecasting performance using two different forecast horizons: (1) the in-sample dataset over 2003–2007 is used for one-year-ahead and two-year-ahead forecasts of non-callable corporate bond yield spreads; and (2) the in-sample dataset over 2003–2008 is considered to forecast the yield spreads in 2009. Evaluations of forecasting accuracy have shown that neural network forecasts are superior to the other methods considered here in both the short and longer horizon. Furthermore, we visualize the determinants of yield spreads and find that a firm’s equity volatility is a critical factor in yield spreads.
6.
《International Journal of Forecasting》2020,36(4):1260-1289
This study uses innovative tools recently proposed in the statistical learning literature to assess the capability of standard exchange rate models to predict the exchange rate in the short and long runs. Our results show that statistical learning methods deliver remarkably good performance, outperforming the random walk in forecasting the exchange rate at different forecasting horizons, with the exception of the very short term (a period of one to two months). These results were robust across countries, time, and models. We then used these tools to compare the predictive capabilities of different exchange rate models and model specifications, and found that sticky price versions of the monetary model with an error correction specification delivered the best performance. We also explain the operation of the statistical learning models by developing measures of variable importance and analyzing the kind of relationship that links each variable with the outcome. This gives us a better understanding of the relationship between the exchange rate and economic fundamentals, which appears complex and characterized by strong non-linearities.
7.
《International Journal of Forecasting》2020,36(1):54-74
The M4 Competition follows on from the three previous M competitions, the purpose of which was to learn from empirical evidence both how to improve the forecasting accuracy and how such learning could be used to advance the theory and practice of forecasting. The aim of M4 was to replicate and extend the three previous competitions by: (a) significantly increasing the number of series, (b) expanding the number of forecasting methods, and (c) including prediction intervals in the evaluation process as well as point forecasts. This paper covers all aspects of M4 in detail, including its organization and running, the presentation of its results, the top-performing methods overall and by categories, its major findings and their implications, and the computational requirements of the various methods. Finally, it summarizes its main conclusions and states the expectation that its series will become a testing ground for the evaluation of new methods and the improvement of the practice of forecasting, while also suggesting some ways forward for the field.
8.
《International Journal of Forecasting》2022,38(4):1365-1385
This paper describes the M5 “Uncertainty” competition, the second of two parallel challenges of the latest M competition, aiming to advance the theory and practice of forecasting. The particular objective of the M5 “Uncertainty” competition was to accurately forecast the uncertainty distributions of the realized values of 42,840 time series that represent the hierarchical unit sales of the largest retail company in the world by revenue, Walmart. To do so, the competition required the prediction of nine different quantiles (0.005, 0.025, 0.165, 0.250, 0.500, 0.750, 0.835, 0.975, and 0.995) that can sufficiently describe the complete distributions of future sales. The paper provides details on the implementation and execution of the M5 “Uncertainty” competition, presents its results and the top-performing methods, and summarizes its major findings and conclusions. Finally, it discusses the implications of its findings and suggests directions for future research.
9.
Is it possible to predict malfeasance in public procurement? With the proliferation of e-procurement systems in the public sector, anti-corruption agencies and watchdog organizations have access to valuable sources of information with which to identify transactions that are likely to become troublesome and why. In this article, we discuss the promises and challenges of using machine learning models to predict inefficiency and corruption in public procurement. We illustrate this approach with a dataset with more than two million public procurement contracts in Colombia. We trained machine learning models to predict which of them will result in corruption investigations, a breach of contract, or implementation inefficiencies. We then discuss how our models can help practitioners better understand the drivers of corruption and inefficiency in public procurement. Our approach will be useful to governments interested in exploiting large administrative datasets to improve the provision of public goods, and it highlights some of the tradeoffs and challenges that they might face throughout this process.
10.
《International Journal of Forecasting》2021,37(4):1338-1354
In a low-dimensional linear regression setup, considering linear transformations/combinations of predictors does not alter predictions. However, when the forecasting technology either uses shrinkage or is nonlinear, it does. This is precisely the fabric of the machine learning (ML) macroeconomic forecasting environment. Pre-processing of the data translates to an alteration of the regularization – explicit or implicit – embedded in ML algorithms. We review old transformations and propose new ones, then empirically evaluate their merits in a substantial pseudo-out-of-sample exercise. It is found that traditional factors should almost always be included as predictors and moving average rotations of the data can provide important gains for various forecasting targets. Also, we note that while predicting directly the average growth rate is equivalent to averaging separate horizon forecasts when using OLS-based techniques, the latter can substantially improve on the former when regularization and/or nonparametric nonlinearities are involved.
11.
Solar energy is one of the fastest growing sources of electricity generation. Forecasting solar stock prices is important for investors and venture capitalists interested in the renewable energy sector. This paper uses tree-based machine learning methods to forecast the direction of solar stock prices. The feature set used in prediction includes a selection of well-known technical indicators, silver prices, silver price volatility, and oil price volatility. The solar stock price direction prediction accuracy of random forests, bagging, support vector machines, and extremely randomized trees is much higher than that of logit. For a forecast horizon of between 8 and 20 days, random forests, bagging, support vector machines, and extremely randomized trees achieve a prediction accuracy greater than 85%. Although not as prominent as technical indicators like MA200, WAD, and MA20, oil price volatility and silver price volatility are also important predictors. An investment portfolio trading strategy based on trading signals generated from the extremely randomized trees stock price direction prediction outperforms a simple buy and hold strategy. These results demonstrate the accuracy of using tree-based machine learning methods to forecast the direction of solar stock prices and add to the broader literature on using machine learning techniques to forecast stock prices.
12.
《International Journal of Forecasting》2020,36(1):37-53
Forecasters typically evaluate the performances of new forecasting methods by exploiting data from past forecasting competitions. Over the years, numerous studies have based their conclusions on such datasets, with mis-performing methods being unlikely to receive any further attention. However, it has been reported that these datasets might not be indicative, as they display many limitations. Since forecasting research is driven somewhat by data from forecasting competitions, it becomes vital to determine whether they are indeed representative of the reality or whether forecasters tend to over-fit their methods on a random sample of series. This paper uses the data from M4 as proportionate to the real world and compares its properties with those of past datasets commonly used in the literature as benchmarks in order to provide evidence on that question. The results show that many popular benchmarks of the past may indeed deviate from reality, and ways forward are discussed in response.
13.
Adam Richardson Thomas van Florenstein Mulder Tuğrul Vehbi 《International Journal of Forecasting》2021,37(2):941-948
Can machine-learning algorithms help central banks understand the current state of the economy? Our results say yes! We contribute to the emerging literature on forecasting macroeconomic variables using machine-learning algorithms by testing the nowcast performance of common algorithms in a full ‘real-time’ setting, that is, with real-time vintages of New Zealand GDP growth (our target variable) and real-time vintages of around 600 predictors. Our results show that machine-learning algorithms are able to significantly improve over a simple autoregressive benchmark and a dynamic factor model. We also show that machine-learning algorithms have the potential to add value to, and in one case improve on, the official forecasts of the Reserve Bank of New Zealand.
14.
Krzysztof Rybinski 《International Journal of Forecasting》2021,37(1):186-204
This article presents the first ever ranking of professional forecasters based on the predictive power of the narrative of their regular research reports. The ranking is generated by applying the fully automated four-step procedure – called NLP-ForRank – developed in this article. The four steps are data scraping from the internet; data preparation; application of the natural language processing (NLP) models; and evaluation of the predictive power of the NLP indexes with linear regression, Granger causality, vector autoregression (VAR), and random forest forecasting models. Applying this procedure to five large Polish banks and to many time series shows that including the constructed NLP indexes in the forecasting models lowers the forecast errors, and that the optimal model almost always contains the NLP index. The financial news agencies could consider publishing this type of ranking on a regular basis as it would foster accountability, transparency, and a more competitive environment in the professional forecasting industry.
15.
《International Journal of Forecasting》2019,35(1):390-407
Stock markets can be interpreted to a certain extent as prediction markets, since they can incorporate and represent the different opinions of investors who disagree on the implications of the available information on past and expected events and trade on their beliefs in order to achieve profits. Many forecast models have been developed for predicting the future state of stock markets, with the aim of using this knowledge in a trading strategy. This paper interprets the classification of the S&P500 open-to-close returns as a four-class problem. We compare four trading strategies based on a random forest classifier to a buy-and-hold strategy. The results show that predicting the classes with higher absolute returns, ‘strong positive’ and ‘strong negative’, contributed the most to the trading strategies on average. This finding can help shed light on the way in which using additional event outcomes for the classification beyond a simple upward or downward movement can potentially improve a trading strategy.
16.
《International Journal of Forecasting》2020,36(1):98-104
Several researchers (Armstrong, 2001; Clemen, 1989; Makridakis and Winkler, 1983) have shown empirically that combination-based forecasting methods are very effective in real world settings. This paper discusses a combination-based forecasting approach that was used successfully in the M4 competition. The proposed approach was evaluated on a set of 100K time series across multiple domain areas with varied frequencies. The point forecasts submitted finished fourth based on the overall weighted average (OWA) error measure and second based on the symmetric mean absolute percent error (sMAPE).
17.
This study explores whether the relationship between Japanese yen futures returns and the corresponding equity returns is affected by the states of psychological anchors of the currency and stock markets. This study employs the linear-regression-based tree model (a machine learning method) to account for the framing effect of the anchors. The empirical results of the linear-regression-based tree model show that the currency price behaviors of momentum and reversal, and prediction by equity markets, vary with the anchors. Empirical evidence also indicates that the linear-regression-based tree model outperforms the OLS model based on the estimation results and out-of-sample forecasting. The forecasting performance of the linear-regression-based tree model can be improved along with an increase in the forecasting period.
18.
《International Journal of Forecasting》2020,36(2):248-266
Since the introduction of the Basel II Accord, and given its huge implications for credit risk management, the modeling and prediction of the loss given default (LGD) have become increasingly important tasks. Institutions which use their own LGD estimates can build either simpler or more complex methods. Simpler methods are easier to implement and more interpretable, but more complex methods promise higher prediction accuracies. Using a proprietary data set of 1,184 defaulted corporate leases in Germany, this study explores different parametric, semi-parametric and non-parametric approaches that attempt to predict the LGD. By conducting the analyses for different information sets, we study how the prediction accuracy changes depending on the set of information that is available. Furthermore, we use a variable importance measure to identify the input variables that have the greatest effects on the LGD prediction accuracy for each method. In this regard, we provide new insights on the characteristics of leasing LGDs. We find that (1) more sophisticated methods, especially the random forest, lead to remarkable increases in the prediction accuracy; (2) updating information improves the prediction accuracy considerably; and (3) the outstanding exposure at default, an internal rating, asset types and lessor industries turn out to be important drivers of accurate LGD predictions.
19.
《International Journal of Forecasting》2022,38(3):895-909
The continuous growth of available football data presents unprecedented research opportunities for a better understanding of football dynamics. While many research works focus on predicting which team will win a match, other interesting questions, such as whether both teams will score in a game, are still unexplored and have gained momentum with the rise of betting markets. With this in mind, we investigate the following research questions in this paper: “How difficult is the ‘both teams to score’ (BTTS) prediction problem?”, “Are machine learning classifiers capable of predicting BTTS better than bookmakers?”, and “Are machine learning classifiers useful for devising profitable betting strategies in the BTTS market?”. We collected historical football data, extracted groups of features to represent the teams’ strengths, and fed these to state-of-the-art classification models. We performed a comprehensive set of experiments and showed that, although hard to predict, in some scenarios it is possible to outperform bookmakers, which are robust baselines per se. More importantly, in some cases it is possible to beat the market and devise profitable strategies based on machine learning algorithms. The results are encouraging and, besides shedding light on the problem, may provide novel insights for all kinds of football stakeholders.
20.
We review the results of six forecasting competitions based on the online data science platform Kaggle, which have been largely overlooked by the forecasting community. In contrast to the M competitions, the competitions reviewed in this study feature daily and weekly time series with exogenous variables, business hierarchy information, or both. Furthermore, the Kaggle data sets all exhibit higher entropy than the M3 and M4 competitions, and they are intermittent. In this review, we confirm the conclusion of the M4 competition that ensemble models using cross-learning tend to outperform local time series models and that gradient boosted decision trees and neural networks are strong forecast methods. Moreover, we present insights regarding the use of external information and validation strategies, and discuss the impacts of data characteristics on the choice of statistics or machine learning methods. Based on these insights, we construct nine ex-ante hypotheses for the outcome of the M5 competition to allow empirical validation of our findings.
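Two of the abstracts listed above name evaluation measures without defining them: entry 16 reports a ranking by sMAPE, and entry 8 asks for forecasts at nine quantiles, which are typically scored with a pinball (quantile) loss. The sketch below is only an illustration of those two textbook formulas. The function names, the zero-denominator handling, and the sample numbers are assumptions made here, and the pinball loss shown is the plain unscaled, unweighted form rather than the scaled and weighted score the M5 competition itself reported.

```python
# Illustrative sketch only: function names and sample data are hypothetical.

# The nine quantile levels listed in the M5 "Uncertainty" abstract (entry 8).
M5_QUANTILES = [0.005, 0.025, 0.165, 0.250, 0.500, 0.750, 0.835, 0.975, 0.995]


def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent."""
    terms = []
    for y, f in zip(actual, forecast):
        denom = abs(y) + abs(f)
        # Assumption: a point where both values are zero counts as zero error.
        terms.append(0.0 if denom == 0 else 2.0 * abs(y - f) / denom)
    return 100.0 * sum(terms) / len(terms)


def pinball_loss(actual, quantile_forecast, q):
    """Mean pinball loss of a forecast for quantile level q (0 < q < 1)."""
    losses = []
    for y, f in zip(actual, quantile_forecast):
        diff = y - f
        # Under-forecasts are weighted by q, over-forecasts by (1 - q).
        losses.append(max(q * diff, (q - 1.0) * diff))
    return sum(losses) / len(losses)


if __name__ == "__main__":
    sales = [100.0, 200.0]
    point = [110.0, 180.0]
    print(f"sMAPE: {smape(sales, point):.3f}%")
    for q in M5_QUANTILES:
        print(q, pinball_loss(sales, point, q))
```

For a high quantile such as 0.9, a forecast that falls below the realized value is penalized nine times more heavily than one that overshoots by the same amount, which is what pushes a quantile forecast toward the intended part of the distribution.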