Similar articles (20 results)
1.
The winning machine learning methods of the M5 Accuracy competition demonstrated high levels of forecast accuracy compared to the top-performing benchmarks in the history of the M-competitions. Yet, large-scale adoption is hampered due to the significant computational requirements to model, tune, and train these state-of-the-art algorithms. To overcome this major issue, we discuss the potential of transfer learning (TL) to reduce the computational effort in hierarchical forecasting and provide a proof of concept that TL can be applied to the M5 top-performing methods. We demonstrate our easy-to-use TL framework on the recursive store-level LightGBM models of the M5 winning method and attain similar levels of forecast accuracy with roughly 25% less training time. Our findings provide evidence for a novel application of TL to facilitate the practical applicability of the M5 winning methods in large-scale settings with hierarchically structured data.
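As a rough illustration of the transfer-learning idea (not the authors' actual LightGBM pipeline), the sketch below pretrains a toy linear model on pooled "source" data and then fine-tunes it for only a few epochs on a single store's "target" data; all data, names, and the choice of model are hypothetical.

```python
# Hypothetical sketch: pretrain once on pooled data, fine-tune cheaply per store.

def train(weights, data, lr=0.01, epochs=100):
    """Fit y = w0 + w1*x by per-sample gradient descent on squared error."""
    w0, w1 = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w0 + w1 * x) - y
            w0 -= lr * err
            w1 -= lr * err * x
    return (w0, w1)

# Source task: pooled sales from many stores, roughly y = 2x (illustrative).
source = [(x, 2.0 * x) for x in range(1, 6)]
# Target task: one store with slightly different dynamics, y = 2.2x.
target = [(x, 2.2 * x) for x in range(1, 6)]

pretrained = train((0.0, 0.0), source, epochs=200)   # expensive, done once
fine_tuned = train(pretrained, target, epochs=20)    # cheap per-store step
```

Because fine-tuning starts near a good solution, a fraction of the original training budget suffices, which is the source of the reported training-time savings.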

2.
The M5 competition follows the previous four M competitions, whose purpose is to learn from empirical evidence how to improve forecasting performance and advance the theory and practice of forecasting. M5 focused on a retail sales forecasting application with the objective to produce the most accurate point forecasts for 42,840 time series that represent the hierarchical unit sales of the largest retail company in the world, Walmart, as well as to provide the most accurate estimates of the uncertainty of these forecasts. Hence, the competition consisted of two parallel challenges, namely the Accuracy and Uncertainty forecasting competitions. M5 extended the results of the previous M competitions by: (a) significantly expanding the number of participating methods, especially those in the category of machine learning; (b) evaluating the performance of the uncertainty distribution along with point forecast accuracy; (c) including exogenous/explanatory variables in addition to the time series data; (d) using grouped, correlated time series; and (e) focusing on series that display intermittency. This paper describes the background, organization, and implementations of the competition, and it presents the data used and their characteristics. Consequently, it serves as introductory material to the results of the two forecasting challenges to facilitate their understanding.

3.
This paper proposes a hybrid ensemble forecasting methodology that integrates empirical mode decomposition (EMD), long short-term memory (LSTM) and extreme learning machine (ELM) for monthly biofuel (a typical agriculture-related energy) production, based on the decomposition-reconstruction-ensemble principle. The proposed methodology involves four main steps: data decomposition via EMD, component reconstruction via a fine-to-coarse (FTC) method, individual prediction via LSTM and ELM algorithms, and ensemble prediction via a simple addition (ADD) method. For illustration and verification, monthly biofuel production data for the USA are used as the sample data, and the empirical results indicate that the proposed hybrid ensemble forecasting model statistically outperforms all considered benchmark models in terms of forecasting accuracy. This suggests that the proposed EMD-LSTM-ELM methodology, based on the decomposition-reconstruction-ensemble principle, is a competitive model for predicting biofuel production.
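A minimal sketch of the decomposition-reconstruction-ensemble principle (decompose, predict components, add). Here a moving average stands in for EMD and naive forecasters stand in for LSTM/ELM; these substitutions are assumptions for illustration, not the paper's method.

```python
# Toy decomposition-ensemble: trend + residual, forecast each, then ADD.

def moving_average(series, window=3):
    return [sum(series[max(0, i - window + 1): i + 1]) /
            len(series[max(0, i - window + 1): i + 1])
            for i in range(len(series))]

def decompose(series):
    trend = moving_average(series)
    residual = [y - t for y, t in zip(series, trend)]
    return trend, residual

def forecast_trend(trend):        # drift: last value plus last step change
    return trend[-1] + (trend[-1] - trend[-2])

def forecast_residual(residual):  # mean of the roughly stationary remainder
    return sum(residual) / len(residual)

series = [10, 12, 14, 16, 18, 20]
trend, residual = decompose(series)
# Ensemble step: simple addition (ADD) of the component forecasts.
forecast = forecast_trend(trend) + forecast_residual(residual)
```

The point of the decomposition is that each component is easier to model than the raw series; the ADD step reassembles the component forecasts into a forecast of the original series.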

4.
Organisational Learning (OL) is essential for the survival of an organisation and has led to a significant amount of conceptual and empirical studies. However, no attempt has yet been made to track the overall evolution of OL literature along with the inter-related concepts of learning organisation and organisational learning orientation. Therefore, the present study attempts to fill this gap and track the interdisciplinary flow of knowledge by applying a structural methodology called Systematic Literature Network Analysis (SLNA). The results reveal four main areas of investigation within the field: i) the fundamentals of OL; ii) OL in relation to managerial and economic variables; iii) management of learning organisation; iv) learning orientation in relation to managerial and economic variables. Furthermore, this review contributes by arranging the findings into a theoretical framework which is termed organisational learning chain. Based on the co-analysis of main themes and key concepts detected, the framework integrates and highlights the factors that influence learning performance in and by organisations. Finally, several further research avenues are discussed, and the benefits of the applied review methodology are highlighted.

5.
Given the advances in online data acquisition systems, statistical learning models are increasingly used to forecast wind speed. In electricity markets, wind farm production forecasts are needed for the day-ahead, intra-day, and real-time markets. In this work, we use a spatiotemporal model that leverages wind dynamics to forecast wind speed. Using a priori knowledge of the wind direction, we propose a maximum likelihood estimate of the inverse covariance matrix regularized with a hierarchical sparsity-inducing penalty. The resulting inverse covariance estimate not only exhibits the benefits of a sparse estimator, but also enables meaningful sparse structures by considering wind direction. A proximal method is used to solve the underlying optimization problem. The proposed methodology is used to forecast six-hour-ahead wind speeds in 20-minute time intervals for a case study in Texas. We compare our method with a number of other statistical methods. Prediction performance measures and the Diebold–Mariano test show the potential of the proposed method, specifically when reasonably accurate estimates of the wind directions are available.
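The paper's hierarchical sparsity-inducing penalty is more elaborate, but the basic building block of any proximal method for L1-type penalties is the soft-thresholding operator, sketched here as a standalone illustration (a standard result, not code from the paper).

```python
def soft_threshold(x, lam):
    """Proximal operator of lam * |.|: shrinks x toward zero by lam,
    setting small values exactly to zero — the source of sparsity."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

# Applied elementwise, it zeroes out weak entries of an estimate.
row = [3.0, -0.5, -3.0, 0.2]
sparse_row = [soft_threshold(v, 1.0) for v in row]
```

Entries with magnitude below the penalty level are set exactly to zero, which is why proximal iterations produce genuinely sparse inverse covariance estimates rather than merely small entries.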

6.
Since the advent of the horseshoe priors for regularisation, global–local shrinkage methods have proved to be a fertile ground for the development of Bayesian methodology in machine learning, specifically for high-dimensional regression and classification problems. They have achieved remarkable success in computation and enjoy strong theoretical support. Most of the existing literature has focused on the linear Gaussian case; for which systematic surveys are available. The purpose of the current article is to demonstrate that the horseshoe regularisation is useful far more broadly, by reviewing both methodological and computational developments in complex models that are more relevant to machine learning applications. Specifically, we focus on methodological challenges in horseshoe regularisation in non-linear and non-Gaussian models, multivariate models and deep neural networks. We also outline the recent computational developments in horseshoe shrinkage for complex models along with a list of available software implementations that allows one to venture out beyond the comfort zone of the canonical linear regression problems.
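In the canonical normal-means setting, the horseshoe's global–local behaviour is summarised by the shrinkage weight kappa = 1 / (1 + tau^2 * lambda^2), where tau is the global and lambda the local scale. A small sketch of this standard result (not specific to the survey above):

```python
def shrinkage_factor(lam, tau):
    """Horseshoe shrinkage weight: under a normal means model,
    the posterior mean is approximately (1 - kappa) * y."""
    return 1.0 / (1.0 + tau**2 * lam**2)

# Small local scale -> kappa near 1: the coordinate is shrunk hard ("noise").
noise_k = shrinkage_factor(lam=0.1, tau=1.0)
# Large local scale -> kappa near 0: almost no shrinkage ("signal").
signal_k = shrinkage_factor(lam=100.0, tau=1.0)
```

The heavy-tailed half-Cauchy prior on lambda is what lets kappa concentrate near both 0 and 1 (the "horseshoe" shape), shrinking noise aggressively while leaving large signals nearly untouched.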

7.
Machine learning models are boosting Artificial Intelligence applications in many domains, such as automotive, finance and health care. This is mainly due to their advantage, in terms of predictive accuracy, with respect to classic statistical models. However, machine learning models are much less explainable: less transparent, less interpretable. This paper proposes to improve machine learning models by means of a model selection methodology, based on Lorenz Zonoids, which allows models to be compared in terms of predictive accuracy, leading to a selected model that maintains accuracy while improving explainability. We illustrate our proposal by means of simulated datasets and a real credit scoring problem. The analysis of the former shows that the proposal improves on alternative methods based on the AUROC. The analysis of the latter shows that the proposal leads to models made up of two or three relevant variables that measure the profitability and the financial leverage of the companies applying for credit.

8.
We review the results of six forecasting competitions based on the online data science platform Kaggle, which have been largely overlooked by the forecasting community. In contrast to the M competitions, the competitions reviewed in this study feature daily and weekly time series with exogenous variables, business hierarchy information, or both. Furthermore, the Kaggle data sets all exhibit higher entropy than the M3 and M4 competitions, and they are intermittent. In this review, we confirm the conclusion of the M4 competition that ensemble models using cross-learning tend to outperform local time series models and that gradient boosted decision trees and neural networks are strong forecast methods. Moreover, we present insights regarding the use of external information and validation strategies, and discuss the impacts of data characteristics on the choice of statistical or machine learning methods. Based on these insights, we construct nine ex-ante hypotheses for the outcome of the M5 competition to allow empirical validation of our findings.

9.
We deal with general mixtures of hierarchical models of the form m(x) = ∫ f(x | θ) g(θ) dθ, where g(θ) and m(x) are called the mixing and mixed (or compound) densities respectively, and θ is called the mixing parameter. The usual statistical application of these models emerges when we have data xi, i = 1, …, n with densities f(xi | θi) for given θi, and the θi are independent with common density g(θ). For a certain well-known class of densities f(x | θ), we present a sample-based approach to reconstruct g(θ). We first provide theoretical results and then we use, in an empirical Bayes spirit, the first four moments of the data to estimate the first four moments of g(θ). By using sampling techniques we proceed in a fully Bayesian fashion to obtain any posterior summaries of interest. Simulations which investigate the operating characteristics of our proposed methodology are presented. We illustrate our approach using data from mixed Poisson and mixed exponential densities.
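For the mixed Poisson case mentioned above, the moment identities E[X] = E[θ] and Var(X) = E[θ] + Var(θ) let the first two moments of the mixing density g be estimated directly from the data, in the empirical-Bayes spirit the abstract describes (the paper uses four moments; only two are sketched here, with illustrative data).

```python
# Moment-based estimate of the mixing density's mean and variance
# for a mixed Poisson model m(x) = ∫ Poisson(x | θ) g(θ) dθ.

def mixing_moments(data):
    n = len(data)
    mean_x = sum(data) / n
    var_x = sum((x - mean_x) ** 2 for x in data) / n
    # E[X] = E[theta];  Var(X) = E[theta] + Var(theta)
    return mean_x, var_x - mean_x

# Over-dispersed counts (variance > mean) signal a non-degenerate g.
data = [0, 1, 1, 2, 2, 3, 5, 8]
m, v = mixing_moments(data)
```

A variance estimate v > 0 is evidence of over-dispersion relative to a pure Poisson model, i.e. of genuine heterogeneity in θ across observations.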

10.
This article introduces the winning method of the M5 Accuracy competition. The method simply averages the results of multiple base forecasting models constructed via partial pooling of multi-level data. All base forecasting models, adopting either direct or recursive multi-step forecasting, are trained with the machine learning technique LightGBM on three different levels of data pools. In the competition, this simple averaging of multiple direct and recursive forecasting models, called DRFAM, exploited the complementary effects between direct and recursive multi-step forecasting of multi-level product sales to improve both accuracy and robustness.
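The contrast between recursive and direct multi-step forecasting, and the DRFAM-style average of the two, can be sketched on a toy model. The one-step dynamic and the coefficients below are illustrative assumptions, not fitted LightGBM models.

```python
# Recursive vs. direct multi-step forecasting, then a simple average (DRFAM idea).

def recursive_forecast(y_last, a, h):
    """Apply a one-step model y_{t+1} = a * y_t repeatedly, h times."""
    y = y_last
    for _ in range(h):
        y = a * y
    return y

def direct_forecast(y_last, a_h):
    """A dedicated h-step model mapping y_t straight to y_{t+h}."""
    return a_h * y_last

h = 3
a = 0.9        # hypothetical one-step coefficient
a_h = 0.70     # hypothetical coefficient of a direct 3-step model
rec = recursive_forecast(100.0, a, h)   # 0.9**3 * 100
dir_ = direct_forecast(100.0, a_h)
averaged = (rec + dir_) / 2
```

Recursive models reuse one estimator but compound its errors over the horizon; direct models avoid compounding but need a separate estimator per horizon. Averaging the two is a cheap way to hedge both failure modes.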

11.
A decomposition clustering ensemble (DCE) learning approach is proposed for forecasting foreign exchange rates by integrating the variational mode decomposition (VMD), the self-organizing map (SOM) network, and the kernel extreme learning machine (KELM). First, the exchange rate time series is decomposed into N subcomponents by the VMD method. Second, each subcomponent series is modeled by the KELM. Third, the SOM neural network is introduced to cluster the subcomponent forecasting results of the in-sample dataset to obtain cluster centers. Finally, each cluster's ensemble weight is estimated by another KELM, and the final forecasting results are obtained by the corresponding clusters' ensemble weights. The empirical results illustrate that our proposed DCE learning approach can significantly improve forecasting performance, and statistically outperform some other benchmark models in directional and level forecasting accuracy.

12.
The efficient flow of goods and services involves addressing multilevel forecast questions, and careful consideration when aggregating or disaggregating hierarchical estimates. Assessing all possible aggregation alternatives helps to determine the statistically most accurate way of consolidating multilevel forecasts. However, doing so in a multilevel and multiproduct supply chain may prove to be a very computationally intensive and time-consuming task. In this paper, we present a new, two-level oblique linear discriminant tree model, which identifies the optimal hierarchical forecast technique for a given hierarchical database in a very time-efficient manner. We induced our model from a real-world dataset, and it separates all historical time series into the four aggregation mechanisms considered. The separation process is a function of both the positive and negative correlation groups' variances at the lowest level of the hierarchical datasets. Our primary contributions are: (1) establishing a clear-cut relationship between the correlation metrics at the lowest level of the hierarchy and the optimal aggregation mechanism for a product/service hierarchy, and (2) developing an analytical model for personalized forecast aggregation decisions, based on characteristics of a hierarchical dataset.
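Two of the aggregation mechanisms such a model chooses between can be sketched on a toy two-level hierarchy (total = A + B). The forecasts and shares below are illustrative numbers, not the paper's data.

```python
# Bottom-up vs. top-down consolidation of hierarchical forecasts.

def bottom_up(child_forecasts):
    """Forecast the children, sum to obtain the total."""
    return sum(child_forecasts.values())

def top_down(total_forecast, historical_shares):
    """Forecast the total, split it by historical proportions."""
    return {k: total_forecast * s for k, s in historical_shares.items()}

children = {"A": 60.0, "B": 45.0}
total_bu = bottom_up(children)                        # 105.0
split_td = top_down(100.0, {"A": 0.6, "B": 0.4})      # A: 60.0, B: 40.0
```

Which mechanism is statistically more accurate depends on the correlation structure at the bottom level, which is exactly the signal the discriminant tree uses to pick among the alternatives.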

13.
In this paper, I interpret a time series spatial model (T-SAR) as a constrained structural vector autoregressive (SVAR) model. Based on these restrictions, I propose a minimum distance approach to estimate the (row-standardized) network matrix and the overall network influence parameter of the T-SAR from the SVAR estimates. I also develop a Wald-type test to assess the distance between these two models. To implement the methodology, I discuss machine learning methods as one possible identification strategy of SVAR models. Finally, I illustrate the methodology through an application to volatility spillovers across major stock markets using daily realized volatility data for 2004–2018.
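The "row-standardized" network matrix mentioned above is a standard construction in spatial econometrics: each row of the raw weights matrix is divided by its row sum so that rows sum to one. A minimal sketch (generic convention, not the paper's estimator):

```python
# Row-standardize a spatial/network weights matrix W.

def row_standardize(W):
    out = []
    for row in W:
        s = sum(row)
        out.append([w / s if s else 0.0 for w in row])
    return out

W = row_standardize([[0, 1, 1],
                     [1, 0, 0],
                     [2, 2, 0]])
```

Row-standardization makes the network influence parameter interpretable as the weight on a weighted average of neighbors, which is what allows it to be compared across rows and recovered from the SVAR estimates.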

14.
Fraud problems in loan application assessment cause significant losses for finance companies worldwide, and much research has focused on machine learning methods to improve the efficacy of fraud detection in some financial domains. However, diverse information falsification in individual fraud remains one of the most challenging problems in loan applications. To this end, we conducted an empirical study to explore the relationships between various fraud types and analyzed the factors influencing information fabrication. Weak relationships exist among different falsification types, and some essential factors play the same roles across fraud types, while others have varied or opposing effects. Based on this finding, we propose a novel hierarchical multi-task learning approach to refine fraud-detection systems. Specifically, we first developed a hierarchical fraud category method to break down this problem into several subtasks according to the information types falsified by customers, reducing the difficulty of fraud identification. Second, a heterogeneous network with a meta-path-based random walk and heterogeneous skip-gram model can solve the representation learning problem, given the sophisticated relationships among the applicants' information. Furthermore, the final subtasks can be predicted using a multi-task learning approach with two prediction layers. The first layer provides the probabilities of general fraud categories as auxiliary information for the second layer, which is for specific subtask prediction. Finally, we conducted extensive experiments based on a real-world dataset to demonstrate the effectiveness of the proposed approach.

15.
One of the most successful forecasting machine learning (ML) procedures is random forest (RF). In this paper, we propose a new mixed RF approach for modeling departures from linearity that helps identify (i) explanatory variables with nonlinear impacts, (ii) threshold values, and (iii) the closest parametric approximation. The methodology is applied to weekly forecasts of gasoline prices, cointegrated with international oil prices and exchange rates. Recent specifications for nonlinear error correction (NEC) models include threshold autoregressive models (TAR) and double-threshold smooth transition autoregressive (STAR) models. We propose a new mixed RF model specification strategy and apply it to the determinants of weekly prices of the Spanish gasoline market from 2010 to 2019. In particular, the mixed RF is able to identify nonlinearities in both the error correction term and the rate of change of oil prices. It provides the best weekly gasoline price forecasting performance and supports the logistic error correction model (ECM) approximation.
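The threshold-identification step (ii) can be illustrated with the basic mechanism inside a single regression-tree split, the building block of an RF: scan candidate split points and keep the one minimizing the within-regime sum of squared errors. This is a generic sketch with made-up data, not the paper's mixed RF.

```python
# Threshold detection via an exhaustive single-split (tree-stump) search.

def sse(ys):
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def best_threshold(xs, ys):
    best_c, best_loss = None, float("inf")
    for c in sorted(set(xs))[:-1]:          # candidate split points
        left = [y for x, y in zip(xs, ys) if x <= c]
        right = [y for x, y in zip(xs, ys) if x > c]
        loss = sse(left) + sse(right)
        if loss < best_loss:
            best_c, best_loss = c, loss
    return best_c

# A step function with a jump after x = 3 is recovered exactly.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
threshold = best_threshold(xs, ys)
```

A forest of such splits is what lets the mixed RF localize the threshold values that a parametric TAR or STAR specification would then formalize.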

16.
Appropriate modelling of Likert‐type items should account for the scale level and the specific role of the neutral middle category, which is present in most Likert‐type items that are in common use. Powerful hierarchical models that account for both aspects are proposed. To avoid biased estimates, the models separate the neutral category when modelling the effects of explanatory variables on the outcome. The main model that is propagated uses binary response models as building blocks in a hierarchical way. It has the advantage that it can be easily extended to include response style effects and non‐linear smooth effects of explanatory variables. By simple transformation of the data, available software for binary response variables can be used to fit the model. The proposed hierarchical model can be used to investigate the effects of covariates on single Likert‐type items and also for the analysis of a combination of items. For both cases, estimation tools are provided. The usefulness of the approach is illustrated by applying the methodology to a large data set.
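One plausible version of the "simple transformation of the data" for a 5-point scale with neutral category 3 is a hierarchy of binary responses: first neutral vs. not, then direction, then extremity. The exact decomposition in the paper may differ; this sketch only illustrates the general idea of feeding each stage to standard binary-response software.

```python
# Hierarchical binarisation of a 5-point Likert response (neutral = 3).

def hierarchical_binaries(response, neutral=3):
    stages = {"non_neutral": int(response != neutral)}
    if response != neutral:
        stages["positive"] = int(response > neutral)   # direction stage
        stages["extreme"] = int(response in (1, 5))    # extremity stage
    return stages

b4 = hierarchical_binaries(4)   # somewhat positive, not extreme
b3 = hierarchical_binaries(3)   # neutral: later stages are not observed
```

Conditioning each stage on the previous one is what lets the neutral category be modelled separately, avoiding the bias that comes from forcing it onto the ordinal continuum.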

17.
N.J. Smith &amp; A.P. Sage, Socio, 1973, 7(5): 545-569
A critical problem in urban modeling is the validation of system models and identification of system parameters within an assumed structure. This paper applies recent developments in system identification in hierarchical structure to identification of system parameters for two models of urban dynamics.

18.
Deep neural networks and gradient boosted tree models have swept across the field of machine learning over the past decade, producing across-the-board advances in performance. The ability of these methods to capture feature interactions and nonlinearities makes them exceptionally powerful and, at the same time, prone to overfitting, leakage, and a lack of generalization in domains with target non-stationarity and collinearity, such as time-series forecasting. We offer guidance to address these difficulties and provide a framework that maximizes the chances of predictions that generalize well and deliver state-of-the-art performance. The techniques we offer for cross-validation, augmentation, and parameter tuning have been used to win several major time-series forecasting competitions—including the M5 Forecasting Uncertainty competition and the Kaggle COVID19 Forecasting series—and, with the proper theoretical grounding, constitute the current best practices in time-series forecasting.
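One of the cross-validation techniques alluded to, used to avoid leakage in time-series settings, is expanding-window validation: each fold trains only on data strictly preceding its test window. A generic sketch (the competition-winning setups are more elaborate):

```python
# Expanding-window time-series cross-validation splits.

def expanding_window_splits(n, initial, horizon):
    """Yield (train_indices, test_indices); training data always precedes
    the test window, so future information never leaks into the past."""
    splits = []
    end = initial
    while end + horizon <= n:
        splits.append((list(range(end)), list(range(end, end + horizon))))
        end += horizon
    return splits

splits = expanding_window_splits(n=10, initial=4, horizon=2)
# Folds: train [0..3]/test [4,5]; train [0..5]/test [6,7]; train [0..7]/test [8,9]
```

Compared with random k-fold splits, this scheme mimics deployment conditions under target non-stationarity, which is why it is the standard choice for tuning boosted trees and neural networks on forecasting tasks.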

19.
We present our solution for the M5 Uncertainty competition. Our solution ranked sixth out of 909 submissions across all hierarchical levels and ranked first for prediction at the finest level of granularity (product-store sales, i.e. SKUs). The model combines a multi-stage state-space model and Monte Carlo simulations to generate the forecasting scenarios (trajectories). Observed sales are modelled with negative binomial distributions to represent discrete over-dispersed sales. Seasonal factors are handcrafted and modelled with linear coefficients that are calculated at the store-department level.
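The negative binomial choice can be motivated with its basic properties: unlike the Poisson, its variance exceeds its mean, matching over-dispersed sales counts. A standalone sketch of the pmf (standard textbook parameterisation, not the solution's state-space code):

```python
import math

def nb_pmf(k, r, p):
    """P(X = k) for the number of failures before the r-th success:
    C(k + r - 1, k) * p**r * (1 - p)**k."""
    return math.comb(k + r - 1, k) * (p ** r) * ((1 - p) ** k)

r, p = 2, 0.5
mean = r * (1 - p) / p          # 2.0
var = r * (1 - p) / p ** 2      # 4.0 > mean: over-dispersed
total = sum(nb_pmf(k, r, p) for k in range(200))   # pmf sums to ~1
```

Sampling sales trajectories from such a distribution, rather than producing a single point forecast, is what lets the Monte Carlo stage deliver the full predictive quantiles the Uncertainty track scored.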

20.
We consider a semiparametric method to estimate logistic regression models with both covariates and the outcome variable missing, and propose two new estimators. The first, which is based solely on the validation set, is an extension of the validation likelihood estimator of Breslow and Cain (Biometrika 75:11–20, 1988). The second is a joint conditional likelihood estimator based on the validation and non-validation data sets. Both estimators are semiparametric as they do not require any model assumptions regarding the missing data mechanism nor the specification of the conditional distribution of the missing covariates given the observed covariates. The asymptotic distribution theory is developed under the assumption that all covariate variables are categorical. The finite-sample properties of the proposed estimators are investigated through simulation studies showing that the joint conditional likelihood estimator is the most efficient. A cable TV survey data set from Taiwan is used to illustrate the practical use of the proposed methodology.
