Similar Documents
20 similar documents found (search time: 31 ms).
1.
Recently, Feng-Jeng (Qual Quant 42:417–426, 2008) proposed the nested estimation procedure as a practical alternative for dealing with the problem of multicollinearity. Although the nested estimation procedure promises to avoid multicollinearity, it may also discard important information by eliminating variables. We present another alternative, called the raise method, which keeps all the information and can therefore be highly recommended in some cases. We apply our proposal to a known example and compare the results with those of the nested estimation procedure, ridge regression and principal components.
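For orientation, the following minimal sketch (not the raise method itself, which is the authors' proposal) illustrates how one of the comparison methods, ridge regression, stabilises coefficients on deliberately collinear data; the synthetic variables and the shrinkage parameter alpha are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)          # nearly collinear with x1
y = 3 + 1.5 * x1 + 0.5 * x2 + rng.normal(size=n)
X = np.column_stack([x1, x2])

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)                # alpha chosen only for illustration

print("OLS coefficients:  ", ols.coef_)            # typically unstable under collinearity
print("Ridge coefficients:", ridge.coef_)          # shrunk toward more stable values
```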

2.
The purpose of this logistics research methods article is to introduce and empirically test correlated components regression (CCR) as a new statistical technique that improves accuracy and validity in testing logistics theoretical models and hypothesised relationships. The current study uses CCR analysis as a technique to address multicollinearity. Customer satisfaction data on parcel carriers are analysed using both CCR and multiple regression. To determine the better of the two regression models, cross-validation R2 values are used. In addition, the standardised beta coefficients from both methods are compared to assess the possible impact of high levels of multicollinearity. The findings suggest that CCR has a significantly higher cross-validation R2 value and is therefore judged the better of the two approaches.
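CCR itself is not, to our knowledge, available in standard Python libraries, so the sketch below only illustrates the model-comparison step described above: cross-validated R2 values for two competing regressions on synthetic data, with ridge regression standing in for CCR purely as a placeholder.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 150
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + rng.normal(scale=0.2, size=n)      # correlated satisfaction drivers
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 2 + x1 + x2 + 0.5 * x3 + rng.normal(size=n)

# Compare the two approaches by mean cross-validated R2, as in the article's model comparison.
for name, model in [("multiple regression", LinearRegression()),
                    ("ridge (placeholder for CCR)", Ridge(alpha=5.0))]:
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: cross-validated R^2 = {r2:.3f}")
```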

3.
The detection of multicollinearity in econometric models is usually based on the so-called condition number (CN) of the data matrix X, i.e. the largest condition index. However, the computation of the CN gives misleading results in particular cases, and many commercial computer packages produce an inflated CN even in cases of spurious multicollinearity, i.e. even if no collinearity exists when the explanatory variables themselves are considered. This is due to the very low total variation of some columns of the transformed data matrix used to compute the CN. On the other hand, we may face the problem of latent multicollinearity, which can be revealed by additionally computing a revised CN. With all this in mind, we characterise the ill-conditioned situations and suggest some practical rules of thumb for facing such problems with a single diagnostic in a fairly simple procedure. It is noted that this procedure is not mentioned in the relevant literature.
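A minimal sketch of the underlying diagnostic: the condition indices (with the condition number as their maximum) of a column-scaled data matrix, computed from its singular values; the synthetic data and the unit-length scaling convention are assumptions.

```python
import numpy as np

def condition_indices(X):
    """Condition indices of a data matrix after scaling each column to unit length."""
    Xs = X / np.linalg.norm(X, axis=0)       # column scaling (unit length)
    s = np.linalg.svd(Xs, compute_uv=False)  # singular values, in descending order
    return s[0] / s                          # the largest index is the condition number

rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.01, size=100)   # nearly collinear with x1
X = np.column_stack([np.ones(100), x1, x2])  # include the intercept column
print(condition_indices(X))                  # a CN well above 30 usually signals trouble
```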

4.
Emergency Departments (EDs) can better manage activities and resources and anticipate overcrowding through accurate estimations of waiting times. However, the complex nature of EDs makes waiting time prediction challenging. In this paper, we use predictive analytics to test various machine learning techniques on two large datasets from real EDs. We evaluate the predictive ability of Lasso, Random Forest, Support Vector Regression, Artificial Neural Networks, and an Ensemble Method, using different error metrics and computational times. To improve prediction accuracy, new queue-based variables that capture the current state of the ED are defined as additional predictors. The results show that the Ensemble Method is the most effective at predicting waiting times, while Random Forest offers a reasonable trade-off between accuracy and computational efficiency. The results have significant practical implications for EDs and hospitals, suggesting that a real-time performance monitoring system that supports operational decision-making is possible.
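A minimal sketch, not the paper's pipeline: a Random Forest regressor trained on synthetic ED-style data, with a hypothetical queue-based predictor (queue_length) among the features; all variable names and the data-generating process are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
n = 5000
queue_length = rng.poisson(12, size=n)                 # hypothetical queue-based feature
staff_on_duty = rng.integers(3, 10, size=n)            # hypothetical resource feature
hour_of_day = rng.integers(0, 24, size=n)
wait_minutes = 15 * queue_length / staff_on_duty + rng.gamma(2.0, 5.0, size=n)

X = np.column_stack([queue_length, staff_on_duty, hour_of_day])
X_train, X_test, y_train, y_test = train_test_split(X, wait_minutes, random_state=0)

rf = RandomForestRegressor(n_estimators=300, random_state=0)
rf.fit(X_train, y_train)
print("MAE (minutes):", mean_absolute_error(y_test, rf.predict(X_test)))
```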

5.
The autoregressive model is a widely used econometric model, but in practice it often gives rise to multicollinearity, causing large changes in the coefficients of the exogenous explanatory variables, rendering some variables insignificant and depriving the model of its economic meaning. This paper first analyses how these phenomena arise, then illustrates them with a typical example, and finally offers suggestions for improving the model.

6.
Multicollinearity is one of the most important issues in regression analysis, as it produces unstable coefficient estimates and severely inflates the standard errors. Regression theory is based on specific assumptions concerning the error random variables. In particular, when the errors are uncorrelated and have constant variance, the ordinary least squares estimator produces the best estimates among all linear estimators. If, as often happens in practice, these assumptions are not met, other methods may give more efficient estimates and their use is therefore recommended. In this paper, after reviewing and briefly describing the salient features of the methods proposed in the literature to detect and address the multicollinearity problem, we introduce the Lpmin method, based on Lp-norm estimation, an adaptive robust procedure used when the residual distribution deviates from normality. The major advantage of this approach is that it produces more efficient estimates of the model parameters, for different degrees of multicollinearity, than those generated by the ordinary least squares method. A simulation study and a real-data application are also presented in order to show the better results provided by the Lpmin method in the presence of multicollinearity.
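The Lpmin procedure itself is not reproduced here; the sketch below only shows the generic Lp-norm estimation idea it builds on, i.e. minimising the sum of absolute residuals raised to a power p, on synthetic data with correlated regressors and heavy-tailed errors (the choice p = 1.3 is an assumption).

```python
import numpy as np
from scipy.optimize import minimize

def lp_norm_fit(X, y, p=1.5):
    """Estimate beta by minimizing the sum of |residuals|**p (Lp-norm regression)."""
    beta0 = np.linalg.lstsq(X, y, rcond=None)[0]      # OLS starting values
    obj = lambda b: np.sum(np.abs(y - X @ b) ** p)
    return minimize(obj, beta0, method="Nelder-Mead").x

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + rng.normal(scale=0.1, size=n)        # correlated regressors
X = np.column_stack([np.ones(n), x1, x2])
y = 1 + 2 * x1 - x2 + rng.standard_t(df=3, size=n)    # heavy-tailed errors
print(lp_norm_fit(X, y, p=1.3))
```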

7.
Since the introduction of the Basel II Accord, and given its huge implications for credit risk management, the modeling and prediction of the loss given default (LGD) have become increasingly important tasks. Institutions which use their own LGD estimates can build either simpler or more complex methods. Simpler methods are easier to implement and more interpretable, but more complex methods promise higher prediction accuracies. Using a proprietary data set of 1,184 defaulted corporate leases in Germany, this study explores different parametric, semi-parametric and non-parametric approaches that attempt to predict the LGD. By conducting the analyses for different information sets, we study how the prediction accuracy changes depending on the set of information that is available. Furthermore, we use a variable importance measure to identify the input variables that have the greatest effects on the LGD prediction accuracy for each method. In this regard, we provide new insights on the characteristics of leasing LGDs. We find that (1) more sophisticated methods, especially the random forest, lead to remarkable increases in the prediction accuracy; (2) updating information improves the prediction accuracy considerably; and (3) the outstanding exposure at default, an internal rating, asset types and lessor industries turn out to be important drivers of accurate LGD predictions.

8.
This paper provides a historical overview of financial crises and their origins. The objective is to discuss a few of the modern statistical methods that can be used to evaluate predictors of these rare events. The problem involves the prediction of binary events, and therefore fits modern statistical learning, signal processing theory, and classification methods. The discussion also emphasizes the need for statistics and computational techniques to be supplemented with economics. The success of a forecast in this environment hinges on the economic consequences of the actions taken as a result of the forecast, rather than on typical statistical metrics of prediction accuracy.

9.
Using a broad selection of 53 carbon (EUA)-related, commodity and financial predictors, we provide a comprehensive assessment of the out-of-sample (OOS) predictability of weekly European carbon futures returns. We assess forecast performance using both statistical and economic value metrics over an OOS period spanning January 2013 to May 2018. Two main types of dimension reduction techniques are employed: (i) shrinkage of coefficient estimates and (ii) factor models. We find that: (1) these dimension reduction techniques can beat the benchmark significantly, with positive gains in forecast accuracy, even though very few individual predictors are able to; (2) forecast accuracy is sensitive to the sample period, and only the Group-average models and the Commodity predictors tend to beat the benchmark consistently; the Group-average models improve both prediction accuracy and stability significantly by averaging the predictions of the All-predictors model and the benchmark. Further, to demonstrate the usefulness of the forecasts to the end-user, we estimate the certainty equivalent gains (economic value) generated. Almost all dimension reduction techniques do well, especially those that apply shrinkage alone. We find that the All-predictors and Group-average variable sets achieve the highest economic gains and portfolio performance. Our main results are robust to alternative specifications.

10.
Correlation is an important statistical issue for ordinary least squares estimates and for data-reduction techniques such as factor analysis and principal components analysis. In this paper we propose new indicators for the multicollinearity problem in the multiple linear regression model.

11.
In practice, we are often faced with the difficult problem of multicollinearity in our fitted regression model. Multicollinearity arises when there are approximate linear relationships between two or more independent variables. It may cause serious problems in the validation, interpretation, and analysis of the model, such as unstable estimates, unreasonable signs, high standard errors, and so on. Although there are methods to solve or avoid this problem, in this paper we propose another alternative from a practical point of view, called the nested estimation procedure. The first half of the paper explains the concept and process of this procedure, and the second half provides two examples to illustrate its suitability and reliability.
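Alongside the nested estimation procedure, a common way to detect the approximate linear relationships mentioned above is the variance inflation factor; the sketch below computes VIFs directly from the definition (the synthetic data are an assumption).

```python
import numpy as np

def vif(X):
    """Variance inflation factors: VIF_j = 1 / (1 - R^2_j), where R^2_j comes
    from regressing column j on the remaining columns (with an intercept)."""
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        y = X[:, j]
        Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        resid = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
        r2 = 1 - resid.var() / y.var()
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(2)
x1 = rng.normal(size=150)
x2 = x1 + rng.normal(scale=0.05, size=150)   # approximate linear relationship
x3 = rng.normal(size=150)
print(vif(np.column_stack([x1, x2, x3])))    # VIF > 10 is a common warning level
```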

12.
Remote sensing image classification methods and research progress (Cited by: 1; self-citations: 1; cited by others: 0)
李灏 《价值工程》 2011, 30(18): 140-140
Modern remote sensing technology is developing rapidly. Using new classification methods together with traditional computer-based classification methods not only makes it easier to identify ground objects and improves the classification accuracy of remote sensing images, but also effectively helps avoid mistakes and reduces errors and omissions.

13.
The best guesses of unknown coefficients specified in Theil's model of introspection are like predictions, not like de Finetti's previsions, and are therefore not the values taken by random variables. Constrained least squares procedures can be formulated that are free of these difficulties. The ridge estimator is a simple version of a constrained least squares estimator that can be made operational even when little prior information is available. Our operational ridge estimators are nearly minimax and are no less stable than least squares in the presence of high multicollinearity. Finally, we present the ridge estimates for the Rotterdam demand model.
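For reference, the textbook form of the ridge estimator (not necessarily the exact operational variant used in the paper) is, for a shrinkage constant k ≥ 0 with k = 0 recovering least squares:

```latex
\hat{\beta}_{\mathrm{ridge}}(k) \;=\; \left(X^{\top}X + kI\right)^{-1}X^{\top}y
```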

14.
High dimensional factor models can involve thousands of parameters. The Jacobian matrix for identification is of a large dimension. It can be difficult and numerically inaccurate to evaluate the rank of such a Jacobian matrix. We reduce the identification problem to a small rank problem, which is easy to check. The identification conditions allow both linear and nonlinear restrictions. Under reasonable assumptions for high dimensional factor models, the small rank conditions are shown to be necessary and sufficient for local identification.

15.
In this paper, the authors evaluate the effectiveness of Statement of Cash Flows measures in the classification and prediction of bankruptcy. The problems of biased estimators and bankruptcy probabilities, and of optimal cut-off rates, are addressed by using a large random sample and conditional marginal probability density functions. It is found that cash flow variables provide statistically better classification and prediction rates when used with traditional accounting variables and bankruptcy is defined as a Chapter 11 filing.

16.
The goal of our paper is to improve the accuracy of stock return forecasts by combining new technical indicators with a new two-step economic-constraint forecasting model. Empirical results indicate that the stock return forecasts generated by the new technical indicators and the new economic-constraint forecasting model are statistically and economically significant in both in-sample and out-of-sample prediction performance. In addition, the prediction performance of the new technical indicators and the new economic-constraint forecasting model is robust to a range of extensions and robustness checks.

17.
Civil unrest can range from peaceful protest to violent furor, and researchers are working to monitor, forecast, and assess such events to allocate resources better. Twitter has become a real-time data source for forecasting civil unrest because millions of people use the platform as a social outlet. Daily word counts are used as model features, and predictive terms contextualize the reasons for the protest. To forecast civil unrest and infer the reasons for the protest, we consider the problem of Bayesian variable selection for the dynamic logistic regression model and propose using penalized credible regions to select parameters of the updated state vector. This method avoids the need for shrinkage priors, is scalable to high-dimensional dynamic data, and allows the importance of variables to vary in time as new information becomes available. A substantial improvement in both precision and F1-score using this approach is demonstrated through simulation. Finally, we apply the proposed model fitting and variable selection methodology to the problem of forecasting civil unrest in Latin America. Our dynamic logistic regression approach shows improved accuracy compared to the static approach currently used in event prediction and feature selection.

18.
An AdaBoost-based model for predicting customer churn in telecommunications (Cited by: 1; self-citations: 0; cited by others: 1)
王纯麟 何建敏 《价值工程》 2007, 26(2): 106-109
As reform of the telecommunications industry deepens and competition intensifies, the customer churn rates of the major telecom operators have been climbing steadily. Building on an in-depth analysis of the customer churn problem in the telecom industry, and addressing the limitation that most existing studies rely on single-classifier models, this paper proposes a telecom customer churn prediction model based on an ensemble of classifiers. Empirical results show that the model effectively improves prediction accuracy and offers a new direction for future research.
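A minimal sketch, not the paper's exact ensemble: an AdaBoost classifier fitted to a synthetic, class-imbalanced stand-in for churn data and evaluated on a hold-out split; the feature set and hyperparameters are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic stand-in for customer features (call minutes, tenure, complaints, ...),
# with roughly 20% churners to mimic class imbalance.
X, y = make_classification(n_samples=2000, n_features=10, weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

model = AdaBoostClassifier(n_estimators=200, learning_rate=0.5, random_state=0)
model.fit(X_train, y_train)
print("hold-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
```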

19.
An important statistical application is the problem of determining an appropriate set of input variables for modelling a response variable. In such an application, candidate models are characterized by which input variables are included in the mean structure. A reasonable approach to gauging the propriety of a candidate model is to define a discrepancy function through the prediction error associated with this model. An optimal set of input variables is then determined by searching for the candidate model that minimizes the prediction error. In this paper, we focus on a Bayesian approach to estimating a discrepancy function based on prediction error in linear regression. It is shown how this approach provides an informative method for quantifying model selection uncertainty.

20.
Because compositional data are subject to a constant-sum constraint and to multicollinearity, traditional logistic regression modelling runs into difficulties. Building on the symmetric logratio transformation and partial least squares (PLS) logistic regression, this paper proposes a PLS logistic regression model for compositional data that largely resolves the problem of logistic regression modelling with such data. Applying this method, the paper studies the relationship between the employment structure of China's three industrial sectors and people's living standards. The results show that the model is well suited to logistic regression modelling of structural problems.
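Assuming the symmetric logratio transformation referred to above is the centred logratio (clr), a common choice for compositional data, a minimal sketch of that preprocessing step is:

```python
import numpy as np

def clr(compositions):
    """Centred logratio transform: clr(x)_j = log(x_j) - mean_j(log(x_j)), row by row.
    The transformed rows sum to zero, removing the constant-sum constraint before a
    downstream regression (e.g. PLS logistic regression) is fitted."""
    logx = np.log(compositions)
    return logx - logx.mean(axis=1, keepdims=True)

# Illustrative employment shares of three sectors (each row sums to 1).
shares = np.array([[0.50, 0.23, 0.27],
                   [0.36, 0.29, 0.35],
                   [0.25, 0.30, 0.45]])
print(clr(shares))
```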

