Similar literature
1.
A common strategy within the framework of regression models is the selection of variables with possible predictive value, which are incorporated in the regression model. Two recently proposed methods, Breiman's Garrote (Breiman, 1995) and Tibshirani's Lasso (Tibshirani, 1996), try to combine variable selection and shrinkage. We compare these with pure variable selection and shrinkage procedures. We consider the backward elimination procedure as a typical variable selection procedure and, as an example of a shrinkage procedure, an approach of van Houwelingen and le Cessie (1990). Additionally, an extension of van Houwelingen and le Cessie's approach proposed by Sauerbrei (1999) is considered. The ordinary least squares method is used as a reference.
With the help of a simulation study we compare these approaches with respect to the distribution of the complexity of the selected model, the distribution of the shrinkage factors, selection bias, the bias and variance of the effect estimates and the average prediction error.
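
The way the Lasso and the Garrote combine selection and shrinkage can be sketched for the special case of an orthonormal design, where both reduce to simple rules applied to each OLS coefficient. This is a minimal illustration under that assumption, not the simulation set-up of the paper; the function names, the OLS values and the tuning constant gamma are hypothetical, and gamma is not on the same scale for the two methods.

```python
def soft_threshold(beta_ols, gamma):
    """Lasso estimate of one coefficient under an orthonormal design:
    shrink the OLS estimate towards zero by gamma, and set it to zero
    if its absolute value is below gamma."""
    magnitude = max(abs(beta_ols) - gamma, 0.0)
    return magnitude if beta_ols >= 0 else -magnitude


def garrote(beta_ols, gamma):
    """Nonnegative garrote estimate under an orthonormal design:
    multiply the OLS estimate by the factor (1 - gamma / beta_ols^2)+."""
    if beta_ols == 0:
        return 0.0
    factor = max(1.0 - gamma / beta_ols ** 2, 0.0)
    return factor * beta_ols


ols = [3.0, -1.5, 0.4]   # hypothetical OLS estimates
gamma = 1.0              # hypothetical tuning constant

lasso_est = [soft_threshold(b, gamma) for b in ols]
garrote_est = [garrote(b, gamma) for b in ols]
print(lasso_est)
print(garrote_est)
```

In this sketch both rules set the small coefficient to zero (selection) and pull the others towards zero (shrinkage); the garrote shrinks large coefficients proportionally less than the Lasso does.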

2.
We consider the estimation and hypothesis testing problems for partial linear regression models when some variables are distorted with errors by unknown functions of a commonly observable confounding variable. The proposed estimation procedure is designed to accommodate undistorted as well as distorted variables. To test a hypothesis on the parametric components, a restricted least squares estimator is proposed under the null hypothesis. Asymptotic properties for the estimators are established. A test statistic based on the difference between the residual sums of squares under the null and alternative hypotheses is proposed, and we also obtain the asymptotic properties of the test statistic. A wild bootstrap procedure is proposed to calculate critical values. Simulation studies are conducted to demonstrate the performance of the proposed procedure, and a real example is analyzed for illustration.

3.
Many popular methods of model selection involve minimizing a penalized function of the data (such as the maximized log-likelihood or the residual sum of squares) over a set of models. The penalty in the criterion function is controlled by a penalty multiplier λ which determines the properties of the procedure. In this paper, we first review model selection criteria of the simple form “Loss + Penalty” and then propose studying such model selection criteria as functions of the penalty multiplier. This approach can be interpreted as exploring the stability of model selection criteria through what we call model selection curves. It leads to new insights into model selection and new proposals on how to select models. We use the bootstrap to enhance the basic model selection curve and develop convenient numerical and graphical summaries of the results. The methodology is illustrated on two data sets and supported by a small simulation. We show that the new methodology can outperform methods such as AIC and BIC which correspond to single points on a model selection curve.
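
The idea of treating a “Loss + Penalty” criterion as a function of λ can be sketched as follows. This is only an illustration, assuming the loss is minus twice the maximized log-likelihood and the penalty is λ times the number of parameters, so that λ = 2 corresponds to AIC and λ = log n to BIC; the candidate models, their losses and the sample size are made up, and the bootstrap enhancement described in the abstract is not attempted.

```python
import math

# Hypothetical candidates: (name, loss, number of parameters), where the
# loss plays the role of -2 * maximized log-likelihood.
candidates = [
    ("M1", 120.0, 1),
    ("M2", 110.0, 2),
    ("M3", 107.0, 3),
    ("M4", 106.5, 4),
]


def select(lam):
    """Return the name of the model minimizing loss + lam * k."""
    return min(candidates, key=lambda m: m[1] + lam * m[2])[0]


def selection_curve(lams):
    """Trace the selected model over a grid of penalty multipliers."""
    return [(lam, select(lam)) for lam in lams]


n = 50  # hypothetical sample size, used only for the BIC point
for lam, name in selection_curve([0.0, 1.0, 2.0, math.log(n), 8.0, 12.0]):
    print(f"lambda = {lam:6.3f} -> {name}")
```

Reading the curve rather than a single point shows how stable a choice is: a model selected over a wide range of λ is less sensitive to the particular criterion than one selected only near a single value.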

4.
This paper presents recursion formulae for the two-stage least-squares estimators of the structural coefficients in a simultaneous equation model and for the residual sum of squares used in estimating the asymptotic covariance matrix. Included are formulae for updating estimates when a new set of observations is obtained and for revising estimates when a set of observations is discarded. The recursion formulae should prove to be of both practical and theoretical interest to econometricians.

5.
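Huang Jing (黄靖). 《价值工程》 (Value Engineering), 2014, (24): 67-69
Using an SWC-150 Fredlund soil-water characteristic curve pressure apparatus, soil-water characteristic tests were carried out on gravelly soil at different dry densities to examine how the soil-water characteristic curve (SWCC) changes with dry density. Four different fitting equations were fitted to the measured soil-water characteristic curves by the least squares method, yielding the fitted parameters and the residual sums of squares. Comparing the residual sums of squares and the curve shapes, the Fredlund & Xing four-parameter equation gives the best fit.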

6.
A Caution Regarding Rules of Thumb for Variance Inflation Factors   (cited 22 times in total: 0 self-citations, 22 citations by others)
The Variance Inflation Factor (VIF) and tolerance are both widely used measures of the degree of multi-collinearity of the ith independent variable with the other independent variables in a regression model. Unfortunately, several rules of thumb – most commonly the rule of 10 – associated with VIF are regarded by many practitioners as a sign of severe or serious multi-collinearity (this rule appears in both scholarly articles and advanced statistical textbooks). When VIF reaches these threshold values researchers often attempt to reduce the collinearity by eliminating one or more variables from their analysis; using Ridge Regression to analyze their data; or combining two or more independent variables into a single index. These techniques for curing problems associated with multi-collinearity can create problems more serious than those they solve. Because of this, we examine these rules of thumb and find that threshold values of the VIF (and tolerance) need to be evaluated in the context of several other factors that influence the variance of regression coefficients. Values of the VIF of 10, 20, 40, or even higher do not, by themselves, discount the results of regression analyses, call for the elimination of one or more independent variables from the analysis, suggest the use of ridge regression, or require the combining of independent variables into a single index.
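
For reference, the VIF of the ith predictor is 1 / (1 - R_i^2), where R_i^2 is the R-squared from regressing that predictor on the others, and the tolerance is its reciprocal, 1 - R_i^2. A minimal sketch for the two-predictor case, where R_i^2 is simply the squared correlation between the two predictors; the data and function names are hypothetical.

```python
def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def vif_from_r2(r2):
    """Variance inflation factor for a predictor whose regression on the
    other predictors has R-squared r2."""
    return 1.0 / (1.0 - r2)


# Two hypothetical, strongly correlated predictors.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.8, 5.0]

r2 = pearson_r(x1, x2) ** 2
vif = vif_from_r2(r2)
tolerance = 1.0 / vif
print(f"R^2 = {r2:.4f}, VIF = {vif:.1f}, tolerance = {tolerance:.4f}")
```

The rule of 10 therefore corresponds to R_i^2 = 0.9. As the abstract argues, the variance of a coefficient also depends on other factors, such as the sample size and the error variance, so a VIF above the threshold is not by itself decisive.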

7.
The ‘Tobit’ model is a useful tool for estimation of regression models with truncated or limited dependent variables, but it requires a threshold which is either a known constant or an observable and independent variable. The model presented here extends the Tobit model to the censored case where the threshold is an unobserved and not necessarily independent random variable. Maximum likelihood procedures can be employed for joint estimation of both the primary regression equation and the parameters of the distribution of that random threshold.

8.
There are many environments where knowledge of a structural relationship is required to answer questions of interest. Also, nonseparability of a structural disturbance is a key feature of many models. Here, we consider nonparametric identification and estimation of a model that is monotonic in a nonseparable scalar disturbance, which disturbance is independent of instruments. This model leads to conditional quantile restrictions. We give local identification conditions for the structural equations from those quantile restrictions. We find that a modified completeness condition is sufficient for local identification. We also consider estimation via a nonparametric minimum distance estimator. The estimator minimizes the sum of squares of predicted values from a nonparametric regression of the quantile residual on the instruments. We show consistency of this estimator.

9.
An exposition of the missing plot technique often applied in analysis of variance is given in very general terms.
Non-orthogonality mostly implies heavy computations. If the scheme of observations is almost orthogonal, however, this technique supplies in a simple way unbiased and efficient estimates of the expectation values which occur in a linear hypothesis underlying an analysis of variance. Moreover, the correct residual sum of squares required for a test or a confidence interval estimation is obtained without difficulty.
A correct test of an effect or an interaction will be provided by two estimates, the first under the null-hypothesis, the second under the alternative hypothesis. In the case of non-orthogonality this may imply two separate applications of the discussed technique. The difference between the two residual sums of squares will be used for the numerator of a valid F-criterion.
The technique is illustrated by an example.
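
The F-criterion built from the two residual sums of squares can be sketched as below. This is the standard F statistic for nested linear hypotheses, not code from the paper; the argument names and the numbers are hypothetical.

```python
def f_statistic(rss_null, rss_alt, q, df_resid):
    """F statistic for a linear hypothesis.

    rss_null : residual sum of squares under the null hypothesis
    rss_alt  : residual sum of squares under the alternative hypothesis
    q        : number of restrictions imposed by the null hypothesis
    df_resid : residual degrees of freedom under the alternative
    """
    numerator = (rss_null - rss_alt) / q
    denominator = rss_alt / df_resid
    return numerator / denominator


# Hypothetical values: the null imposes 2 restrictions and the
# alternative leaves 20 residual degrees of freedom.
f_value = f_statistic(rss_null=130.0, rss_alt=100.0, q=2, df_resid=20)
print(f_value)
```

With missing plots, both residual sums of squares must come from fits to the observations actually available, which is why the abstract notes that two separate applications of the technique may be needed.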

10.
This paper incorporates text data from MLS listings into a hedonic pricing model. We show that the comments section of the MLS, which is populated by real estate agents who arguably have the most local market knowledge and know what homebuyers value, provides information that improves the performance of both in‐sample and out‐of‐sample pricing estimates. Text is found to decrease pricing error by more than 25%. Information from text is incorporated into a linear model using a tokenization approach. By doing so, the implicit prices for various words and phrases are estimated. The estimation focuses on simultaneous variable selection and estimation for linear models in the presence of a large number of variables using a penalized regression. The LASSO procedure and variants are shown to outperform least‐squares in out‐of‐sample testing. Copyright © 2016 John Wiley & Sons, Ltd.

11.
This paper examines limited-dependent rational expectations (LD-RE) models containing future expectations of the dependent variable. Limited dependence is of a two-limit tobit variety which may, for example, arise as a result of a policy of imposing limits on the movement of the dependent variable by means of marginal as well as intramarginal interventions. We show that when the forcing variables are serially independent the model has an analytical solution which can be computed by backward recursion. With serially correlated forcing variables, we discuss an approximate solution method, as well as a numerically exact method that, in principle, can be implemented by stochastic simulation, although in practice it is limited by available computational capacity. The paper discusses some properties of the approximate solutions and reports the results of a limited number of Monte Carlo experiments in order to illustrate the computational feasibility of using the exact solution when the fundamentals are serially independent and the approximate solution when they are serially correlated.

12.
The aim of this paper is to convey to a wider audience of applied statisticians that nonparametric (matching) estimation methods can be a very convenient tool to overcome problems with endogenous control variables. In empirical research one is often interested in the causal effect of a variable X on some outcome variable Y. With observational data, i.e. in the absence of random assignment, the correlation between X and Y generally does not reflect the treatment effect but is confounded by differences in observed and unobserved characteristics. Econometricians often use two different approaches to overcome this problem of confounding by other characteristics: controlling for observed characteristics, often referred to as selection on observables, or instrumental variables regression, usually with additional control variables. Instrumental variables estimation is probably the most important estimator in applied work. In many applications, these control variables are themselves correlated with the error term, making ordinary least squares and two-stage least squares inconsistent. The usual solution is to search for additional instrumental variables for these endogenous control variables, which is often difficult. We argue that nonparametric methods help to reduce the number of instruments needed. In fact, we need only one instrument whereas with conventional approaches one may need two, three or even more instruments for consistency. Nonparametric matching estimators permit consistent estimation without the need for (additional) instrumental variables and permit arbitrary functional forms and treatment effect heterogeneity.

13.
Dr. Klaus Abt. Metrika, 1967, 12(1): 1-15
Summary Methods for the identification of the significant independent variables in multiple linear regression and in the multiple regression approach to non-orthogonal analysis of variance and covariance are discussed. “Forward Ranking” and “Backward Ranking” (by order of importance) of the independent variables are defined, and the backward method is shown to avoid the disadvantageous effects of “Compounds” upon the ranking. For non-orthogonal analysis of variance, a unique orthogonal decomposition of the regression sum of squares (due to all ANOVA effects) is shown to be possible when the groups of independent variables (representing the effects) are ranked by the criterion of “Non-Significance” and under “Restricted Admissibility.” A computer program is outlined which incorporates the proposed methods.
Zusammenfassung Methoden für die Identifizierung der signifikanten unabhängigen Veränderlichen in der mehrfachen linearen Regressionsrechnung und im Regressionsverfahren für nichtorthogonale Varianz- und Kovarianzanalyse werden besprochen. „Vorwärtsgerichtetes“ und „rückwärtsgerichtetes“ Rangordnen (nach Bedeutung) der unabhängigen Veränderlichen werden definiert, und es wird gezeigt, daß beim rückwärtsgerichteten Rangordnen die nachteiligen Wirkungen von „Verbänden“ auf das Ordnen vermieden werden. Für den Fall der nichtorthogonalen Varianzanalyse wird gezeigt, daß eine eindeutige orthogonale Zerlegung der Quadratsumme für die Regression (erklärt durch die Gesamtheit der Haupt- und Wechselwirkungen in der Varianzanalyse) erreicht werden kann, wenn die Gruppen der unabhängigen Veränderlichen, die die Haupt- und Wechselwirkungen repräsentieren, nach dem Rangordnungskriterium „Nicht-Signifikanz“ und unter „Beschränkter Zulässigkeit“ geordnet werden. Ein Rechenprogramm wird erläutert, welches auf den vorgeschlagenen Methoden basiert.

14.
Despite their high predictive performance, random forest and gradient boosting are often considered as black boxes which has raised concerns from practitioners and regulators. As an alternative, we suggest using partial linear models that are inherently interpretable. Specifically, we propose to combine parametric and non-parametric functions to accurately capture linearities and non-linearities prevailing between dependent and explanatory variables, and a variable selection procedure to control for overfitting issues. Estimation relies on a two-step procedure building upon the double residual method. We illustrate the predictive performance and interpretability of our approach on a regression problem.

15.
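This paper questions whether the predetermined variables of a simultaneous equations model qualify as instrumental variables: predetermined variables are not guaranteed to be correlated with the current-period behavioural explanatory variables, so the staged least squares estimation carried out with working regressors built from them is not a true two-stage least squares estimation. The paper recommends building the working regressors from the reduced-form equations, which makes them substitutes supported by the mathematical logic of the model. Different working regressors lead to different staged least squares estimates of the structural equations, and for exactly identified equations this exposes a misconception in the field about the indirect least squares method.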

16.
The sample mean is one of the most natural estimators of the population mean based on an independent, identically distributed sample. However, if a control variate is available, the control variate method is known to reduce the variance of the sample mean. The control variate method often assumes that the variable of interest and the control variable are i.i.d. Here we assume that these variables are stationary processes with spectral density matrices, i.e. dependent. We then propose an estimator of the mean of the stationary process of interest by using the control variate method based on a nonparametric spectral estimator. It is shown that this estimator improves on the sample mean in the sense of mean square error. This analysis is also extended to the case where the mean dynamics take the form of a regression. We then propose a control variate estimator for the regression coefficients which improves on the least squares estimator (LSE). Numerical studies are given to show how our estimator improves on the LSE.
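
As background, the classical control variate estimator for the i.i.d. case can be sketched as follows. Note that this is the textbook version with a coefficient estimated from sample moments, not the spectral-density-based estimator for stationary processes proposed in the abstract; the function name and the data are hypothetical, and the mean of the control variable is assumed to be known.

```python
def control_variate_mean(y, x, mu_x):
    """Estimate E[Y] as mean(y) - b * (mean(x) - mu_x), where x is a
    control variable with known mean mu_x and b = cov(x, y) / var(x)
    is estimated from the sample."""
    n = len(y)
    mean_y = sum(y) / n
    mean_x = sum(x) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b_hat = sxy / sxx
    return mean_y - b_hat * (mean_x - mu_x)


# Hypothetical sample in which the control variable happens to come out
# below its known mean of 3.0, so the plain sample mean of y is corrected
# upwards.
y = [2.1, 3.9, 6.2, 7.8]
x = [1.0, 2.0, 3.0, 4.0]
estimate = control_variate_mean(y, x, mu_x=3.0)
print(estimate)
```

The correction is largest when the control variable is strongly correlated with the variable of interest, which is what drives the variance reduction.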

17.
Financial market participants are interested in knowing what events can alter the volatility pattern of financial assets and how unanticipated shocks determine the persistence of volatility over time. The present paper studies these issues by detecting time periods of sudden changes in volatility by using the iterated cumulated sums of squares (ICSS) algorithm. Examining five major sectors from January 1992 to August 2003, we found that accounting for volatility shifts in the standard GARCH model considerably reduces the estimated volatility persistence. Our results have important implications regarding asset pricing, risk management, and portfolio selection. (JEL G110, G120)
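
The building block of the ICSS algorithm is the centered cumulative sum of squares D_k = C_k / C_T - k / T, where C_k is the sum of the first k squared observations of a mean-zero series of length T. The sketch below computes D_k and the single most likely break point; it covers only this one step, not the full iterative algorithm used in the paper, and the series and function names are hypothetical.

```python
def centered_cusum_of_squares(a):
    """Return D_k = C_k / C_T - k / T for k = 1..T, where C_k is the
    cumulative sum of squares of the series a."""
    T = len(a)
    total = sum(v * v for v in a)
    d, running = [], 0.0
    for k, v in enumerate(a, start=1):
        running += v * v
        d.append(running / total - k / T)
    return d


def candidate_break(a):
    """Return (k_star, stat): the 1-based position maximizing |D_k| and
    the scaled statistic sqrt(T / 2) * |D_k_star|."""
    d = centered_cusum_of_squares(a)
    k_star = max(range(len(d)), key=lambda i: abs(d[i])) + 1
    stat = (len(a) / 2) ** 0.5 * abs(d[k_star - 1])
    return k_star, stat


# Hypothetical series whose variance jumps after the 8th observation.
series = [1.0, -1.0] * 4 + [3.0, -3.0] * 4
k_star, stat = candidate_break(series)
print(k_star, round(stat, 3))
```

Under constant variance D_k fluctuates around zero, so a large scaled statistic points to a variance shift at k_star. In practice the statistic is compared with a critical value, and the procedure is repeated on sub-periods to locate multiple shifts.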

18.
In Davidson and MacKinnon (1981), two of the present authors proposed a novel and very simple procedure for testing the specification of a nonlinear regression model against the evidence provided by a non-nested alternative. In this paper we extend their results in several directions. First, we relax a number of the assumptions of the previous paper; we admit the possibility that the nonlinear regression functions may depend on lagged dependent variables, and we do not require that the error terms be normally distributed. Second, we show how the earlier procedure may straightforwardly be generalized to the case where the two non-nested models involve different transformations of the dependent variable. Finally, we propose a simple procedure for testing non-nested linear regression models which have endogenous variables on the right-hand side, and have therefore been estimated by two-stage least squares.

19.
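Taking Thailand's pineapple trade as an example, this paper builds a partial least squares (PLS) regression model: the explanatory variables are screened using the variable importance in projection criterion, and the principal components are extracted by cross-validation. The influence of each indicator on Thailand's pineapple export trade is then analysed in depth. The study shows that Thailand's pineapple exports are closely related to raw material prices and to the speed of factory production and processing, and that partial least squares regression fits better than ordinary least squares regression.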

20.
In a multiple linear regression model with one mismeasured independent variable, all coefficients are asymptotically biased. It is shown how in OLS, an examination of the sign of the cofactors of the variance-covariance matrix of measured values can be used to obtain large sample bounds on the coefficients. The method involves forward regression and regression on the mismeasured variable. Bounds are generally obtained on the coefficient of the mismeasured variable and often obtained on the remaining coefficients with no knowledge of the size of the measurement error.
