Similar Documents
20 similar documents found.
1.
In this article, we analyze the omitted variable bias problem in the multinomial logistic probability model. Sufficient, as well as necessary, conditions under which the omitted variable will not create asymptotically biased coefficient estimates for the included variables are derived. If the omitted explanatory variable and the included explanatory variable are independent conditional on the response variable, the bias will not occur. Bias will occur, however, even if the omitted relevant variable is (unconditionally) independent of the included explanatory variable. The coefficient of the included variable plays an important role in the direction of the bias.
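A minimal simulation sketch of this phenomenon (our own illustration, not the article's derivation), using a binary logit as the simplest special case of the multinomial model: omitting a regressor that is independent of the included one still attenuates the included coefficient. All names and parameter values below are assumptions.

```python
# Sketch: omitted-variable attenuation in a logit model (illustrative only).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200_000
x = rng.normal(size=n)            # included explanatory variable
z = rng.normal(size=n)            # omitted variable, independent of x
eta = 1.0 * x + 1.0 * z           # true linear predictor
p = 1.0 / (1.0 + np.exp(-eta))
y = rng.binomial(1, p)

full = sm.Logit(y, sm.add_constant(np.column_stack([x, z]))).fit(disp=False)
omit = sm.Logit(y, sm.add_constant(x)).fit(disp=False)
print("coefficient on x, z included:", full.params[1])   # close to 1.0
print("coefficient on x, z omitted:", omit.params[1])    # noticeably below 1.0
```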

2.
A multi-factor model includes economic, apprehension, seasonal, and plant-closing variables as the explanatory regressors and crimes against property as the dependent variable. Different lag structures were used on the explanatory variables, such as an Almon distributed lag of a second-degree polynomial nature, and the dependent variable was lagged by one quarter so that the model would more closely approximate the environment being considered. The results suggest a definite seasonal pattern in crimes against property, and the economic variables measuring local, not national, conditions appear to be more significant regressors than any other explanatory variables.

3.
We study the effect, upon linear regression, of explicit selection on the dependent variable. If the explanatory variables are multinormally distributed along with the dependent variable, then the regression coefficient vector in the selected population is a scalar multiple of that in the original population.
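A small simulation consistent with this result (our own sketch under assumed parameter values, not from the paper): truncating the sample on the dependent variable rescales the whole slope vector, so ratios of coefficients are preserved.

```python
# Sketch: selection on y under joint normality rescales all slopes equally.
import numpy as np

rng = np.random.default_rng(1)
n = 500_000
x = rng.multivariate_normal([0, 0], [[1.0, 0.5], [0.5, 1.0]], size=n)
y = 2.0 * x[:, 0] + 1.0 * x[:, 1] + rng.normal(size=n)

def slopes(xm, yv):
    X = np.column_stack([np.ones(len(yv)), xm])
    return np.linalg.lstsq(X, yv, rcond=None)[0][1:]

full = slopes(x, y)
keep = y > 0.5                      # explicit selection on the dependent variable
sel = slopes(x[keep], y[keep])
print(full, sel, sel / full)        # the two elementwise ratios are (nearly) equal
```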

4.
Ratio-type financial indicators are the most popular explanatory variables in bankruptcy prediction models. These measures often exhibit heavily skewed distributions because of the presence of outliers. In the absence of a clear definition of outliers, ad hoc approaches can be found in the literature for identifying and handling extreme values. However, it is not clear how these different approaches affect the predictive power of models. There seems to be consensus in the literature on the necessity of handling outliers; at the same time, it is not clear how to define the extreme values to be handled in order to maximize the predictive power of models. There are two possible ways to reduce the bias originating from outliers: omission and winsorization. Since the first approach has been examined previously in the literature, we turn our attention to the latter. We applied the most popular classification methodologies in this field: discriminant analysis, logistic regression, decision trees (CHAID and CART) and neural networks (multilayer perceptron). We assessed the predictive power of models in the framework of tenfold stratified cross-validation and the area under the ROC curve. We analyzed the effect of winsorization at 1, 3 and 5% and at 2 and 3 standard deviations; furthermore, we discretized the range of each variable by the CHAID method and used the ordinal measures so obtained instead of the original financial ratios. We found that this latter data preprocessing approach is the most effective in the case of our dataset. In order to check the robustness of our results, we carried out the same empirical research on the publicly available Polish bankruptcy dataset from the UCI Machine Learning Repository. We obtained very similar results on both datasets, which indicates that the CHAID-based categorization of financial ratios is an effective way of handling outliers with respect to the predictive performance of bankruptcy prediction models.
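A minimal sketch of percentile-based winsorization as described above (our own illustration; the function, variable names, and the skewed example distribution are assumptions, not the authors' code or data):

```python
# Sketch: clip extreme values of a skewed financial ratio before modelling.
import numpy as np

def winsorize(x, pct=0.01):
    """Clip values below the pct and above the (1 - pct) quantiles."""
    lo, hi = np.quantile(x, [pct, 1.0 - pct])
    return np.clip(x, lo, hi)

rng = np.random.default_rng(2)
ratio = rng.lognormal(mean=0.0, sigma=1.5, size=1_000)   # heavily skewed ratio
ratio_w1 = winsorize(ratio, 0.01)                        # 1% winsorization
ratio_w5 = winsorize(ratio, 0.05)                        # 5% winsorization
print(ratio.max(), ratio_w1.max(), ratio_w5.max())       # tails shrink as pct grows
```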

5.
Regression analyses of cross-country economic growth data are complicated by two main forms of model uncertainty: the uncertainty in selecting explanatory variables and the uncertainty in specifying the functional form of the regression function. Most discussions in the literature address these problems independently, yet a joint treatment is essential. We present a new framework that makes such a joint treatment possible, using flexible nonlinear models specified by Gaussian process priors and addressing the variable selection problem by means of Bayesian model averaging. Using this framework, we extend the linear model to allow for parameter heterogeneity of the type suggested by new growth theory, while taking into account the uncertainty in selecting explanatory variables. Controlling for variable selection uncertainty, we confirm the evidence in favor of parameter heterogeneity presented in several earlier studies. However, controlling for functional form uncertainty, we find that the effects of many of the explanatory variables identified in the literature are not robust across countries and variable selections.

6.
In this paper we suggest a methodology to formulate a dynamic regression with variables observed at different time intervals. This methodology is applicable if the explanatory variables are observed more frequently than the dependent variable. We demonstrate this procedure by developing a forecasting model for Singapore's quarterly GDP based on monthly external trade. Apart from forecasts, the model provides a monthly distributed lag structure between GDP and external trade, which is not possible with quarterly data.

7.
We consider the following problem. There is a structural equation of interest that contains an explanatory variable that theory predicts is endogenous. There are one or more instrumental variables that are credibly exogenous with respect to this structural equation, but which have limited explanatory power for the endogenous variable. Further, there are one or more potentially ‘strong’ instruments, which have much more explanatory power but which may not be exogenous. Hausman (1978) provided a test for the exogeneity of the second instrument when none of the instruments are weak. Here, we focus on how the standard Hausman test performs in the presence of weak instruments using the Staiger–Stock asymptotics. It is natural to conjecture that the standard version of the Hausman test would be invalid in the weak-instrument case, which we confirm. However, we provide a version of the Hausman test that is valid even in the presence of weak IV and illustrate how to implement the test in the presence of heteroskedasticity. We show that the situation we analyze occurs in several important economic examples. Our Monte Carlo experiments show that our procedure works relatively well in finite samples. We should note that our test is not consistent, although we believe that it is impossible to construct a consistent test with weak instruments.
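For orientation, a textbook Hausman-type contrast between a 2SLS estimate using only the credibly exogenous instrument and one that adds the suspect ‘strong’ instrument (a rough sketch assuming homoskedasticity and strong identification; the data-generating process and this simple form of the statistic are our assumptions, not the paper's weak-instrument-robust procedure):

```python
# Sketch: Hausman-type contrast of two 2SLS estimates (illustrative only).
import numpy as np

def tsls(y, x, Z):
    """2SLS slope of y on a single endogenous regressor x, instruments Z."""
    Z = np.column_stack([np.ones(len(y)), Z])
    xhat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]        # first-stage fit
    X = np.column_stack([np.ones(len(y)), xhat])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - np.column_stack([np.ones(len(y)), x]) @ beta
    var = resid.var() * np.linalg.inv(X.T @ X)[1, 1]       # homoskedastic variance
    return beta[1], var

rng = np.random.default_rng(3)
n = 5_000
u = rng.normal(size=n)                          # structural error
z1 = rng.normal(size=n)                         # exogenous instrument (weaker)
z2 = rng.normal(size=n)                         # strong instrument, exogenous here
x = 0.5 * z1 + 1.0 * z2 + u + rng.normal(size=n)
y = 1.0 * x + u

b1, v1 = tsls(y, x, z1)                         # uses only the clean instrument
b2, v2 = tsls(y, x, np.column_stack([z1, z2]))  # adds the suspect instrument
H = (b1 - b2) ** 2 / max(v1 - v2, 1e-12)        # ~ chi-square(1) if z2 is exogenous
print(b1, b2, H)
```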

8.
In this paper three statistics and three discrepancy measures with which homogeneity in the random intercept model can be investigated will be evaluated. The first two can be used to test the homogeneity of level-one residual variances across level-two units, and the third can be used to test whether effects should be fixed or random. Each statistic and discrepancy measure will be evaluated using asymptotic (if available), posterior predictive and plug-in p-values. A simulation study will be used to investigate the frequency properties of these p-values. In the discussion it will be indicated how the results obtained for the random intercept model with one explanatory variable can be useful during the construction of general two-level models.

9.
We consider estimation of panel data models with sample selection when the equation of interest contains endogenous explanatory variables as well as unobserved heterogeneity. Assuming that appropriate instruments are available, we propose several tests for selection bias and two estimation procedures that correct for selection in the presence of endogenous regressors. The tests are based on the fixed effects two-stage least squares estimator, thereby permitting arbitrary correlation between unobserved heterogeneity and explanatory variables. The first correction procedure is parametric and is valid under the assumption that the errors in the selection equation are normally distributed. The second procedure estimates the model parameters semiparametrically using series estimators. In the proposed testing and correction procedures, the error terms may be heterogeneously distributed and serially dependent in both selection and primary equations. Because these methods allow for a rather flexible structure of the error variance and do not impose any nonstandard assumptions on the conditional distributions of explanatory variables, they provide a useful alternative to the existing approaches presented in the literature.

10.
Optimal designs under a survival analysis framework have rarely been considered in the literature. In this paper, an optimal design theory is developed for the typical Cox regression problem. Failure time is modeled according to a probability distribution depending on some explanatory variables through a linear model. At the end of the study, some units will not have failed and thus their time records will be censored. In order to deal with this problem from an experimental design point of view, it will be necessary to assume a probability distribution for the time an experimental unit enters the study. Then an optimal conditional design will be computed at the beginning of the study for any possible given time. Thus, every time a new unit enters the study, there is an experimental design to be determined. A particular and simple case is used throughout the paper in order to illustrate the procedure.

11.
A recent article by Krause (2012; Qual Quant, doi:10.1007/s11135-012-9712-5) maintains that: (1) it is untenable to characterize the error term in multiple regression as simply an extraneous random influence on the outcome variable, because any amount of error implies the possibility of one or more omitted, relevant explanatory variables; and (2) the only way to guarantee the prevention of omitted variable bias, and thereby justify causal interpretations of estimated coefficients, is to construct fully specified models that completely eliminate the error term. The present commentary argues that such an extreme position is impractical and unnecessary, given the availability of specialized techniques for dealing with the primary statistical consequence of omitted variables, namely endogeneity, or the existence of correlations between included explanatory variables and the error term. In particular, the current article discusses the method of instrumental variable estimation, which can resolve the endogeneity problem in causal models where one or more relevant explanatory variables are excluded, thus allowing for accurate estimation of effects. An overview of recent methodological resources and software for conducting instrumental variables estimation is provided, with the aim of helping to place this crucial technique squarely in the statistical toolkit of applied researchers.
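A hedged illustration of the idea (not from the article; all names and values are assumptions): an omitted confounder makes OLS inconsistent, while a simple instrumental-variable (Wald) estimator, cov(z, y) / cov(z, x), recovers the structural slope when z shifts x but has no direct path to y.

```python
# Sketch: endogeneity from an omitted confounder, corrected by a simple IV estimator.
import numpy as np

rng = np.random.default_rng(4)
n = 200_000
c = rng.normal(size=n)                        # omitted confounder
z = rng.normal(size=n)                        # instrument: affects x, not y directly
x = 0.8 * z + 1.0 * c + rng.normal(size=n)
y = 2.0 * x + 1.5 * c + rng.normal(size=n)    # true effect of x is 2.0

beta_ols = np.polyfit(x, y, 1)[0]                     # biased by the confounder
beta_iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]     # consistent IV estimate
print(beta_ols, beta_iv)
```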

12.
Polytomous logistic regression
In this paper a review will be given of some methods available for modelling relationships between categorical response variables and explanatory variables. These methods are all classed under the name polytomous logistic regression (PLR). Models for PLR will be presented and compared; model parameters will be tested and estimated by weighted least squares and by maximum likelihood. Usually, software is needed for computation, and available statistical software is reported.
An industrial problem is solved to some extent as an example to illustrate the use of PLR. The paper is concluded by a discussion of the various PLR methods, and some topics that need further study are mentioned.

13.
This paper considers a linear triangular simultaneous equations model with conditional quantile restrictions. The paper adjusts for endogeneity by adopting a control function approach and presents a simple two-step estimator that exploits the partially linear structure of the model. The first step consists of estimation of the residuals of the reduced-form equation for the endogenous explanatory variable. The second step is series estimation of the primary equation with the reduced-form residual included nonparametrically as an additional explanatory variable. This paper imposes no functional form restrictions on the stochastic relationship between the reduced-form residual and the disturbance term in the primary equation conditional on observable explanatory variables. The paper presents regularity conditions for consistency and asymptotic normality of the two-step estimator. In addition, the paper provides some discussions on related estimation methods in the literature.
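A minimal sketch of the two-step control function idea (our own mean-regression illustration with assumed linear functional forms; the paper's series estimator and conditional quantile setting are not reproduced): the first-stage reduced-form residual enters the second stage as an additional, flexibly specified regressor.

```python
# Sketch: two-step control function estimation (illustrative mean regression).
import numpy as np

rng = np.random.default_rng(5)
n = 100_000
z = rng.normal(size=n)                         # instrument
e = rng.normal(size=n)                         # first-stage error
x = 1.0 * z + e                                # endogenous regressor
y = 2.0 * x + 1.0 * e + rng.normal(size=n)     # structural equation, true slope 2.0

# Step 1: reduced-form regression of x on z, keep the residual v_hat.
Z = np.column_stack([np.ones(n), z])
v_hat = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]

# Step 2: regress y on x and a low-order polynomial in v_hat (the "control").
X = np.column_stack([np.ones(n), x, v_hat, v_hat**2, v_hat**3])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
print(beta[1])    # close to 2.0; plain OLS of y on x would be biased
```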

14.
On the Practice of Lagging Variables to Avoid Simultaneity
A common practice in applied economics research consists of replacing a suspected simultaneously determined explanatory variable with its lagged value. This note demonstrates that this practice does not enable one to avoid simultaneity bias. The associated estimates are still inconsistent, and hypothesis testing is invalid. An alternative is to use lagged values of the endogenous variable in instrumental variable estimation. However, this is only an effective estimation strategy if the lagged values do not themselves belong in the respective estimating equation, and if they are sufficiently correlated with the simultaneously determined explanatory variable.
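A stylized simulation in the spirit of this point (the data-generating process and parameter values are our own assumptions, not the note's model): regressing on the lagged value does not recover the structural slope, whereas using the lag as an instrument for the contemporaneous regressor does, provided the lag is excluded from the equation and is correlated with the current value.

```python
# Sketch: lagging a simultaneously determined regressor vs. using the lag as an instrument.
import numpy as np

rng = np.random.default_rng(6)
T, beta, rho, gamma = 50_000, 2.0, 0.7, 1.0
u = rng.normal(size=T)                            # structural error
v = rng.normal(size=T)
x = np.zeros(T)
for t in range(1, T):
    x[t] = rho * x[t - 1] + gamma * u[t] + v[t]   # x responds to u: simultaneity
y = beta * x + u

x_t, x_lag, y_t = x[1:], x[:-1], y[1:]
slope = lambda a, b: np.polyfit(a, b, 1)[0]       # OLS slope of b on a
print(slope(x_t, y_t))                            # OLS on x_t: biased away from 2.0
print(slope(x_lag, y_t))                          # OLS on the lag: still not 2.0
print(np.cov(x_lag, y_t)[0, 1] / np.cov(x_lag, x_t)[0, 1])   # IV: close to 2.0
```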

15.
Asymptotic theory for nonparametric regression with spatial data
Nonparametric regression with spatial, or spatio-temporal, data is considered. The conditional mean of a dependent variable, given explanatory ones, is a nonparametric function, while the conditional covariance reflects spatial correlation. Conditional heteroscedasticity is also allowed, as well as non-identically distributed observations. Instead of mixing conditions, a (possibly non-stationary) linear process is assumed for disturbances, allowing for long-range, as well as short-range, dependence, while decay in dependence in explanatory variables is described using a measure based on the departure of the joint density from the product of marginal densities. A basic triangular array setting is employed, with the aim of covering various patterns of spatial observation. Sufficient conditions are established for consistency and asymptotic normality of kernel regression estimates. When the cross-sectional dependence is sufficiently mild, the asymptotic variance in the central limit theorem is the same as when observations are independent; otherwise, the rate of convergence is slower. We discuss the application of our conditions to spatial autoregressive models, and models defined on a regular lattice.
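For reference, a minimal Nadaraya-Watson kernel regression estimate on independent data (our own sketch; the paper's spatial dependence structure and triangular-array setting are not reproduced here):

```python
# Sketch: kernel regression estimate of a conditional mean (i.i.d. illustration).
import numpy as np

def nw_estimate(x0, x, y, h):
    """Kernel-weighted local average of y at the point x0 (Gaussian kernel)."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    return np.sum(w * y) / np.sum(w)

rng = np.random.default_rng(7)
x = rng.uniform(-2, 2, size=2_000)
y = np.sin(np.pi * x) + rng.normal(scale=0.3, size=2_000)
grid = np.linspace(-2, 2, 9)
fit = [nw_estimate(g, x, y, h=0.15) for g in grid]
print(np.round(fit, 2))              # roughly tracks sin(pi * x) on the grid
```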

16.
The paper examines the circumstances under which an equation with a composite MA disturbance term can be consistently estimated by single-equation non-linear least squares. The composite disturbance may arise as a result of the dependent or an explanatory variable being unobservable either because it is subject to measurement error or because it is a ‘desired’ or ‘expected’ variable. It may also arise as a result of substituting out an explanatory variable from an equation or because a vector MA process is specified for the structural form of a linear simultaneous equation model. The argument of the paper relies on some results relating to the sum of finite MA processes presented in Nelson (1975a) and Darroch and McDonald (1981).

17.
The search for a measurable link between HR practices and organizational performance is currently preoccupying HR professionals, consultants, government and academics. Empirical research on this human resource management-performance (HRM-P) link is, however, marred by a serious problem: it is under-theorized. While some (but by no means all) researchers on the HRM-P link are aware of the problem, none are prepared to face up to the scale of the implications. Without theory, research on the HRM-P link lacks explanatory power. The only ‘solution’ on offer (the assertion that theory will develop via more and/or better empirical work) has been less than successful: empirical research has multiplied with little or no theoretical development. Nor can it succeed. The under-theorization and lack of explanatory power are rooted in the ‘scientific’ perspective that underpins empirical research. The paper draws upon critical realist philosophy to reveal exactly why the ‘scientific’ approach encourages under-theorization and lack of explanatory power and, furthermore, why the ‘solution’ on offer cannot solve the problem. The conclusion notes why the HR community should not avoid philosophical issues.

18.
Errors of measurement have long been recognized as a chronic problem in statistical analysis. Although there is a vast statistical literature of multiple regression models estimating the air pollution-mortality relationship, this problem has been largely ignored. It is well known that pollution measures contain error, but the consequences of this error for regression estimates are not known. We use Lave and Seskin's air pollution model to demonstrate the consequences of random measurement error. We assume that between 0% and 50% of the variance of the pollution measures is due to error. We find large differences in the estimated effects on mortality of the pollution variables, as well as of the other explanatory variables, once this measurement error is taken into account. These results cast doubt on the usual regression estimates of the mortality effects of air pollution. More generally, our results demonstrate the consequences of random measurement error in the explanatory variable of a multiple regression analysis and the misleading conclusions that may result in policy research if this error is ignored.
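A hedged illustration of the mechanism (not the Lave-Seskin model; variable names and parameter values are assumptions): classical measurement error in one regressor attenuates its estimated coefficient and also distorts the coefficient on a correlated, correctly measured covariate.

```python
# Sketch: attenuation and spillover bias from measurement error in a regressor.
import numpy as np

rng = np.random.default_rng(8)
n = 200_000
pollution = rng.normal(size=n)
income = 0.6 * pollution + rng.normal(size=n) * 0.8       # correlated covariate
mortality = 1.0 * pollution - 0.5 * income + rng.normal(size=n)

def fit(p_obs):
    X = np.column_stack([np.ones(n), p_obs, income])
    return np.linalg.lstsq(X, mortality, rcond=None)[0][1:]

for share in (0.0, 0.25, 0.50):                 # share of observed variance due to error
    noise_sd = np.sqrt(share / (1 - share))     # gives error-variance share = share
    print(share, fit(pollution + rng.normal(scale=noise_sd, size=n)))
```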

19.
In industry sectors where market prices for goods and services are unavailable, it is common to use estimated output and input distance functions to estimate rates of productivity change. It is also possible, but less common, to use estimated distance functions to estimate the normalised support (or efficient) prices of individual inputs and outputs. A problem that arises in the econometric estimation of these functions is that more than one variable in the estimating equation may be endogenous. In such cases, maximum likelihood estimation can lead to biased and inconsistent parameter estimates. To solve the problem, we use linear programming to construct a quantity index. The distance function is then written in the form of a conventional stochastic frontier model where the explanatory variables are unambiguously exogenous. We use this approach to estimate productivity indexes, measures of environmental change, levels of efficiency, and support prices for a sample of Australian public hospitals.

20.
In many manufacturing and service industries, the quality department of the organization works continuously to ensure that the mean or location of the process is close to the target value. In order to understand the process, it is necessary to provide numerical statements about the processes being investigated. This is why the researcher needs to check the validity of hypotheses concerning the physical phenomena of interest. It is usually assumed that the collected data behave well; however, the data may sometimes contain outliers, and the presence of one or more outliers might seriously distort the statistical inference. Since the sample mean is very sensitive to outliers, this research will use the smooth adaptive (SA) estimator to estimate the population mean. The SA estimator will be used to construct testing procedures, called smooth adaptive (SA) tests, for testing various null hypotheses. A Monte Carlo study is used to simulate the probability of a Type I error and the power of the SA test. This is accomplished by constructing confidence intervals for the process mean using the SA estimator and bootstrap methods. The SA test will be compared with other tests such as the normal test, the t test, and a nonparametric method, namely the Wilcoxon signed-rank test. Cases with and without outliers will also be considered. For right-skewed distributions, the SA test is the best choice. When the population is a right-skewed distribution with one outlier, the SA test controls the probability of a Type I error better than the other tests and is recommended.
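For concreteness, a generic bootstrap percentile interval for a location estimate (the SA estimator itself is not specified in this abstract and is not implemented here; the sample mean stands in purely to show the resampling machinery referred to, and the data are made up, with one outlier):

```python
# Sketch: bootstrap percentile interval for a location estimate.
import numpy as np

rng = np.random.default_rng(9)
data = np.concatenate([rng.normal(10.0, 1.0, size=49), [25.0]])  # one outlier

boot = np.array([rng.choice(data, size=data.size, replace=True).mean()
                 for _ in range(5_000)])
lo, hi = np.quantile(boot, [0.025, 0.975])
print(round(data.mean(), 2), (round(lo, 2), round(hi, 2)))
```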
