Similar Literature
20 similar documents found.
1.
Journal of Econometrics, 1987, 34(3): 275-291
The theory and application of ordinal qualitative dependent variable models have been given considerable attention in the social science and statistical literatures. Linear models with ordinal qualitative regressors have, however, been neglected. In this paper a simple specification for such models is developed and a consistent, asymptotically normal estimator is offered. The estimator is compared to the conventional dummy variable approach using simulated data. The simulation results indicate that substantial gains in bias and efficiency can be achieved relative to estimation under a conventional dummy variable scheme, and these gains appear to increase with the number of ordinal categories. The asymptotic properties of the estimator are detailed in the appendices.
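The abstract does not reproduce the estimator itself, but the comparison it describes is easy to set up. Below is a minimal, hypothetical Python sketch contrasting the conventional dummy-variable treatment of an ordinal regressor with naive integer scoring on simulated data; the data-generating process and all names are illustrative assumptions, not the paper's design.

```python
# Illustrative sketch only: conventional dummy coding vs. naive integer
# scoring of an ordinal regressor (not the estimator proposed in the paper).
import numpy as np

rng = np.random.default_rng(0)
n, k = 500, 4                          # n observations, k ordered categories
z = rng.integers(0, k, size=n)         # ordinal regressor coded 0..k-1
# True category effects are monotone but unequally spaced.
effects = np.array([0.0, 0.8, 1.1, 2.5])
y = 1.0 + effects[z] + rng.normal(scale=1.0, size=n)

# Conventional dummy coding: intercept plus one column per non-baseline level.
D = np.column_stack([np.ones(n)] + [(z == j).astype(float) for j in range(1, k)])
beta_dummy, *_ = np.linalg.lstsq(D, y, rcond=None)

# Naive scoring: treat the category codes as a single numeric regressor.
X = np.column_stack([np.ones(n), z.astype(float)])
beta_score, *_ = np.linalg.lstsq(X, y, rcond=None)

print("dummy-coded estimates:  ", beta_dummy.round(2))
print("integer-score estimates:", beta_score.round(2))
```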

2.
In the behavioral sciences, response variables are often non-continuous, ordinal variables. Conventional structural equation models (SEMs) have been generalized to accommodate ordinal responses. In this study, three different estimation methods were applied to real data with ordinal variables. Empirical results obtained from the different estimation methods on a large educational dataset were examined and compared to recent simulation results. The results show that even when a very large sample is available, parameter estimates and fit measures for ordinal data suffer under unsuitable estimation methods; we therefore conclude that an asymptotically distribution-free estimation method specialized for ordinal variables is the more appropriate way to model ordinal responses.

3.
This article surveys strategies for modeling ordered categorical (ordinal) response variables when the data have some type of clustering, extending a similar survey for binary data by Pendergast, Gange, Newton, Lindstrom, Palta & Fisher (1996). An important special case is repeated measurement at various occasions for each subject, as in longitudinal studies. A much greater variety of models and fitting methods is available than when a similar survey of repeated ordinal response data was prepared a decade ago (Agresti, 1989). The primary emphasis of the review is on two classes of models: marginal models, for which effects are averaged over all clusters at particular levels of predictors, and cluster-specific models, for which effects apply at the cluster level. We present the two types of models in the ordinal context, review the literature for each, and discuss connections between them. We then summarize some alternative modeling approaches and ways of estimating parameters, including a Bayesian approach. We also discuss applications and areas likely to be popular in future research, such as ways of handling missing data and ways of modeling agreement and evaluating the accuracy of diagnostic tests. Finally, we review the current availability of software for the methods discussed in this article.
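As a concrete illustration of the marginal-model class, the sketch below fits a population-averaged cumulative model to simulated clustered ordinal data. It assumes statsmodels' OrdinalGEE and GlobalOddsRatio APIs and an invented data-generating process; treat it as a sketch, not a reproduction of any model in the survey.

```python
# Hedged sketch: a marginal (population-averaged) model for clustered
# ordinal responses, assuming statsmodels' OrdinalGEE API.
import numpy as np
from statsmodels.genmod.generalized_estimating_equations import OrdinalGEE
from statsmodels.genmod.cov_struct import GlobalOddsRatio

rng = np.random.default_rng(7)
n_clusters, m = 150, 4                        # 150 subjects, 4 occasions each
groups = np.repeat(np.arange(n_clusters), m)
x = rng.normal(size=n_clusters * m)
u = np.repeat(rng.normal(scale=0.7, size=n_clusters), m)   # cluster effect
latent = 1.0 * x + u + rng.logistic(size=n_clusters * m)
y = np.digitize(latent, [-1.0, 0.5, 2.0])     # four ordered response levels

# Global odds ratio dependence structure for within-cluster association.
model = OrdinalGEE(y, x[:, None], groups, cov_struct=GlobalOddsRatio("ordinal"))
result = model.fit()
print(result.params.round(2))                 # thresholds plus the x effect
```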

4.
We consider the problems of estimation and testing in models with serially correlated discrete latent variables. A particular case of this is the time series regression model in which a discrete explanatory variable is measured with error. Test statistics are derived for detecting serial correlation in such a model. We then show that the likelihood function can be evaluated by a recurrence relation, and thus maximum likelihood estimation is computationally feasible. An illustrative example of these methods is given, followed by a brief discussion of their applicability to a Markov model of switching regressions.

5.
For cross-classification tables having an ordinal response variable, logit and probit models are formulated for the probability that a pair of subjects is concordant. For multidimensional tables, generalized models are given for the probability that the response at one setting of explanatory variables exceeds the response at another setting. Related measures of association are discussed for two-way tables.
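The building block of these models, the concordance probability for a pair of subjects, can be computed directly; a small hypothetical sketch (with made-up data) is shown below, together with the related gamma measure of association.

```python
# Illustrative sketch: empirical concordance probability and Goodman-Kruskal
# gamma for two ordinal variables (toy data, not from the paper).
import numpy as np

def concordance_counts(x, y):
    """Count concordant and discordant pairs of subjects on (x, y)."""
    conc = disc = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            s = np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])
            conc += int(s > 0)
            disc += int(s < 0)
    return conc, disc

x = np.array([1, 1, 2, 2, 3, 3])
y = np.array([1, 2, 1, 3, 2, 3])
c, d = concordance_counts(x, y)
print("P(concordant | untied):", c / (c + d))
print("gamma:", (c - d) / (c + d))
```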

6.
Ratio-type financial indicators are the most popular explanatory variables in bankruptcy prediction models. These measures often exhibit heavily skewed distributions because of the presence of outliers. In the absence of a clear definition of outliers, ad hoc approaches to identifying and handling extreme values can be found in the literature. There seems to be consensus on the necessity of handling outliers, yet it is not clear how extreme values should be defined, or how the different treatments affect the predictive power of models. There are two common ways to reduce the bias originating from outliers: omission and winsorization. Since the first approach has been examined previously in the literature, we turn our attention to the latter. We applied the most popular classification methodologies in this field: discriminant analysis, logistic regression, decision trees (CHAID and CART) and neural networks (multilayer perceptron). We assessed the predictive power of the models using tenfold stratified cross-validation and the area under the ROC curve. We analyzed the effect of winsorization at 1, 3 and 5% and at 2 and 3 standard deviations; furthermore, we discretized the range of each variable with the CHAID method and used the resulting ordinal measures instead of the original financial ratios. We found that this latter preprocessing approach is the most effective on our dataset. To check the robustness of our results, we carried out the same empirical research on the publicly available Polish bankruptcy dataset from the UCI Machine Learning Repository. We obtained very similar results on both datasets, which indicates that CHAID-based categorization of financial ratios is an effective way of handling outliers with respect to the predictive performance of bankruptcy prediction models.
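The two winsorization rules compared in the study are simple to state in code. The following sketch is our own illustration, not the authors' implementation: it clips a skewed "financial ratio" at percentile cutoffs and at k standard deviations from the mean.

```python
# Hedged sketch of the two winsorization rules: percentile clipping
# (1, 3 or 5%) and clipping at 2 or 3 standard deviations from the mean.
import numpy as np

def winsorize_pct(x, p):
    """Clip x at its p-th and (100 - p)-th percentiles."""
    lo, hi = np.percentile(x, [p, 100 - p])
    return np.clip(x, lo, hi)

def winsorize_sd(x, k):
    """Clip x at mean +/- k standard deviations."""
    mu, sd = x.mean(), x.std()
    return np.clip(x, mu - k * sd, mu + k * sd)

rng = np.random.default_rng(1)
# A skewed "financial ratio" with a handful of extreme outliers.
ratio = np.concatenate([rng.normal(0.1, 0.05, 995), [5.0, -4.0, 8.0, 6.0, -7.0]])
print("raw max:", ratio.max())
print("1% winsorized max:", winsorize_pct(ratio, 1).max().round(3))
print("3-sd winsorized max:", winsorize_sd(ratio, 3).max().round(3))
```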

7.
The aim of this paper is to convey to a wider audience of applied statisticians that nonparametric (matching) estimation methods can be a very convenient tool for overcoming problems with endogenous control variables. In empirical research one is often interested in the causal effect of a variable X on some outcome variable Y. With observational data, i.e. in the absence of random assignment, the correlation between X and Y generally does not reflect the treatment effect but is confounded by differences in observed and unobserved characteristics. Econometricians typically use two approaches to overcome this confounding: controlling for observed characteristics, often referred to as selection on observables, or instrumental variables regression, usually with additional control variables. Instrumental variables estimation is probably the most important estimator in applied work. In many applications, however, these control variables are themselves correlated with the error term, making ordinary least squares and two-stage least squares inconsistent. The usual remedy is to search for additional instrumental variables for these endogenous control variables, which is often difficult. We argue that nonparametric methods help to reduce the number of instruments needed: only one instrument is required, whereas conventional approaches may need two, three or even more instruments for consistency. Nonparametric matching estimators permit consistent estimation without the need for (additional) instrumental variables and accommodate arbitrary functional forms and treatment effect heterogeneity.
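To make the idea concrete, here is a deliberately minimal one-nearest-neighbour matching sketch for the average treatment effect on the treated, under selection on a single observed covariate; it is a toy stand-in for the estimators the paper discusses, with an invented data-generating process.

```python
# Toy sketch: 1-nearest-neighbour matching on one observed covariate
# (a stand-in for the nonparametric matching estimators discussed here).
import numpy as np

def att_nn_matching(y, d, x):
    """Average treatment effect on the treated via 1-NN matching on x."""
    treated = np.where(d == 1)[0]
    controls = np.where(d == 0)[0]
    effects = []
    for i in treated:
        j = controls[np.argmin(np.abs(x[controls] - x[i]))]
        effects.append(y[i] - y[j])
    return float(np.mean(effects))

rng = np.random.default_rng(2)
x = rng.normal(size=1000)
d = (x + rng.normal(size=1000) > 0).astype(int)   # treatment selects on x
y = 2.0 * d + x + rng.normal(size=1000)           # true effect is 2
print("ATT estimate:", round(att_nn_matching(y, d, x), 2))
```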

8.
We discuss how to test the specification of an ordered discrete choice model against a general alternative. Two main approaches can be followed: tests based on moment conditions, and tests based on comparisons between parametric and nonparametric estimates. Following these approaches, various statistics are proposed and their asymptotic properties are discussed. The performance of the statistics is compared by means of simulations. An easy-to-compute variant of the standard moment-based statistic yields the best results in models with a single explanatory variable. In models with several explanatory variables the results are less conclusive, since the relative performance of the statistics depends on both the fit of the model and the type of misspecification considered.

9.
Regression analyses of cross-country economic growth data are complicated by two main forms of model uncertainty: the uncertainty in selecting explanatory variables and the uncertainty in specifying the functional form of the regression function. Most discussions in the literature address these problems independently, yet a joint treatment is essential. We present a new framework that makes such a joint treatment possible, using flexible nonlinear models specified by Gaussian process priors and addressing the variable selection problem by means of Bayesian model averaging. Using this framework, we extend the linear model to allow for parameter heterogeneity of the type suggested by new growth theory, while taking into account the uncertainty in selecting explanatory variables. Controlling for variable selection uncertainty, we confirm the evidence in favor of parameter heterogeneity presented in several earlier studies. However, controlling for functional form uncertainty, we find that the effects of many of the explanatory variables identified in the literature are not robust across countries and variable selections.
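The nonlinear component of such a framework can be illustrated with an off-the-shelf Gaussian process regression; the sketch below uses scikit-learn's GaussianProcessRegressor on made-up data and omits the Bayesian model averaging over variable selections that the paper couples it with.

```python
# Hedged sketch: flexible nonlinear regression under a Gaussian process prior
# (scikit-learn API; the paper's full framework adds Bayesian model averaging).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(8)
X = rng.uniform(-2, 2, size=(80, 1))
y = np.sin(2 * X[:, 0]) + 0.5 * X[:, 0] + rng.normal(scale=0.2, size=80)

kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=0.04)
gpr = GaussianProcessRegressor(kernel=kernel).fit(X, y)

x_grid = np.linspace(-2, 2, 5).reshape(-1, 1)
mean, sd = gpr.predict(x_grid, return_std=True)
for xg, mu, s in zip(x_grid.ravel(), mean, sd):
    print(f"x = {xg:+.1f}: posterior mean {mu:+.2f} (sd {s:.2f})")
```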

10.
Penalized Regression with Ordinal Predictors
Ordered categorical predictors are common in regression modelling. In contrast to ordinal response variables, ordinal predictors have been largely neglected in the literature. In this paper, existing methods are reviewed and the use of penalized regression techniques is proposed. Based on dummy coding, two types of penalization are explicitly developed: the first imposes a difference penalty, the second is a ridge-type refitting procedure. A Bayesian motivation is also provided. The concept is generalized to non-normal outcomes within the framework of generalized linear models by applying penalized likelihood estimation. Simulation studies and real-world data serve to illustrate and to compare the approaches with methods often seen in practice, namely simple linear regression on the group labels and pure dummy coding. The proposed difference penalty in particular turns out to be highly competitive.
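For a single ordinal predictor, the difference penalty has a closed-form penalized least squares solution. The sketch below is our own minimal illustration of that idea (it penalizes squared differences of adjacent dummy coefficients), not the authors' full generalized linear model implementation.

```python
# Hedged sketch of the difference penalty for one ordinal predictor:
# minimize ||y - D b||^2 + lam * ||P b||^2, with P the first-difference
# matrix, so adjacent category coefficients are shrunk toward each other.
import numpy as np

def fit_difference_penalty(D, y, lam):
    k = D.shape[1]
    P = np.diff(np.eye(k), axis=0)       # (k-1) x k first-difference matrix
    A = D.T @ D + lam * (P.T @ P)
    return np.linalg.solve(A, D.T @ y)

rng = np.random.default_rng(3)
n, k = 200, 6
z = rng.integers(0, k, size=n)                               # category codes
D = np.array([(z == j).astype(float) for j in range(k)]).T   # full dummy matrix
true_effects = np.array([0.0, 0.2, 0.3, 0.9, 1.0, 1.1])      # smooth, monotone
y = true_effects[z] + rng.normal(scale=0.5, size=n)

print("lam = 0:", fit_difference_penalty(D, y, 0.0).round(2))
print("lam = 5:", fit_difference_penalty(D, y, 5.0).round(2))
```

With lam = 0 this reduces to per-category means (pure dummy coding); larger lam pulls adjacent coefficients together, which is what lets the estimate exploit the ordering.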

11.
The theory of estimation and inference in a very general class of latent variable models for time series is developed by showing that the distribution theory for the finite Fourier transform of the observable variables in latent variable models for time series is isomorphic to that for the observable variables themselves in classical latent variable models. This implies that analytic work on classical latent variable models can be adapted to latent variable models for time series, an implication which is illustrated here in the context of a general canonical form. To provide an empirical example, a latent variable model for permanent income is developed, its parameters are shown to be identified, and a variety of restrictions on these parameters implied by the permanent income hypothesis are tested.

12.
The various approaches to the construction of causal models are compared from a probabilistic point of view. Although all methods are equivalent in the mathematical manipulation of the equations of a model, three distinct approaches are discernible, depending on how numerical values of the coefficients are calculated. All rely to a greater or lesser extent on a deterministic base, as a result of considering the equations simultaneously. The problems of polytomous (nominal and ordinal) variables, of omitted variables, and of nonlinearity are discussed and solutions proposed, before going on to investigate the uses of interaction effects in such models. The interpretation of interactions and their relationship to paths and chains is discussed in detail. One step in the analysis of a model describing the relationships of student attitudes to home and to school environments is worked through in detail to illustrate the probabilistic concepts. These results are compared with those that might have been obtained had a causal model based on path analysis with least squares linear regression been applied.

13.
This study examined the performance of two alternative estimation approaches in structural equation modeling for ordinal data under different levels of model misspecification, score skewness, sample size, and model size. Both approaches involve analyzing a polychoric correlation matrix and adjusting standard error estimates and the model chi-square, but one estimates model parameters with maximum likelihood and the other with robust weighted least squares. Relative bias in parameter estimates and standard error estimates, Type I error rate, and empirical power of the model test, where appropriate, were evaluated through Monte Carlo simulations. These alternative approaches generally provided unbiased parameter estimates when the model was correctly specified. They also provided unbiased standard error estimates and adequate Type I error control in general, unless the sample size was small and the measured variables were moderately skewed. Differences between the methods in convergence problems and on the evaluation criteria, especially under small-sample and skewed-variable conditions, are discussed.

14.
A common strategy within the framework of regression models is the selection of variables with possible predictive value for incorporation into the regression model. Two recently proposed methods, Breiman's garotte (Breiman, 1995) and Tibshirani's lasso (Tibshirani, 1996), try to combine variable selection and shrinkage. We compare these with pure variable selection and shrinkage procedures. We consider the backward elimination procedure as a typical variable selection procedure and, as an example of a shrinkage procedure, the approach of van Houwelingen and le Cessie (1990). Additionally, an extension of van Houwelingen and le Cessie's approach proposed by Sauerbrei (1999) is considered. The ordinary least squares method is used as a reference. With the help of a simulation study we compare these approaches with respect to the distribution of the complexity of the selected model, the distribution of the shrinkage factors, selection bias, the bias and variance of the effect estimates, and the average prediction error.

15.
The presence of excess zeros in ordinal data is pervasive in areas such as the medical and social sciences. Unfortunately, the analysis of such data has so far hardly been looked into, perhaps because the underlying model that fits such data is not a generalized linear model; some methodological development and intensive computation are clearly required. The present investigation is concerned with the selection of variables in such models. When the number of predictors is large and some of them are not useful, the maximum likelihood approach is not the automatic choice: apart from the messy calculations involved, it fails to provide efficient estimates of the underlying parameters. The proposed penalized approach includes the ℓ1 penalty (LASSO) and a mixture of ℓ1 and ℓ2 penalties (elastic net). We propose a coordinate descent algorithm to fit a wide class of ordinal regression models and to select useful variables appearing in both the ordinal regression and the logistic-regression-based mixing component. A rigorous discussion of predictor selection is provided through a simulation study. The proposed method is illustrated by analyzing the severity of driver injury in road accidents from Michigan's Upper Peninsula.
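The coordinate descent idea at the heart of the proposal is easiest to see in the plain linear LASSO. The sketch below covers only that simplified case (our illustration; the paper's algorithm handles the full zero-inflated ordinal likelihood).

```python
# Hedged sketch: coordinate descent for a plain linear-regression LASSO,
# objective (1/2)||y - X b||^2 + lam * n * ||b||_1 (glmnet-style scaling).
import numpy as np

def soft_threshold(u, t):
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    n, p = X.shape
    beta = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: remove every effect except coordinate j.
            r = y - X @ beta + X[:, j] * beta[j]
            beta[j] = soft_threshold(X[:, j] @ r, lam * n) / col_ss[j]
    return beta

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(size=100)
print(lasso_cd(X, y, lam=0.1).round(2))   # mostly zeros except columns 0 and 3
```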

16.
Socio, 1986, 20(3): 155-160
Many of the linear goal programming algorithms available today are based on a simplex-type solution method that begins with an initial simplex tableau whose solution-set (i.e., basic) variables consist of all negative deviational variables or all positive deviational variables. Prior research has shown that computational solution effort can be reduced if the appropriate all-negative or all-positive deviational variable algorithm is selected. This paper presents a practical statistical screening procedure that can be used in conjunction with previously published selection criteria to reduce computational effort by selecting the appropriate algorithm for all types of applied goal programming models. Results of the study demonstrate the accuracy of the statistical screening procedure when it is applied to a large sample of goal programming problems.
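For readers unfamiliar with the deviational-variable formulation these algorithms operate on, the sketch below sets up a two-goal example with scipy.optimize.linprog; the goals and coefficients are invented, and the sketch does not implement the screening procedure itself.

```python
# Hedged sketch of a linear goal program in deviational-variable form:
# each goal satisfies g_i(x) + d_i_minus - d_i_plus = target_i, and the
# objective minimizes the unwanted deviations.
from scipy.optimize import linprog

# Decision vector: [x1, x2, d1_minus, d1_plus, d2_minus, d2_plus]
c = [0, 0, 1, 0, 0, 1]            # penalize under-achieving goal 1, over-achieving goal 2
A_eq = [[1, 1, 1, -1, 0, 0],      # goal 1: x1 + x2 (target 10)
        [2, 1, 0, 0, 1, -1]]      # goal 2: 2*x1 + x2 (target 12)
b_eq = [10, 12]

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * 6)
print("x:", res.x[:2].round(2), " deviations:", res.x[2:].round(2),
      " objective:", round(res.fun, 2))
```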

17.
Our discussion is initiated as a response to the claim that sociologists should become “more historical” in their orientations. The issues are old, but every generation frames its own response. Our response is developed by appeal to intuitive convictions arising out of experience with mathematical models of social phenomena. We make a distinction between historical and sociological processes at a metaphysical level; that is, these two types of processes exemplify different categories of existence. Next we make this point of view concrete by using the idea of a model of social mobility as an example. The discussion then centers on problems related to the search for general laws. We frame a “fallacy of misplaced generality” and against this background discuss how the idea of scope conditions, used in conjunction with formal models, leads to a method for coping with the difficulties inherent in the effort to frame general sociological theories.

18.
Factor models have been applied extensively for forecasting when high-dimensional datasets are available, in which case the number of variables can be very large; typical dynamic factor models in central banks handle over 100 variables. However, a growing body of literature indicates that more variables do not necessarily lead to estimated factors with lower uncertainty or better forecasting results. This paper investigates the usefulness of partial least squares techniques that take the variable to be forecast into account when reducing the dimension of the problem from a large number of variables to a smaller number of factors. We propose several dynamic sparse partial least squares approaches that improve forecast efficiency by simultaneously taking the target variable into account while forming an informative subset of predictors, instead of using all the available variables to extract the factors. We use the well-known Stock and Watson database to check the forecasting performance of our approach. The proposed dynamic sparse models perform well in improving efficiency compared with the factor methods widely used in macroeconomic forecasting.
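The core dimension-reduction step can be sketched with scikit-learn's PLSRegression, which, unlike principal components, forms factors with respect to the forecast target; the dynamic and sparse refinements of the paper are not reproduced here, and the data are simulated.

```python
# Hedged sketch: one-step-ahead forecasting with partial least squares factors
# (plain PLS via scikit-learn; the paper's dynamic *sparse* PLS is richer).
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(5)
T, N = 200, 100                     # long panel of many predictor series
X = rng.normal(size=(T, N))
y = X[:, :5].sum(axis=1) + rng.normal(size=T)   # only a few series matter

pls = PLSRegression(n_components=3)
pls.fit(X[:-1], y[1:])              # predictors at t, target at t + 1
y_hat = pls.predict(X[-1:])         # forecast for the next period
print("one-step-ahead forecast:", float(np.squeeze(y_hat)))
```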

19.
Much Ado About Nothing: The Mixed Models Controversy Revisited
We consider a well-known controversy that stems from the use of two mixed models for the analysis of balanced experimental data with a fixed and a random factor. It essentially originates in the different statistics developed from such models for testing that the variance parameter associated with the random factor is null. The corresponding hypotheses are interpreted as stating that the random factor has null main effects in the presence of interaction. The controversy is further complicated by differing opinions regarding the appropriateness of such a hypothesis. Assuming that this is a sensible option, we show that the standard test statistics obtained under the two models are really directed at different hypotheses and conclude that the problem lies in the definition of the main effects and interactions. We use expected values, as in the fixed effects case, to resolve the controversy, showing that under the most commonly used model the test usually associated with the nonexistence of the random factor main effects addresses a different hypothesis. We discuss the choice of models and some further problems that occur in the presence of unbalanced data.

20.
The paper reviews some old and new approaches to the analysis of linear models with errors in variables. The emphasis is on the identification problems that usually arise in errors-in-variables models and on the various types of additional information that econometricians have invoked to be able to estimate parameters consistently. The approaches discussed include instrumental variables, grouping, simultaneous equations, multiple equations, and bounds on measurement error variances.
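The attenuation bias from measurement error, and the classical instrumental variables remedy, can be seen in a few lines of simulation (an illustration of the general idea, not of any specific approach in the survey).

```python
# Illustrative sketch: OLS is attenuated by measurement error in the
# regressor; a valid instrument restores consistency.
import numpy as np

rng = np.random.default_rng(6)
n = 10_000
x_true = rng.normal(size=n)
x_obs = x_true + rng.normal(size=n)        # regressor observed with error
z = x_true + rng.normal(size=n)            # instrument: correlated with x_true,
                                           # independent of the measurement error
y = 2.0 * x_true + rng.normal(size=n)

beta_ols = (x_obs @ y) / (x_obs @ x_obs)   # biased toward zero (about 1.0 here)
beta_iv = (z @ y) / (z @ x_obs)            # simple IV estimator (about 2.0)
print(f"OLS: {beta_ols:.2f}   IV: {beta_iv:.2f}   true: 2.00")
```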

