Similar literature
1.
Rosen's (1974) theory of hedonic prices is implemented econometrically using recently developed nonparametric techniques to examine the influence of qualitative factors on the price of a house. Our ability to smooth categorical variables leads to greater generalization in the valuation process and provides a canvas for interactions between categorical and continuous variables that is difficult to exploit in parametric and semiparametric models. This is illustrated with a replication of a previously used partially linear model specification. Copyright © 2007 John Wiley & Sons, Ltd.

2.
The paper attempts to make a clear distinction between three broad families of statistical indices: association, agreement, and what one may call equity. The need for this distinction arises in social research, for example, where reliability (accuracy, reproducibility, and stability) is assessed by measures of association rather than agreement. In this application, the assumptions built into an association measure conflict with the reality that gives rise to reliability data. A second motivation for this distinction is that association measures tend to express chance as the product of two potentially very different frequency distributions, agreement as the product of two identical distributions, and equity ignores such distributions altogether. A third motivation for this distinction is that the probability distribution of such measures does not depend on whether they are linear or non-linear, symmetrical or asymmetrical, or whether they express predictability or the extremality of a frequency distribution, but on their family membership. Notions of association, agreement, and equity have inherently nothing to do with the (nominal, ordinal, interval, and ratio) ordering in data. The 2-by-2 case is therefore chosen as the basis of the proposed distinction. All statistical indices, whether they are designed to characterise multivariate data or to identify complex orderings, ought to be applicable to this most reduced case of two variables, making one distinction in each. To test a coefficient's membership in one of the three families, nothing more complex is needed.

3.
Summary: This paper reviews research situations in medicine, epidemiology and psychiatry, in psychological measurement and testing, and in sample surveys in which the observer (rater or interviewer) can be an important source of measurement error. Moreover, most of the statistical literature on observer variability is surveyed, with attention given to a notational unification of the various models proposed. In the continuous data case, the usual analysis of variance (ANOVA) components of variance models are presented with an emphasis on the intraclass correlation coefficient as a measure of reliability. Other modified ANOVA models, response error models in sample surveys, and related multivariate extensions are also discussed. For the categorical data case, special attention is given to measures of agreement and tests of hypotheses when the data consist of dichotomous responses. In addition, similarities between the dichotomous and continuous cases are illustrated in terms of intraclass correlation coefficients. Finally, measures of agreement, such as kappa and weighted kappa, are discussed in the context of nominal and ordinal data. A proposed unifying framework for the categorical data case is given in the form of concluding remarks.
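As a small illustration of the kappa agreement measure mentioned in this abstract, the following sketch computes the unweighted (Cohen's) kappa for two raters from a square table of counts. The function name `cohen_kappa` and the 2-by-2 table are hypothetical and chosen only for illustration; they do not come from the paper.

```python
from typing import Sequence


def cohen_kappa(table: Sequence[Sequence[float]]) -> float:
    """Unweighted Cohen's kappa for a square rater-by-rater table of counts.

    table[i][j] is the number of subjects that rater A put in category i
    and rater B put in category j.
    """
    k = len(table)
    n = sum(sum(row) for row in table)

    # Observed agreement: share of subjects on the main diagonal.
    observed = sum(table[i][i] for i in range(k)) / n

    # Chance agreement: product of the two raters' marginal proportions.
    row_marginals = [sum(row) / n for row in table]
    col_marginals = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    expected = sum(r * c for r, c in zip(row_marginals, col_marginals))

    return (observed - expected) / (1.0 - expected)


# Hypothetical counts for two raters making a dichotomous judgement.
example_table = [[20, 5], [10, 15]]
kappa = cohen_kappa(example_table)
print(round(kappa, 3))  # 0.4
```

Here the observed agreement is 0.7 and the chance agreement is 0.5, so kappa is (0.7 - 0.5) / (1 - 0.5) = 0.4.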

4.
Non-discretionary or environmental variables are regarded as important in the evaluation of efficiency in Data Envelopment Analysis (DEA), but there is no consensus on the correct treatment of these variables. This paper compares the performance of the standard BCC model as a base case with two single-stage models: the Banker and Morey (1986a) model, which incorporates continuous environmental variables, and the Banker and Morey (1986b) model, which incorporates categorical environmental variables. Simulation analyses are conducted using a shifted Cobb-Douglas function, with one output, one non-discretionary input, and two discretionary inputs. The production function is constructed to separate environmental impact from managerial inefficiency, while providing measures of both for comparative purposes. Tests are performed to evaluate the accuracy of each model. The distribution of the inputs, the sample size and the number of categories for the categorical model are varied in the simulations to determine their impact on the performance of each model. The results show that the Banker and Morey models should be used in preference to the standard BCC model when the environmental impact is moderate to high. Both the continuous and categorical models perform equally well, but the latter may be better suited to some applications with larger sample sizes. Even when the environmental impact is slight, the use of a simple two-way split of the sample data can produce significantly better results under the categorical model in comparison to the BCC model.

5.
We propose the notion of multivariate predictability as a measure of goodness-of-fit in data reduction techniques which are useful for visualizing and screening data. For quantitative variables this leads to the usual sums-of-squares and variance accounted for criteria. For categorical variables we show how to predict the category-levels of all variables associated with every point (case). The proportion of predictions which agree with the true categories gives the measure of fit. The ideas are very general; as an illustration we use nonlinear principal components analysis (NLPCA) in association with ordered categorical variables. A detailed example using data from the International Social Survey Program (ISSP) will be given in Blasius and Gower (Quality and Quantity, 39, to appear). It will be shown that the predictability criterion suggests that the fits are rather better than is indicated by “percentage of variance accounted for”. This article was written while John Gower was a visiting professor at the ZA-Eurolab, at the Zentralarchiv für Empirische Sozialforschung, University of Cologne, Germany. The ZA is a Large Scale Facility funded by the Training and Mobility of Researchers program of the European Union.

6.
The paper discusses a small image study in which seven assessors judge nine brands of coffee in terms of six quantitative variables and five categorical variables. Generalised Procrustes Analysis and Generalised Biplots are combined to display simultaneously information on the brands and on both quantitative and categorical variables. An outline is given of the methodology.

7.
This paper develops a probabilistic clustering model for mixed data. The model allows analysis of variables of mixed type: the variables may be nominal, ordinal and/or quantitative. The model contains the well-known models of latent class analysis as submodels. As in latent class analysis, local independence of the variables is assumed. The parameters of the model are estimated by the EM algorithm. Test statistics and goodness-of-fit measures are proposed for model selection. Two artificial data sets show the usefulness of these tests. An empirical example completes the presentation.

8.
In ridit analysis, scale values called ridits are assigned to the categories of an ordinal response variable. Ridit analysis is described as a technique to compare groups of observations and to measure the association between ordinal and nominal variables by mean ridits. By their definition, mean ridits are closely related to distribution-free methods. Mean ridits are also used as a tool to test the main and/or interaction effects of factors on an ordinal response variable.
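The scale values described in this abstract can be sketched in a few lines. The sketch assumes the standard definition of a ridit: for each ordered category, the proportion of a reference group falling below that category plus half the proportion falling in it. The function names and the counts are hypothetical, chosen only to show the mechanics.

```python
from typing import Sequence


def ridits(reference_counts: Sequence[float]) -> list[float]:
    """Ridit of each ordered category, based on reference-group counts."""
    total = sum(reference_counts)
    scores = []
    below = 0.0  # number of reference observations in lower categories
    for count in reference_counts:
        scores.append((below + 0.5 * count) / total)
        below += count
    return scores


def mean_ridit(group_counts: Sequence[float], scores: Sequence[float]) -> float:
    """Mean ridit of a group: count-weighted average of the category ridits."""
    total = sum(group_counts)
    return sum(c * s for c, s in zip(group_counts, scores)) / total


# Hypothetical counts over five ordered response categories.
reference = [10, 20, 40, 20, 10]
comparison = [5, 10, 20, 30, 35]

scores = ridits(reference)
reference_mean = mean_ridit(reference, scores)    # 0.5 by construction
comparison_mean = mean_ridit(comparison, scores)  # about 0.695
print(scores, reference_mean, comparison_mean)
```

The reference group always has a mean ridit of 0.5. A comparison-group mean above 0.5, as in this made-up example, indicates that its observations tend to fall in higher categories than those of the reference group.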

9.
The past forty years have seen a great deal of research into the construction and properties of nonparametric estimates of smooth functions. This research has focused primarily on two sides of the smoothing problem: nonparametric regression and density estimation. Theoretical results for these two situations are similar, and multivariate density estimation was an early justification for the Nadaraya-Watson kernel regression estimator.
A third, less well-explored, strand of applications of smoothing is to the estimation of probabilities in categorical data. In this paper the position of categorical data smoothing as a bridge between nonparametric regression and density estimation is explored. Nonparametric regression provides a paradigm for the construction of effective categorical smoothing estimates, and use of an appropriate likelihood function yields cell probability estimates with many desirable properties. Such estimates can be used to construct regression estimates when one or more of the categorical variables are viewed as response variables. They also lead naturally to the construction of well-behaved density estimates using local or penalized likelihood estimation, which can then be used in a regression context. Several real data sets are used to illustrate these points.

10.
Graphical procedures have been central to the analysis of the association between two categorical variables that are cross-classified to form a contingency table. In particular, correspondence analysis has grown to be a popular method for obtaining such a graphical summary, and a great variety of approaches to performing it are available. In this paper, we introduce a simple algebraic generalisation of some of the more common approaches to obtaining a graphical summary of association, where these approaches are akin to the correspondence analysis of a two-way contingency table. Specific cases of the generalised procedure include the classical and non-symmetrical correspondence plots and the symmetrical and isometric biplots.

11.
The paper proposes a general framework for modeling multiple categorical latent variables (MCLV). The MCLV models extend latent class analysis or latent transition analysis to allow flexible measurement and structural components between endogenous categorical latent variables and exogenous covariates. Therefore, modeling frameworks from conventional structural equation modeling, for example CFA and MIMIC models, are feasible in the MCLV setting. Parameter estimation for the MCLV models is performed using a generalized expectation–maximization (E–M) algorithm. In addition, the adjusted Bayesian information criterion helps with model selection. A substantive study of reading development is analyzed to illustrate the feasibility of MCLV models.

12.
Gower and Blasius (Quality and Quantity, 39, 2005) proposed the notion of multivariate predictability as a measure of goodness-of-fit in data reduction techniques which is useful for visualizing and screening data. For quantitative variables this leads to the usual sums-of-squares and variance accounted for criteria. For categorical variables, and in particular for ordered categorical variables, they showed how to predict the levels of all variables associated with every point (case). The proportion of predictions which agree with the true category-levels gives the measure of fit. The ideas are very general; as an illustration they used nonlinear principal components analysis. An example of the method is described in this paper using data drawn from 23 countries participating in the International Social Survey Program (1995), paying special attention to two sets of variables concerned with Regional and National Identity. It turns out that the predictability criterion suggests that the fits are rather better than is indicated by “percentage of variance accounted for”.

13.
The various approaches to the construction of causal models are compared from a probabilistic point of view. Although all methods are equivalent in the mathematical manipulation of the equations of a model, three distinct approaches are discernible, depending on how numerical values of the coefficients are calculated. All rely to a greater or lesser extent on a deterministic base, as a result of considering the equations simultaneously. The problems of polytomous (nominal and ordinal) variables, of omitted variables, and of nonlinearity are discussed and solutions proposed, before going on to investigate the uses of interaction effects in such models. The interpretation of interactions and their relationship to paths and chains is discussed in detail. One step in the analysis of a model describing the relationships of student attitudes to home and to school environments is provided in detail to illustrate the probabilistic concepts. These results are compared with those which might have been obtained if a causal model based on path analysis with least squares linear regression analysis had been applied.

14.
In spite of the abundance of clustering techniques and algorithms, clustering mixed interval (continuous) and categorical (nominal and/or ordinal) scale data remains a challenging problem. In order to identify the most effective approaches for clustering mixed‐type data, we use both theoretical and empirical analyses to present a critical review of the strengths and weaknesses of the methods identified in the literature. Guidelines on approaches to use under different scenarios are provided, along with potential directions for future research.

15.
Filter questions with skip patterns have been widely used in survey research, and latent class models (LCM) are often used to analyze this type of categorical data. The LCM parameters are usually estimated by means of an EM (expectation maximization) algorithm. When a skip pattern is present, the resulting non-response cannot be treated as random missingness. We thus propose a modified algorithm to estimate the latent class parameters when non-response is present, and the approach is attractive for two reasons. First, the latent class model with the algorithm is very flexible in the sense that it can model the association of variables with the skip patterns under study. Secondly, the algorithm can be easily implemented using any computer language. An empirical example is used to demonstrate the usefulness of the algorithm. The algorithm may also be flexibly generalized to more complex surveys, for example, those with polytomous responses.

16.
It has been argued that volatility in nominal macroeconomic aggregates has had a negative effect on real output, in particular that such volatility contributed to slow output growth in the early 1980s. This paper reexamines the effects of volatility in nominal macroeconomic aggregates in the context of a modern simultaneous equation framework where the volatility of nominal macroeconomic variables is modeled as the conditional variance of two variables of interest: the federal funds rate and inflation. The empirical framework is the recently developed multivariate GARCH-in-mean vector autoregressive model. We confirm evidence that inflation volatility and tight monetary policy have directly affected output growth, but find that volatility in the federal funds rate has not.

17.
The collapsibility theorem describes both the circumstances in which the effects of hierarchical models change when additional variables are introduced and the circumstances in which the exclusion of certain variables and the analysis of specific marginal tables may lead to different conclusions. The partial association model is here considered as a specific example of three-dimensional log-linear analysis. Collapsibility is examined in an empirical study currently being performed in Catalonia with regard to program evaluation in penitentiary centers.

18.
I present a new approach to the study of causality in social theory using linguistic fuzzy logic as a framework. This approach differs from conventional analysis of causality on two fronts. First, all variables are considered to possess two degrees of freedom (or variation): a linguistic nuance value, which corresponds to what we conventionally refer to as interval or categorical value, and a linguistic truth value, which measures our confidence level in this nuance value. Second, combining this double fuzzification of variables with linguistic fuzzy logic, I propose new tools for studying fuzzy causality. The linguistic fuzzy logic approach is illustrated through a re-examination of Skocpol’s (1979, States and social revolutions: a comparative analysis of France, Russia, and China. Cambridge University Press, Cambridge) theory of social revolution.

19.
The proportional odds model is the most widely used model when the response has ordered categories. In the case of high‐dimensional predictor structure, the common maximum likelihood approach typically fails when all predictors are included. A boosting technique, pomBoost, is proposed to fit the model by implicitly selecting the influential predictors. The approach distinguishes between metric and categorical predictors. In the case of categorical predictors, where each predictor relates to a set of parameters, the objective is to select simultaneously all the associated parameters. In addition, the approach distinguishes between nominal and ordinal predictors. In the case of ordinal predictors, the proposed technique uses the ordering of the ordinal predictors by penalizing the difference between the parameters of adjacent categories. The technique also has a provision to consider some mandatory predictors (if any) that must be part of the final sparse model. The performance of the proposed boosting algorithm is evaluated in a simulation study and applications with respect to mean squared error and prediction error. Hit rates and false alarm rates are used to judge the performance of pomBoost for selection of the relevant predictors.
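For orientation, the model and the kind of penalty described in this abstract can be written out as follows. This is a sketch in one common parameterization, not the paper's own notation: the symbols $\gamma_r$, $\beta$, $\beta_j$ and $J$ are assumptions, and the squared-difference form of the penalty is only one plausible reading of "penalizing the difference between the parameters of adjacent categories".

```latex
% Proportional odds model for a response Y with ordered categories 1, ..., k:
% category-specific intercepts \gamma_r, one common coefficient vector \beta.
\[
  \log \frac{P(Y \le r \mid x)}{P(Y > r \mid x)} = \gamma_r + x^{\top}\beta,
  \qquad r = 1, \dots, k-1 .
\]

% Assumed penalty for one ordinal predictor with categories 1, ..., m and
% dummy-coded parameters \beta_1, ..., \beta_m (\beta_1 = 0 as reference):
\[
  J(\beta_1, \dots, \beta_m) = \sum_{j=2}^{m} \left( \beta_j - \beta_{j-1} \right)^{2} .
\]
```

Because $\beta$ does not depend on $r$, the effect of a predictor on the cumulative log-odds is the same at every threshold, which is where the name of the model comes from. A penalty of this type shrinks the effects of neighbouring categories of an ordinal predictor towards each other.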

20.
Evidence to support the Gibson paradox is often given in the form of a simple correlation between the nominal interest rate and the log of price level, or in the form of a simple linear regression between these two variables. Authors then show, using standard procedures of statistical inference, that the price level possesses a significant coefficient. We argue that this class of evidence is spurious since the nominal interest rate and the price level (both integrated variables) do not form a cointegrated system.
