1.

Do missing values exist? Incomplete data handling in cross-national longitudinal studies by means of continuous time modeling

Johan H. L. Oud Manuel C. Voelkle 《Quality and Quantity》2014,48(6):3271-3288

In cross-national longitudinal studies it is often impossible to administer the same measurement instruments at the same occasions to all sample units in all participating countries. This quickly results in large quantities of missing data, due to (a) missing measurement instruments in some countries, (b) missing assessment waves within or across countries, (c) missing data for individual sample units. Compared to cross-sectional studies, the problem of missing values is further aggravated by the fact that missing values are always associated with different time intervals between repeated observations. In the past, this has often been dealt with by the use of phantom variables, but this approach is limited to simple designs with few missing value patterns. In the present paper we propose a new way to think of, and deal with, missing values in longitudinal studies. Instead of conceiving of a longitudinal study as a study with \(T\) discrete time points of which some are missing, we propose to conceive of a longitudinal study as a way to measure an underlying process that develops continuously over time, but is only observed at some selected discrete time points. This transforms the problem of missing values into a problem of unequal time intervals. After a quick introduction to the basic idea of continuous time modeling, we demonstrate how this approach provides a straightforward solution to missing measurement instruments in some countries, missing assessment waves within or across countries, and missing data for individual sample units.
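
The idea that a missing wave becomes a longer time interval can be illustrated with a minimal sketch. It assumes a scalar first-order continuous-time process with a constant drift parameter; the function name, the drift value, and the intervals are hypothetical and are not taken from the paper. Under that assumption, the autoregressive coefficient implied over an interval of length dt is exp(drift * dt), so skipping one of two equally spaced waves simply doubles dt.

```python
import math


def discrete_autoregression(drift: float, dt: float) -> float:
    """Autoregressive coefficient implied over an interval of length dt
    by a scalar continuous-time process dx/dt = drift * x + noise."""
    return math.exp(drift * dt)


# Hypothetical drift value, chosen only for illustration.
drift = -0.5

# Unequal observation intervals: the second one spans a skipped wave.
intervals = [1.0, 2.0, 0.5]
coefficients = [discrete_autoregression(drift, dt) for dt in intervals]

for dt, coef in zip(intervals, coefficients):
    print(f"interval {dt}: autoregressive coefficient {coef:.4f}")
```

In this sketch the coefficient over the doubled interval equals the square of the one-interval coefficient, which is why no placeholder for the skipped wave is needed.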

2.

Measuring the impact of nonignorability in panel data with non‐monotone nonresponse

Hui Xie Dr Yi Qian 《Journal of Applied Econometrics》2012,27(1):129-159

The analysis of panel data with non‐monotone nonresponse often relies on the critical and untestable assumption of ignorable missingness. It is important to assess the consequences of departures from the ignorability assumption. Non‐monotone nonresponse, however, can often make such sensitivity analysis infeasible because the likelihood functions for alternative models involve high‐dimensional and difficult‐to‐evaluate integrals with respect to missing outcomes. We develop an extension of the local sensitivity method that overcomes computational difficulty and completely avoids fitting alternative models and evaluating these high‐dimensional integrals. The proposed method is applicable to a wide range of panel outcomes. We apply the method to a Smoking Trend dataset where we relax the standard ignorability assumption and evaluate how smoking‐trend estimates in different groups of US young adults are affected by alternative assumptions about the missing‐data mechanism. The main finding is that the standard estimate in the black male group is sensitive to nonignorable missingness but those in other groups are reasonably robust. Copyright © 2010 John Wiley & Sons, Ltd.

3.

A two-step approach to account for unobserved spatial heterogeneity

Anna Gloria Billé Roberto Benedetti Paolo Postiglione 《Spatial Economic Analysis》2017,12(4):452-471

Empirical analysis in economics often faces the difficulty that the data are correlated and heterogeneous in some unknown form. Spatial econometric models have been widely used to account for dependence structures, but the problem of directly dealing with unobserved spatial heterogeneity has been largely unexplored. The problem can be particularly serious if we have no prior information justified by economic theory. In this paper we propose a two-step procedure that endogenously identifies spatial regimes in the first step and accounts for spatial dependence in the second step. This procedure is applied to hedonic house price analysis.

4.

Standardization of Variables and Collinearity Diagnostic in Ridge Regression

José García Román Salmerón Catalina García María del Mar López Martín 《Revue internationale de statistique》2016,84(2):245-266

Ridge estimation (RE) is an alternative method to ordinary least squares when there exists a collinearity problem in a linear regression model. The variance inflation factor (VIF) is applied to test whether the problem exists in the original model and is also necessary after applying the ridge estimate to check whether the chosen value for parameter k has mitigated the collinearity problem. This paper shows that the application of the original data when working with the ridge estimate leads to non‐monotone VIF values. García et al. (2014) showed some problems with the traditional VIF used in RE. We propose an augmented VIF, VIF_R(j,k), associated with RE, which is obtained by standardizing the data before augmenting the model. The VIF_R(j,k) will coincide with the VIF associated with the ordinary least squares estimator when k = 0. The augmented VIF has the very desirable properties of being continuous, monotone in the ridge parameter and higher than one.

5.

Inference with dependent data using cluster covariance estimators

C. Alan Bester Timothy G. Conley Christian B. Hansen 《Journal of econometrics》2011,165(2):137-151

This paper presents an inference approach for dependent data in time series, spatial, and panel data applications. The method involves constructing t and Wald statistics using a cluster covariance matrix estimator (CCE). We use an approximation that takes the number of clusters/groups as fixed and the number of observations per group to be large. The resulting limiting distributions of the t and Wald statistics are standard t and F distributions where the number of groups plays the role of sample size. Using a small number of groups is analogous to ‘fixed-b’ asymptotics of Kiefer and Vogelsang (2002, 2005) (KV) for heteroskedasticity and autocorrelation consistent inference. We provide simulation evidence that demonstrates that the procedure substantially outperforms conventional inference procedures.
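
The role of the number of groups as the effective sample size can be sketched for the simplest case, the mean of observations split into equal-sized groups. This is an illustrative special case and not the authors' general estimator: the function name and the toy data are hypothetical, and the sketch uses the ordinary one-sample t statistic on the group means, referred to a t distribution with G − 1 degrees of freedom, where G is the number of groups.

```python
import math
from statistics import mean, stdev


def cluster_t_statistic(groups, null_value=0.0):
    """t statistic for the overall mean, treating each group mean as one
    observation. Assumes equal-sized groups, so the mean of the group
    means equals the overall sample mean.

    Returns the statistic and its degrees of freedom (G - 1)."""
    group_means = [mean(g) for g in groups]
    n_groups = len(group_means)
    estimate = mean(group_means)
    # Standard error built from between-group variation only.
    standard_error = stdev(group_means) / math.sqrt(n_groups)
    return (estimate - null_value) / standard_error, n_groups - 1


# Hypothetical toy data: four groups of three observations each.
toy_groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
t_stat, dof = cluster_t_statistic(toy_groups)
print(f"t = {t_stat:.4f} with {dof} degrees of freedom")
```

Within-group dependence is left unrestricted here because only the group means enter the standard error; the price is that the reference distribution has as few degrees of freedom as there are groups, minus one.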

6.

Generalized linear mixed models with informative dropouts and missing covariates

Kunling Wu Lang Wu 《Metrika》2007,66(1):1-18

Generalized linear mixed models (GLMM) are useful in many longitudinal data analyses. In the presence of informative dropouts and missing covariates, however, standard complete-data methods may not be applicable. In this article, we consider a likelihood method and an approximate method for GLMM with informative dropouts and missing covariates. The methods are implemented by Monte Carlo EM algorithms combined with a Gibbs sampler. The approximate method may lead to inconsistent estimators but is computationally more efficient than the likelihood method. The two methods are evaluated via a simulation study for longitudinal binary data, and appear to perform reasonably well. A dataset on mental distress is analyzed in detail.

7.

k‐Nearest neighbors local linear regression for functional and missing data at random

Mustapha Rachdi Ali Laksaci Zoulikha Kaid Abbassia Benchiha Fahimah A. Al‐Awadhi 《Statistica Neerlandica》2021,75(1):42-65

We combine the k‐Nearest Neighbors (kNN) method with the local linear estimation (LLE) approach to construct a new estimator (LLE‐kNN) of the regression operator when the regressor is of functional type and the response variable is a scalar but observed with some missing at random (MAR) observations. The resulting estimator inherits many of the advantages of both approaches (kNN and LLE methods). This is confirmed by the established asymptotic results, in terms of the pointwise and uniform almost complete consistencies, and the precise convergence rates. In addition, a numerical study (i) on simulated data, then (ii) on a real dataset concerning sugar quality using fluorescence data, was conducted. This practical study clearly shows the feasibility and the superiority of the LLE‐kNN estimator compared to competing estimators.

8.

The development of delinquency during adolescence: a comparison of missing data techniques

Jost Reinecke Cornelia Weins 《Quality and Quantity》2013,47(6):3319-3334

Conclusions on the development of delinquent behaviour during the life-course can only be drawn from longitudinal data, which are usually obtained by repeated interviews of the same respondents. Missing data are a problem for the analysis of delinquent behaviour during the life-course, as shown with data from a four-wave panel of adolescents. In this article two alternative techniques to cope with missing data are used: full information maximum likelihood estimation and multiple imputation. Both methods allow one to consider all available data (including adolescents with missing information on some variables) in order to estimate the development of delinquency. We demonstrate that self-reported delinquency is systematically underestimated with listwise deletion (LD) of missing data. Further, LD results in false conclusions on gender and school specific differences of the age–crime relationship. In the final discussion some hints are given for further methods to deal with bias in panel data affected by the missing process.

9.

Longitudinal LISREL model estimation from incomplete panel data using the EM algorithm and the Kalman smoother

R. A. R. G. Jansen J. H. L. Oud 《Statistica Neerlandica》1995,49(3):362-377

Longitudinal data sets with the structure T (time points) × N (subjects) are often incomplete because of data missing for certain subjects at certain time points. The EM algorithm is applied in conjunction with the Kalman smoother for computing maximum likelihood estimates of longitudinal LISREL models from varying missing data patterns. The iterative procedure uses the LISREL program in the M-step and the Kalman smoother in the E-step. The application of the method is illustrated by simulating missing data on a data set from educational research.

10.

Semiparametric estimation of logistic regression model with missing covariates and outcome

Shen-Ming Lee Chin-Shang Li Shu-Hui Hsieh Li-Hui Huang 《Metrika》2012,75(5):621-653

We consider a semiparametric method to estimate logistic regression models in which both covariates and the outcome variable may be missing, and propose two new estimators. The first, which is based solely on the validation set, is an extension of the validation likelihood estimator of Breslow and Cain (Biometrika 75:11–20, 1988). The second is a joint conditional likelihood estimator based on the validation and non-validation data sets. Both estimators are semiparametric as they require neither model assumptions regarding the missing data mechanism nor the specification of the conditional distribution of the missing covariates given the observed covariates. The asymptotic distribution theory is developed under the assumption that all covariate variables are categorical. The finite-sample properties of the proposed estimators are investigated through simulation studies showing that the joint conditional likelihood estimator is the most efficient. A cable TV survey data set from Taiwan is used to illustrate the practical use of the proposed methodology.

11.

Bootstrap J-Test for Panel Data Models with Spatially Dependent Error Components, a Spatial Lag and Additional Endogenous Variables

Bernard Fingleton 《Spatial Economic Analysis》2016,11(1):7-26

We develop a bootstrap J-test method for testing a panel model against one non-nested alternative when the competing specifications are estimated by Feasible Generalised Spatial Two Stage Least Squares/Generalised Method of Moments (FGS2SLS/GMM). Both models incorporate spatially correlated error components, thus accounting for spatial heterogeneity via random effects, and accommodate endogenous regressors other than the spatially lagged dependent variable. The proposed scheme is applied to a testing problem involving non-nested wage equations as motivated by the Wage Curve literature and the New Economic Geography theory. Results show that our bootstrap test is a reliable and effective procedure for correcting asymptotic reference critical values and distinguishing between the two rival hypotheses.

12.

An algorithm for panel ANOVA with grouped data

Carmen Anido Carlos Rivero Teofilo Valdes 《Metrika》2011,74(1):85-107

In this paper, we present an algorithm suitable for analysing the variance of panel data when some observations are either given in grouped form or are missing. The analysis is carried out from the perspective of ANOVA panel data models with general errors. The classification intervals of the grouped observations may vary from one to another, thus the missing observations are in fact a particular case of grouping. The proposed algorithm (1) estimates the parameters of the panel data models; (2) evaluates the covariance matrices of the asymptotic distribution of the time-dependent parameters assuming that the number of time periods, T, is fixed and the number of individuals, N, tends to infinity and, similarly, of the individual parameters when T → ∞ and N is fixed; and, finally, (3) uses these asymptotic covariance matrix estimations to analyse the variance of the panel data.

13.

Imputation of Missing Item Responses: Some Simple Techniques

Huisman Mark 《Quality and Quantity》2000,34(4):331-351

Among the wide variety of procedures to handle missing data, imputing the missing values is a popular strategy to deal with missing item responses. In this paper some simple and easily implemented imputation techniques like item and person mean substitution, and some hot-deck procedures, are investigated. A simulation study was performed based on responses to items forming a scale to measure a latent trait of the respondents. The effects of different imputation procedures on the estimation of the latent ability of the respondents were investigated, as well as the effect on the estimation of Cronbach's alpha (indicating the reliability of the test) and Loevinger's H-coefficient (indicating scalability). The results indicate that procedures which use the relationships between items perform best, although they tend to overestimate the scale quality.
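
Person and item mean substitution, two of the simple techniques named above, can be sketched as follows. This is a minimal illustration and not the paper's implementation: the function names and the toy response matrix are hypothetical, rows are respondents, columns are items, and a missing response is encoded as None.

```python
from statistics import mean


def person_mean_substitution(responses):
    """Replace each missing item (None) with the mean of the same
    respondent's observed items."""
    imputed = []
    for row in responses:
        observed = [x for x in row if x is not None]
        fill = mean(observed) if observed else None  # nothing to average
        imputed.append([fill if x is None else x for x in row])
    return imputed


def item_mean_substitution(responses):
    """Replace each missing item (None) with the mean of that item over
    all respondents who answered it."""
    n_items = len(responses[0])
    item_means = []
    for j in range(n_items):
        observed = [row[j] for row in responses if row[j] is not None]
        item_means.append(mean(observed) if observed else None)
    return [
        [item_means[j] if x is None else x for j, x in enumerate(row)]
        for row in responses
    ]


# Hypothetical toy data: three respondents, three items.
toy_responses = [
    [1, 2, None],
    [3, None, 5],
    [2, 4, 3],
]

print(person_mean_substitution(toy_responses))
print(item_mean_substitution(toy_responses))
```

Neither variant uses the relationships between items, which the abstract reports as the feature of the better-performing procedures.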

14.

Robust analysis of longitudinal data with nonignorable missing responses

Sanjoy K. Sinha 《Metrika》2012,75(7):913-938

We encounter missing data in many longitudinal studies. When the missing data are nonignorable, it is important to analyze the data by incorporating the missing data mechanism into the observed data likelihood function. The classical maximum likelihood (ML) method for analyzing longitudinal missing data has been extensively studied in the literature. However, it is well-known that the ordinary ML estimators are sensitive to extreme observations or outliers in the data. In this paper, we propose and explore a robust method, which is developed in the framework of the ML method, and is useful for downweighting any influential observations in the data when estimating the model parameters. We study the empirical properties of the robust estimators in small simulations. We also illustrate the robust method using incomplete longitudinal data on CD4 counts from clinical trials of HIV-infected patients.

15.

Estimating efficiencies from frontier models with panel data: A comparison of parametric, non-parametric and semi-parametric methods with bootstrapping

Léopold Simar 《Journal of Productivity Analysis》1992,3(1-2):171-203

The aim of this article is first to review how the standard econometric methods for panel data may be adapted to the problem of estimating frontier models and (in)efficiencies. The aim is to clarify the difference between the fixed and random effect model and to stress the advantages of the latter. Then a semi-parametric method is proposed (using a non-parametric method as a first step), the message being that it is an appealing method for estimating frontier models and (in)efficiencies with panel data. Since analytic sampling distributions of efficiencies are not available, a bootstrap method is presented in this framework. This provides a tool for assessing the statistical significance of the obtained estimators. All the methods are illustrated on the problem of estimating the inefficiencies of 19 railway companies observed over a period of 14 years (1970–1983). Article presented at the ORSA/TIMS joint national meeting, Productivity and Global Competition, Philadelphia, October 29–31, 1990. An earlier version of the paper was presented at the European Workshop on Efficiency and Productivity Measurement in the Service Industries held at CORE, October 20–21, 1989. Helpful comments of Jacques Mairesse, Benoît Mulkay, Sergio Perelman, Michel Mouchart, Shawna Grosskopf and Rolf Färe, at various stages of the paper, are gratefully acknowledged.

16.

An empirical likelihood method for spatial regression

Daniel J. Nordman 《Metrika》2008,68(3):351-363

Properties of a “blockwise” empirical likelihood for spatial regression with non-stochastic regressors are investigated for spatial data on a lattice. The method enables nonparametric confidence regions for spatial trend parameters to be calibrated, even though non-random regressors introduce non-stationary forms of spatial dependence into the “blockwise” construction. Additionally, the regression results are valid in a general framework allowing for a variety of behavior in regressor variables as well as the underlying spatial error process. The same regression method also applies when the regressors are stochastic.

17.

Handling Missing Data by Re-approaching Non-respondents

Huisman Mark Krol Boudien Van Sonderen Eric 《Quality and Quantity》1998,32(1):77-91

When handling missing data, a researcher should be aware of the mechanism underlying the missingness. In the presence of non-randomly missing data, a model of the missing data mechanism should be included in the analyses to prevent the analyses based on the data from becoming biased. Modeling the missing data mechanism, however, is a difficult task. One way in which knowledge about the missing data mechanism may be obtained is by collecting additional data from non-respondents. In this paper the method of re-approaching respondents who did not answer all questions of a questionnaire is described. New answers were obtained from a sample of these non-respondents and the reasons for skipping questions were probed. The additional data resulted in a larger sample and were used to investigate the differences between respondents and non-respondents, whereas probing for the causes of missingness resulted in more knowledge about the nature of the missing data patterns.

18.

Taking ‘Don’t Knows’ as Valid Responses: A Multiple Complete Random Imputation of Missing Data

Martin Kroh 《Quality and Quantity》2006,40(2):225-244

Incomplete data is a common problem of survey research. Recent work on multiple imputation techniques has increased analysts’ awareness of the biasing effects of missing data and has also provided a convenient solution. Imputation methods replace non-response with estimates of the unobserved scores. In many instances, however, non-response to a stimulus does not result from measurement problems that inhibit accurate surveying of empirical reality, but from the inapplicability of the survey question. In such cases, existing imputation techniques replace valid non-response with counterfactual estimates of a situation in which the stimulus is applicable to all respondents. This paper suggests an alternative imputation procedure for incomplete data for which no true score exists: multiple complete random imputation, which overcomes the biasing effects of missing data and allows analysts to model respondents’ valid ‘I don’t know’ answers.

19.

Missing Data: A Unified Taxonomy Guided by Conditional Independence

Marco Doretti Sara Geneletti Elena Stanghellini 《Revue internationale de statistique》2018,86(2):189-204

Recent work (Seaman et al., 2013; Mealli & Rubin, 2015) attempts to clarify the not always well‐understood difference between realised and everywhere definitions of missing at random (MAR) and missing completely at random. Another branch of the literature (Mohan et al., 2013; Pearl & Mohan, 2013) exploits always‐observed covariates to give variable‐based definitions of MAR and missing completely at random. In this paper, we develop a unified taxonomy encompassing all approaches. In this taxonomy, the new concept of ‘complementary MAR’ is introduced, and its relationship with the concept of data observed at random is discussed. All relationships among these definitions are analysed and represented graphically. Conditional independence, both at the random variable and at the event level, is the formal language we adopt to connect all these definitions. Our paper covers both the univariate and the multivariate case, where attention is paid to monotone missingness and to the concept of sequential MAR. Specifically, for monotone missingness, we propose a sequential MAR definition that might be more appropriate than both everywhere and variable‐based MAR to model dropout in certain contexts.

20.

Hedonic Housing Prices in Paris: An Unbalanced Spatial Lag Pseudo‐Panel Model with Nested Random Effects

Badi H. Baltagi Georges Bresson Jean‐Michel Etienne 《Journal of Applied Econometrics》2015,30(3):509-528

This paper estimates a hedonic housing model based on flats sold in the city of Paris over the period 1990–2003. This is done using maximum likelihood estimation, taking into account the nested structure of the data. Paris is historically divided into 20 arrondissements, each divided into four quartiers (quarters), which in turn contain between 15 and 169 blocks (îlot, in French) per quartier. This is an unbalanced pseudo‐panel data set containing 156,896 transactions. Despite the richness of the data, many neighborhood characteristics are not observed, and we attempt to capture these neighborhood spillover effects using a spatial lag model. Using likelihood ratio tests, we find significant spatial lag effects as well as significant nested random error effects. The empirical results show that the hedonic housing estimates and the corresponding marginal effects are affected by taking into account the nested aspects of the Paris housing data as well as the spatial neighborhood effects. Copyright © 2014 John Wiley & Sons, Ltd.