1.
The missing data problem has been widely addressed in the literature. The traditional methods for handling missing data may not be suited to spatial data, which can exhibit distinctive structures of dependence and/or heterogeneity. As a possible solution to the spatial missing data problem, this paper proposes an approach that combines the Bayesian Interpolation method [Benedetti, R. & Palma, D. (1994) Markov random field-based image subsampling method, Journal of Applied Statistics, 21(5), 495–509] with a multiple imputation procedure. The method is developed in a univariate and a multivariate framework, and its performance is evaluated through an empirical illustration based on data related to labour productivity in European regions.
2.
Jaap P.L. Brand, Stef van Buuren, Karin Groothuis-Oudshoorn, Edzard S. Gelsema† 《Statistica Neerlandica》2003,57(1):36-45
This paper outlines a strategy to validate multiple imputation methods. Rubin's criteria for proper multiple imputation are the point of departure. We describe a simulation method that yields insight into various aspects of bias and efficiency of the imputation process. We propose a new method for creating incomplete data under a general Missing At Random (MAR) mechanism. Software implementing the validation strategy is available as a SAS/IML module. The method is applied to investigate the behavior of polytomous regression imputation for categorical data.
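To illustrate what creating incomplete data under a MAR mechanism can look like, here is a minimal Python sketch. It is not the method proposed in the paper, whose details the abstract does not give. It assumes a simple logistic missingness model in which the chance that `y` is deleted depends only on a fully observed covariate `x`; the function name `make_mar` and the parameters `intercept` and `slope` are hypothetical.

```python
import math
import random


def _logistic(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_mar(rows, seed=0, intercept=-1.0, slope=1.0):
    """Delete y with a probability that depends only on the observed x.

    rows is a list of (x, y) pairs; x is never deleted, so the
    missingness of y is MAR given x. The logistic form and its
    parameters are assumptions made for this illustration.
    """
    rng = random.Random(seed)
    incomplete = []
    for x, y in rows:
        p_missing = _logistic(intercept + slope * x)
        # None marks a deleted (missing) value of y.
        incomplete.append((x, None if rng.random() < p_missing else y))
    return incomplete
```

In a validation study of the kind described, one would apply such a function to complete simulated data, impute the deleted values, and compare the resulting estimates with those from the complete data.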
3.
This paper discusses the importance of managing data quality in academic research in its relation to satisfying the customer. The focus is on the data completeness objective/dimension of data quality in relation to recent advancements which have been made in the development of methods for analysing incomplete multivariate data. An overview and comparison of the traditional techniques with the recent advancements are provided. Multiple imputation is also discussed as a method of analysing incomplete multivariate data, which can potentially reduce some of the biases which can occur from using some of the traditional techniques. Despite these recent advancements in the analysis of incomplete multivariate data, evidence is presented which shows that researchers are not using these techniques to manage the data quality of their current research across a variety of academic disciplines. An analysis is then provided as to why these techniques have not been adopted along with suggestions to improve the frequency of their use in the future.
Source-Reference. The ideas for this paper originated from research work on David J. Fogarty's Ph.D. dissertation. The subject area is the use of advanced techniques for the imputation of incomplete multivariate data on corporate data warehouses.
4.
Hot deck imputation is a method for handling missing data in which each missing value is replaced with an observed response from a similar unit. Despite being used extensively in practice, the theory is not as well developed as that of other imputation methods. We have found that no consensus exists as to the best way to apply the hot deck and obtain inferences from the completed data set. Here we review different forms of the hot deck and existing research on its statistical properties. We describe applications of the hot deck currently in use, including the U.S. Census Bureau's hot deck for the Current Population Survey (CPS). We also provide an extended example of variations of the hot deck applied to the third National Health and Nutrition Examination Survey (NHANES III). Some potential areas for future research are highlighted.
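The basic idea of replacing a missing value with an observed response from a similar unit can be sketched as follows. This is only one simple variant, a random hot deck within adjustment cells, and it is not claimed to match the Census Bureau or NHANES III procedures mentioned in the abstract. The function name `hot_deck_impute` and the field names used in the usage example are hypothetical.

```python
import random
from collections import defaultdict


def hot_deck_impute(records, target, cell_key, seed=0):
    """Random hot deck within adjustment cells (illustrative sketch).

    records  : list of dicts; a missing value is represented by None
    target   : name of the field to impute
    cell_key : name of the field that defines which units are "similar"
    """
    rng = random.Random(seed)

    # Pool the observed (donor) values by adjustment cell.
    donors = defaultdict(list)
    for rec in records:
        if rec[target] is not None:
            donors[rec[cell_key]].append(rec[target])

    completed = []
    for rec in records:
        rec = dict(rec)  # copy, so the input list is left unchanged
        pool = donors.get(rec[cell_key], [])
        if rec[target] is None and pool:
            # Fill the gap with a randomly drawn donor from the same cell.
            rec[target] = rng.choice(pool)
        completed.append(rec)
    return completed
```

For example, `hot_deck_impute(data, "income", "region")` would fill each missing `income` with an observed income from the same `region`. Units whose cell contains no donor are left missing in this sketch; a practical implementation would need a fallback, such as collapsing cells.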
5.
In missing data problems, it is often the case that there is a natural test statistic for testing a statistical hypothesis had all the data been observed. A fuzzy p-value approach to hypothesis testing has recently been proposed which is implemented by imputing the missing values in the complete data test statistic by values simulated from the conditional null distribution given the observed data. We argue that imputing data in this way will inevitably lead to loss in power. For the case of a scalar parameter, we show that the asymptotic efficiency of the score test based on the imputed complete data relative to the score test based on the observed data is given by the ratio of the observed data information to the complete data information. Three examples involving probit regression, normal random effects model, and unidentified paired data are used for illustration. For testing linkage disequilibrium based on pooled genotype data, simulation results show that the imputed Neyman-Pearson and Fisher exact tests are less powerful than a Wald-type test based on the observed data maximum likelihood estimator. In conclusion, we caution against the routine use of the fuzzy p-value approach in latent variable or missing data problems and suggest some viable alternatives.
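The efficiency result stated in the abstract can be written compactly as below. The symbols are assumed notation rather than the authors' own: $I_{\mathrm{obs}}$ and $I_{\mathrm{com}}$ denote the observed data and complete data information for the scalar parameter $\theta$, and $I_{\mathrm{mis}}$ the information lost to missingness. The decomposition on the right is the standard missing information principle, added here only to show why the ratio cannot exceed one.

```latex
\mathrm{ARE}
  = \frac{I_{\mathrm{obs}}(\theta)}{I_{\mathrm{com}}(\theta)},
\qquad
I_{\mathrm{com}}(\theta) = I_{\mathrm{obs}}(\theta) + I_{\mathrm{mis}}(\theta),
\quad I_{\mathrm{mis}}(\theta) \ge 0
\;\Longrightarrow\;
0 \le \mathrm{ARE} \le 1 .
```

Under this reading, the more information is missing, the smaller the ratio, which is consistent with the loss of power the abstract describes.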
6.
Martin Kroh 《Quality and Quantity》2006,40(2):225-244
Incomplete data is a common problem of survey research. Recent work on multiple imputation techniques has increased analysts’ awareness of the biasing effects of missing data and has also provided a convenient solution. Imputation methods replace non-response with estimates of the unobserved scores. In many instances, however, non-response to a stimulus does not result from measurement problems that inhibit accurate surveying of empirical reality, but from the inapplicability of the survey question. In such cases, existing imputation techniques replace valid non-response with counterfactual estimates of a situation in which the stimulus is applicable to all respondents. This paper suggests an alternative imputation procedure for incomplete data for which no true score exists: multiple complete random imputation, which overcomes the biasing effects of missing data and allows analysts to model respondents’ valid ‘I don’t know’ answers.