Similar Literature
 Found 20 similar documents (search time: 156 ms)
1.
Multiple imputation has come to be viewed as a general solution to missing data problems in statistics. However, in order to lead to consistent asymptotically normal estimators, correct variance estimators and valid tests, the imputations must be proper. So far it seems that only Bayesian multiple imputation, i.e. using a Bayesian predictive distribution to generate the imputations, or approximately Bayesian multiple imputation has been shown to lead to proper imputations in some settings. In this paper, we shall see that Bayesian multiple imputation does not generally lead to proper multiple imputations. Furthermore, it will be argued that for general statistical use, Bayesian multiple imputation is inefficient even when it is proper.

2.
In this review paper, we discuss the theoretical background of multiple imputation, describe how to build an imputation model and how to create proper imputations. We also present the rules for making repeated imputation inferences. Three widely used multiple imputation methods, the propensity score method, the predictive model method and the Markov chain Monte Carlo (MCMC) method, are presented and discussed.
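The "rules for making repeated imputation inferences" mentioned in this abstract are commonly given as Rubin's combining rules. The sketch below is not taken from the paper; it is a minimal illustration that assumes each of the m completed data sets has already produced a scalar point estimate and its variance. The function name `pool_rubin` is hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m completed-data estimates and variances (hypothetical helper)."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)          # pooled point estimate
    u_bar = mean(variances)          # within-imputation variance
    b = variance(estimates)          # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b  # total variance of the pooled estimate
    return q_bar, total


# Three imputations with made-up estimates and variances.
q_bar, total = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The factor (1 + 1/m) inflates the between-imputation variance to account for using a finite number of imputations.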

3.
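This paper analyzes the statistical properties of linear model coefficient estimators obtained with the EMB multiple imputation method and compares them with the PMM multiple imputation method and the DA imputation method. Simulation results show that, as the nonresponse rate increases, the absolute bias and the mean squared error of the coefficient estimators tend to increase, and the increase in the estimated variance is relatively more pronounced. Under a missing-completely-at-random or missing-at-random nonresponse mechanism, the recommended number of imputations is 15. Under a not-at-random nonresponse mechanism that depends on the response variable, the number of imputations may be increased appropriately. Under a not-at-random nonresponse mechanism that depends on other variables, the mean squared error and the estimated variance of the estimators differ greatly, so the EMB multiple imputation method should be used with caution.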

4.
Imputation: Methods, Simulation Experiments and Practical Examples   Total citations: 1 (self-citations: 0, citations by others: 1)
When conducting surveys, two kinds of nonresponse may cause incomplete data files: unit nonresponse (complete nonresponse) and item nonresponse (partial nonresponse). The selectivity of the unit nonresponse is often corrected for. Various imputation techniques can be used for the missing values because of item nonresponse. Several of these imputation techniques are discussed in this report. One is the hot deck imputation. This paper describes two simulation experiments of the hot deck method. In the first study, data are randomly generated, and various percentages of missing values are then non-randomly 'added' to the data. The hot deck method is used to reconstruct the data in this Monte Carlo experiment. The performance of the method is evaluated for the means, standard deviations, and correlation coefficients and compared with the available case method. In the second study, the quality of an imputation method is studied by running a simulation experiment. A selection of the data of the Dutch Housing Demand Survey is perturbed by leaving out specific values on a variable. Again hot deck imputations are used to reconstruct the data. The imputations are then compared with the true values. In both experiments the conclusion is that the hot deck method generally performs better than the available case method. This paper also deals with the questions of which variables should be imputed and how long the imputation process takes. Finally the theory is illustrated by the imputation approaches of the Dutch Housing Demand Survey, the European Community Household Panel Survey (ECHP) and the new Dutch Structure of Earnings Survey (SES). These examples illustrate the levels of missing data that can be experienced in such surveys and the practical problems associated with choosing an appropriate imputation strategy for key items from each survey.

5.
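A Brief Discussion of the Causes of Construction Claims in China and the Attribution of Liability   Total citations: 1 (self-citations: 1, citations by others: 0)
Liu Xi (刘玺), Value Engineering (《价值工程》), 2010, 29(13): 97-97
This paper presents practical experience with the causes of construction claims in building projects and the attribution of liability for them, and is intended as a reference on both questions.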

6.
Incomplete data is a common problem of survey research. Recent work on multiple imputation techniques has increased analysts’ awareness of the biasing effects of missing data and has also provided a convenient solution. Imputation methods replace non-response with estimates of the unobserved scores. In many instances, however, non-response to a stimulus does not result from measurement problems that inhibit accurate surveying of empirical reality, but from the inapplicability of the survey question. In such cases, existing imputation techniques replace valid non-response with counterfactual estimates of a situation in which the stimulus is applicable to all respondents. This paper suggests an alternative imputation procedure for incomplete data for which no true score exists: multiple complete random imputation, which overcomes the biasing effects of missing data and allows analysts to model respondents’ valid ‘I don’t know’ answers.

7.
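When a linear regression model is built from economic data containing nonresponse, the PMM multiple imputation method is used to impute the missing values. Simulation results show that, under any nonresponse mechanism, the bias and mean squared error of the coefficient estimators do not decrease significantly as the number of imputations increases. For any nonresponse rate, the recommended number of imputations is 5. Under a missing-completely-at-random nonresponse mechanism, the increase in the bias or mean squared error of the coefficient estimators as the nonresponse rate rises is often not significant. Under a missing-at-random or not-at-random nonresponse mechanism, however, the increase in the bias and mean squared error as the nonresponse rate rises is often significant.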

8.
Hot deck imputation is a method for handling missing data in which each missing value is replaced with an observed response from a "similar" unit. Despite being used extensively in practice, the theory is not as well developed as that of other imputation methods. We have found that no consensus exists as to the best way to apply the hot deck and obtain inferences from the completed data set. Here we review different forms of the hot deck and existing research on its statistical properties. We describe applications of the hot deck currently in use, including the U.S. Census Bureau's hot deck for the Current Population Survey (CPS). We also provide an extended example of variations of the hot deck applied to the third National Health and Nutrition Examination Survey (NHANES III). Some potential areas for future research are highlighted.
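The definition at the start of this abstract can be illustrated with one simple variant, a random hot deck within adjustment classes, where "similar" is taken to mean "shares the same class value". This is only a sketch under that assumption and does not reproduce any of the procedures reviewed in the paper; the function `hot_deck` and the field names are hypothetical.

```python
import random


def hot_deck(records, target, class_key, rng=None):
    """Fill missing `target` values (None) with a random donor from the same class."""
    rng = rng or random.Random()
    # Collect observed values of the target variable for each class.
    donors = {}
    for rec in records:
        if rec[target] is not None:
            donors.setdefault(rec[class_key], []).append(rec[target])
    completed = []
    for rec in records:
        filled = dict(rec)  # copy so the input records are left unchanged
        pool = donors.get(filled[class_key])
        if filled[target] is None and pool:
            filled[target] = rng.choice(pool)
        completed.append(filled)
    return completed


# Hypothetical survey records; income is missing for some units.
survey = [
    {"region": "A", "income": 10},
    {"region": "A", "income": None},
    {"region": "B", "income": 20},
    {"region": "B", "income": None},
    {"region": "C", "income": None},  # no donor in class C, so it stays missing
]
completed = hot_deck(survey, "income", "region", rng=random.Random(0))
```

A recipient whose class contains no donor is left missing here; in practice classes would be collapsed or another fallback chosen.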

9.
This paper explores the problem of the construction of imputation classes using the score method, sometimes called predictive mean stratification or response propensity stratification, depending on the context. This method was studied in Thomsen (1973), Little (1986) and Eltinge & Yansaneh (1997). We use a different framework to evaluate the properties of the resulting imputed estimator of a population mean. In our framework, we condition on the realized sample. This enables us to considerably simplify our theoretical developments in the frequent situation where the boundaries and the number of classes are sample‐dependent. We find that the key factor for reducing the non‐response bias is to form classes homogeneous with respect to the response probabilities and/or the conditional expectation of the variable of interest. In the latter case, the non‐response/imputation variance is also reduced. Finally, we performed a simulation study to fully evaluate various versions of the score method and to compare them with a cross‐classification method, which is frequently used in practice. The results showed the superiority of the score method in general.

10.
Imputation procedures such as fully efficient fractional imputation (FEFI) or multiple imputation (MI) create multiple versions of the missing observations, thereby reflecting uncertainty about their true values. Multiple imputation generates a finite set of imputations through a posterior predictive distribution. Fractional imputation assigns weights to the observed data. The focus of this article is the development of FEFI for partially classified two-way contingency tables. Point estimators and variances of FEFI estimators of population proportions are derived. Simulation results, when data are missing completely at random or missing at random, show that FEFI is comparable in performance to maximum likelihood estimation and multiple imputation and superior to simple stochastic imputation and complete case analysis. Methods are illustrated with four data sets.

11.
In many surveys, imputation procedures are used to account for non‐response bias induced by either unit non‐response or item non‐response. Such procedures are optimised (in terms of reducing non‐response bias) when the models include covariates that are highly predictive of both response and outcome variables. To achieve this, we propose a method for selecting sets of covariates used in regression imputation models or to determine imputation cells for one or more outcome variables, using the fraction of missing information (FMI) as obtained via a proxy pattern‐mixture (PPM) model as the key metric. In our variable selection approach, we use the PPM model to obtain a maximum likelihood estimate of the FMI for separate sets of candidate imputation models and look for the point at which changes in the FMI level off and further auxiliary variables do not improve the imputation model. We illustrate our proposed approach using empirical data from the Ohio Medicaid Assessment Survey and from the Service Annual Survey.

12.
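Li Peng (李鹏), Logistics Sci-Tech (《物流科技》), 2012, (1): 40-45
At the end of the 20th century, third-party logistics developed rapidly and gradually moved from traditional third-party logistics towards modern integrated third-party logistics. Correspondingly, third-party logistics contracts have gradually shifted from traditional transport and warehousing contracts to type-combined third-party logistics contracts. A type-combined third-party logistics contract is an innominate contract, and the bases for adjudicating it include the third-party logistics contract concluded between the parties itself (including supplementary agreements, systematic interpretation and trade usage) and civil law norms. From the perspective of the contract concluded between the parties, liability for breach mostly follows the strict liability principle. From the perspective of civil law norms, the principle for attributing liability for breach depends on whether the segment in which the cargo damage occurred can be identified in the individual case: either the strict liability principle of the General Provisions of the Contract Law (《合同法》) applies, or the attribution principle of the law governing the segment in which the loss is identified applies.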

13.
Huisman, Mark, Quality and Quantity, 2000, 34(4): 331-351
Among the wide variety of procedures to handle missing data, imputing the missing values is a popular strategy to deal with missing item responses. In this paper some simple and easily implemented imputation techniques like item and person mean substitution, and some hot-deck procedures, are investigated. A simulation study was performed based on responses to items forming a scale to measure a latent trait of the respondents. The effects of different imputation procedures on the estimation of the latent ability of the respondents were investigated, as well as the effect on the estimation of Cronbach's alpha (indicating the reliability of the test) and Loevinger's H-coefficient (indicating scalability). The results indicate that procedures which use the relationships between items perform best, although they tend to overestimate the scale quality.
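The two simplest techniques named in this abstract, person mean and item mean substitution, can be sketched as follows. The code is illustrative only and is not the paper's implementation; it assumes item scores are stored as a list of rows (one per respondent) with `None` marking a missing response, and both function names are hypothetical.

```python
def person_mean_substitution(responses):
    """Replace each missing score with the respondent's mean over answered items."""
    completed = []
    for row in responses:
        observed = [x for x in row if x is not None]
        fill = sum(observed) / len(observed) if observed else None
        completed.append([fill if x is None else x for x in row])
    return completed


def item_mean_substitution(responses):
    """Replace each missing score with the item's mean over responding persons."""
    n_items = len(responses[0])
    item_means = []
    for j in range(n_items):
        observed = [row[j] for row in responses if row[j] is not None]
        item_means.append(sum(observed) / len(observed) if observed else None)
    return [
        [item_means[j] if x is None else x for j, x in enumerate(row)]
        for row in responses
    ]


# Two respondents, three items; made-up scores.
scores = [[1, None, 3], [2, 4, None]]
by_person = person_mean_substitution(scores)
by_item = item_mean_substitution(scores)
```

Person mean substitution borrows information across items within a respondent, whereas item mean substitution ignores the respondent's other answers.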

14.
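Wang Qingsong (王清松), Value Engineering (《价值工程》), 2014, (27): 323-324
Drawing on legal theory and practical experience in quality inspection, the author analyzes the responsible parties at every stage of a product's life cycle in society, examines the rights and obligations (liabilities) of those parties, and offers a preliminary analysis of several issues concerning the principles for attributing product quality liability, in the hope of contributing to research on product quality laws and regulations and to legislative work.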

15.
In data integration contexts, two statistical agencies seek to merge their separate databases into one file. The agencies also may seek to disseminate data to the public based on the integrated file. These goals may be complicated by the agencies' need to protect the confidentiality of database subjects, which could be at risk during the integration or dissemination stage. This article proposes several approaches based on multiple imputation for disclosure limitation, usually called synthetic data, that could be used to facilitate data integration and dissemination while protecting data confidentiality. It reviews existing methods for obtaining inferences from synthetic data and points out where new methods are needed to implement the data integration proposals.

16.
Empirical count data are often zero‐inflated and overdispersed. Currently, there is no software package that allows adequate imputation of these data. We present multiple‐imputation routines for these kinds of count data based on a Bayesian regression approach or alternatively based on a bootstrap approach that work as add‐ons for the popular multiple imputation by chained equations (mice) software in R (van Buuren and Groothuis‐Oudshoorn, Journal of Statistical Software, vol. 45, 2011, p. 1). We demonstrate in a Monte Carlo simulation that our procedures are superior to currently available count data procedures. It is emphasized that thorough modeling is essential to obtain plausible imputations and that model mis‐specifications can bias parameter estimates and standard errors quite noticeably. Finally, the strengths and limitations of our procedures are discussed, and fruitful avenues for future theory and software development are outlined.

17.
In this paper, the two-step generalized estimating equations (GEE) approach developed by Wang and Fitzmaurice (Biom J 2:302–318, 2006) is employed to handle income non-responses in the Panel Study of Family Dynamics survey conducted in Taiwan. In our analysis, we first construct a conditional logit model of the paid work equation by taking the missing patterns into account. We then use the estimation results to impute whether or not the nonrespondents were working for pay. For those who were imputed or observed to work for pay, we adopt the two-step GEE method to estimate the income equation. Compared to simply deleting the missing cases, the two-step imputation procedure is found to improve the estimation results.

18.
By closely examining the examples provided in Nielsen (2003), this paper further explores the relationship between self-efficiency (Meng, 1994) and the validity of Rubin's multiple imputation (RMI) variance combining rule. The RMI variance combining rule is based on the common assumption/intuition that the efficiency of our estimators decreases when we have less data. However, there are estimation procedures that will do the opposite, that is, they can produce more efficient estimators with less data. Self-efficiency is a theoretical formulation for excluding such procedures. When a user, typically unaware of the hidden self-inefficiency of his choice, adopts a self-inefficient complete-data estimation procedure to conduct an RMI inference, the theoretical validity of his inference becomes a complex issue, as we demonstrate. We also propose a diagnostic tool for assessing potential self-inefficiency and the bias in the RMI variance estimator, at the outset of RMI inference, by constructing a convenient proxy to the RMI point estimator.

19.
This paper considers three ratio estimators of the population mean using a known correlation coefficient between the study and auxiliary variables in simple random sampling when some sample observations are missing. The suggested estimators are compared with the estimators of Singh and Horn (Metrika 51:267–276, 2000), Singh and Deo (Stat Pap 44:555–579, 2003) and Kadilar and Cingi (Commun Stat Theory Methods 37:2226–2236, 2008). They are also compared with other imputation estimators based on the mean or a ratio. It is found that the suggested estimators are approximately unbiased for the population mean. Also, it turns out that the suggested estimators perform well when compared with the other estimators considered in this study.

20.
The missing data problem has been widely addressed in the literature. The traditional methods for handling missing data may not be suited to spatial data, which can exhibit distinctive structures of dependence and/or heterogeneity. As a possible solution to the spatial missing data problem, this paper proposes an approach that combines the Bayesian Interpolation method [Benedetti, R. & Palma, D. (1994) Markov random field-based image subsampling method, Journal of Applied Statistics, 21(5), 495–509] with a multiple imputation procedure. The method is developed in a univariate and a multivariate framework, and its performance is evaluated through an empirical illustration based on data related to labour productivity in European regions.
