Found 20 similar articles (search time: 0 ms)
1.
When handling missing data, a researcher should be aware of the mechanism underlying the missingness. In the presence of non-randomly missing data, a model of the missing data mechanism should be included in the analyses to prevent the analyses based on the data from becoming biased. Modeling the missing data mechanism, however, is a difficult task. One way in which knowledge about the missing data mechanism may be obtained is by collecting additional data from non-respondents. In this paper the method of re-approaching respondents who did not answer all questions of a questionnaire is described. New answers were obtained from a sample of these non-respondents and the reasons for skipping questions were probed. The additional data resulted in a larger sample and were used to investigate the differences between respondents and non-respondents, whereas probing for the causes of missingness resulted in more knowledge about the nature of the missing data patterns.
2.
One of the most difficult problems confronting investigators who analyze data from surveys is how to treat missing data. Many
statistical procedures cannot be used immediately if any values are missing. This paper considers the problem of estimating
the population mean using auxiliary information when some observations on the sample are missing and the population mean of
the auxiliary variable is not available. We use tools of classical statistical estimation theory to find a suitable estimator.
We study the model and design properties of the proposed estimator. We also report the results of a broad-based simulation
study of the efficiency of the estimator, which reveals very promising results.
3.
Since the work of Little and Rubin (1987), no substantial advances have been achieved in the analysis of explanatory regression models for incomplete data that are missing not at random, mainly due to the difficulty of verifying the randomness of the unknown data. In practice, nonrandom missing data are analyzed with techniques designed for datasets with random or completely random missing data, such as complete case analysis, mean imputation, regression imputation, maximum likelihood or multiple imputation. However, the data conditions required to minimize the bias derived from an incorrect analysis have not been fully determined. In the present work, several Monte Carlo simulations have been carried out to establish the best strategy of analysis for random missing data applicable in datasets with nonrandom missing data. The factors involved in the simulations are sample size, percentage of missing data, predictive power of the imputation model and existence of interaction between predictors. The results show that the smallest bias is obtained with maximum likelihood and multiple imputation techniques, although with low percentages of missing data, absence of interaction and high predictive power of the imputation model (frequent data structures in research on child and adolescent psychopathology) acceptable results are obtained with the simplest regression imputation.
4.
Martin Kroh 《Quality and Quantity》2006,40(2):225-244
Incomplete data is a common problem of survey research. Recent work on multiple imputation techniques has increased analysts’
awareness of the biasing effects of missing data and has also provided a convenient solution. Imputation methods replace non-response
with estimates of the unobserved scores. In many instances, however, non-response to a stimulus does not result from measurement
problems that inhibit accurate surveying of empirical reality, but from the inapplicability of the survey question. In such
cases, existing imputation techniques replace valid non-response with counterfactual estimates of a situation in which the
stimulus is applicable to all respondents. This paper suggests an alternative imputation procedure for incomplete data for
which no true score exists: multiple complete random imputation, which overcomes the biasing effects of missing data and allows
analysts to model respondents’ valid ‘I don’t know’ answers.
5.
Joseph L. Schafer 《Statistica Neerlandica》2003,57(1):19-35
Bayesian multiple imputation (MI) has become a highly useful paradigm for handling missing values in many settings. In this paper, I compare Bayesian MI with other methods, maximum likelihood in particular, and point out some of its unique features. One key aspect of MI, the separation of the imputation phase from the analysis phase, can be advantageous in settings where the models underlying the two phases do not agree.
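As a brief illustration of how the analysis phase of MI combines results from the imputation phase, the standard combining rules (Rubin's rules) can be written as below. This is a generic sketch rather than anything taken from the abstract: the symbols m (number of imputed datasets), \hat{Q}_i (the estimate from the i-th completed dataset) and U_i (its estimated variance) are notation assumed here.

```latex
\bar{Q} = \frac{1}{m}\sum_{i=1}^{m} \hat{Q}_i, \qquad
\bar{U} = \frac{1}{m}\sum_{i=1}^{m} U_i, \qquad
B = \frac{1}{m-1}\sum_{i=1}^{m} \bigl(\hat{Q}_i - \bar{Q}\bigr)^2, \qquad
T = \bar{U} + \Bigl(1 + \frac{1}{m}\Bigr) B
```

Here \bar{Q} is the pooled point estimate, \bar{U} the within-imputation variance, B the between-imputation variance, and T the total variance attached to \bar{Q}.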
6.
Among the wide variety of procedures to handle missing data, imputing the missing values is a popular strategy to deal with missing item responses. In this paper some simple and easily implemented imputation techniques like item and person mean substitution, and some hot-deck procedures, are investigated. A simulation study was performed based on responses to items forming a scale to measure a latent trait of the respondents. The effects of different imputation procedures on the estimation of the latent ability of the respondents were investigated, as well as the effect on the estimation of Cronbach's alpha (indicating the reliability of the test) and Loevinger's H-coefficient (indicating scalability). The results indicate that procedures which use the relationships between items perform best, although they tend to overestimate the scale quality.
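To make the two simplest techniques named in this abstract concrete, the following minimal sketch shows item mean and person mean substitution on a small response matrix. It is an illustration under assumptions, not the procedure used in the paper: the function names, the use of `None` to mark a missing response, and the toy data are all hypothetical.

```python
from statistics import mean

def item_mean_impute(rows):
    """Replace each missing response (None) with the mean of that item (column).

    Assumes every item has at least one observed response.
    """
    n_items = len(rows[0])
    item_means = [
        mean(r[j] for r in rows if r[j] is not None) for j in range(n_items)
    ]
    return [[item_means[j] if v is None else v for j, v in enumerate(r)]
            for r in rows]

def person_mean_impute(rows):
    """Replace each missing response (None) with the mean of that person's
    observed responses (row).

    Assumes every person answered at least one item.
    """
    filled = []
    for r in rows:
        person_mean = mean(v for v in r if v is not None)
        filled.append([person_mean if v is None else v for v in r])
    return filled

# Hypothetical toy data: 3 respondents (rows) by 3 items (columns).
responses = [
    [4.0, 5.0, None],
    [2.0, None, 3.0],
    [3.0, 4.0, 5.0],
]

item_filled = item_mean_impute(responses)
person_filled = person_mean_impute(responses)
```

Item mean substitution borrows information across respondents, while person mean substitution borrows across items for the same respondent; neither uses the relationships between items that the abstract reports as performing best.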
7.
This paper discusses the importance of managing data quality in academic research in its relation to satisfying the customer. The focus is on the data completeness objective dimension of data quality in relation to recent advancements which have been made in the development of methods for analysing incomplete multivariate data. An overview and comparison of the traditional techniques with the recent advancements are provided. Multiple imputation is also discussed as a method of analysing incomplete multivariate data, which can potentially reduce some of the biases which can occur from using some of the traditional techniques. Despite these recent advancements in the analysis of incomplete multivariate data, evidence is presented which shows that researchers are not using these techniques to manage the data quality of their current research across a variety of academic disciplines. An analysis is then provided as to why these techniques have not been adopted along with suggestions to improve the frequency of their use in the future.
Source-Reference. The ideas for this paper originated from research work on David J. Fogarty's Ph.D. dissertation. The subject area is the use of advanced techniques for the imputation of incomplete multivariate data on corporate data warehouses.
8.
Although item nonresponse can never be totally prevented, it can be considerably reduced, thereby providing the researcher not only with more usable data, but also with helpful auxiliary information for a better imputation and adjustment. To achieve this, an optimal data collection design is necessary. Optimizing the questionnaire and the survey design are the main tools a researcher has to reduce the number of missing data in any such survey. In this contribution a concise typology of missing data patterns and their sources of origin is presented. Based on this typology, the mechanisms responsible for missing data are identified, followed by a discussion on how item nonresponse can be prevented.
9.
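Drawing on a special maintenance project for milling and resurfacing an expressway, this paper examines quality testing of asphalt mixtures during construction, summarizes common quality defects in asphalt mixture production, and proposes improvements.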
10.
A Random Effects Transition Model For Longitudinal Binary Data With Informative Missingness (cited by: 1; self-citations: 0; citations by others: 1)
Understanding the transitions between disease states is often the goal in studying chronic disease. These studies, however, are typically subject to a large amount of missingness either due to patient dropout or intermittent missed visits. The missing data is often informative since missingness and dropout are usually related to either an individual's underlying disease process or the actual value of the missed observation. Our motivating example is a study of opiate addiction that examined the effect of a new treatment on thrice-weekly binary urine tests to assess opiate use over follow-up. The interest in this opiate addiction clinical trial was to characterize the transition pattern of opiate use (in each treatment arm) as well as to compare both the marginal probability of a positive urine test over follow-up and the time until the first positive urine test between the treatment arms. We develop a shared random effects model that links together the propensity of transition between states and the probability of either an intermittent missed observation or dropout. This approach allows for heterogeneous transition and missing data patterns between individuals as well as incorporating informative intermittent missing data and dropout. We compare this new approach with other approaches proposed for the analysis of longitudinal binary data with informative missingness.
11.
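Addressing the massive volume of criminal investigation archive data, this paper first analyzes the cross-platform, complex, and diverse nature of the data and, on that basis, designs conceptual, logical, and physical models for a criminal investigation data warehouse. It then applies data warehouse and data mining techniques to integrate and mine existing criminal investigation archive data, yielding a large body of useful knowledge that both advances criminal investigation research and offers substantial reference value for front-line investigative work. Finally, the paper presents a warehouse model for criminal investigation archive data and proposes data mining methods suited to its data mining system framework, in support of further online analytical processing of criminal investigation data, the mining of useful information, and public security prevention decision-making.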
12.
Arden Hall 《Journal of Housing Economics》2000,9(4):49
Burnout is a consequence of unobservable predictive variables. This paper describes a methodology for estimating mortgage prepayment models which corrects for burnout. The paper generalizes the approach of Deng, Quigley, and Van Order (Econometrica, 68, 275–307, 1998) and Stanton (Rev. Finan. Stud. 8, 677–708, 1995) in modeling the impact of unobservable variables as a probability distribution. The estimator is applied to a sample of loan histories and the results are compared to a conventional logit analysis of the data. Predictions and simulations from both models are compared to illustrate the properties of the new estimator.
13.
14.
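Historical experience shows that economic crises often give rise to new scientific and technological revolutions. Major breakthroughs and innovations in science and technology drive major adjustments in economic structure, provide new engines of growth, and restore the economy to balance at a higher level. New asphalt materials for road construction are likewise continually renewed under the lead of science and technology.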
15.
Tamás Rudas 《Metrika》1999,50(2):163-172
A measure of the fit of a statistical model can be obtained by estimating the relative size of the largest fraction of the population where a distribution belonging to the model may be valid. This is the mixture index of fit that was suggested for models for contingency tables by Rudas, Clogg, Lindsay (1994) and it is extended here for models involving continuous observations. In particular, the approach is applied to regression models with normal and uniform error structures. Best fit, as measured by the mixture index of fit, is obtained with minimax estimation of the regression parameters. Therefore, whenever minimax estimation is used for these problems, the mixture index of fit provides a natural approach for measuring model fit and for variable selection. Received: September 1997
16.
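Drawing on the application of warm-mix asphalt (WMA) in an actual project, this paper shows that, compared with hot-mix asphalt (HMA), WMA consumes less energy, emits less exhaust gas, improves the rutting resistance of the asphalt mixture, and extends the paving season. It can be mixed and compacted at lower temperatures, is easy to construct, and is particularly suited to urban road pavements.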
17.
Repeated measurements often are analyzed by multivariate analysis of variance (MANOVA). An alternative approach is provided by multilevel analysis, also called the hierarchical linear model (HLM), which makes use of random coefficient models. This paper is a tutorial which indicates that the HLM can be specified in many different ways, corresponding to different sets of assumptions about the covariance matrix of the
repeated measurements. The possible assumptions range from the very restrictive compound symmetry model to the unrestricted
multivariate model. Thus, the HLM can be used to steer a useful middle road between the two traditional methods for analyzing repeated measurements. Another
important advantage of the multilevel approach to analyzing repeated measures is the fact that it can be easily used also
if the data are incomplete. Thus it provides a way to achieve a fully multivariate analysis of repeated measures with incomplete
data.
This revised version was published online in June 2006 with corrections to the Cover Date.
18.
Repeated measures data can be modelled as a two-level model where occasions (level one units) are grouped by individuals (level two units). Goldstein et al. (1994) proposed a multilevel time series model when the response variable follows a Normal distribution and the measurements are taken with unequal time intervals. This paper extends the methodology to discrete response variables. The models are applied to British Election Study data consisting of repeated measures of voting intention.
19.
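The logistics outsourcing industry essentially trades in commitments: it is a special industry that deals in risk and credit, and its sustained growth rests on the good faith of both parties. Only with good faith can relations among customers, logistics outsourcing companies, and outsourcing providers be sound, and only then can the outsourcing market develop in a stable and orderly way. At present, however, serious breaches of trust are widespread and have become a bottleneck constraining enterprise development. The paper first analyzes the current state of China's logistics market and its causes, and then proposes measures to improve the integrity system of China's logistics market.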
20.
Geert Molenberghs Herbert Thijs Michael G. Kenward Geert Verbeke 《Statistica Neerlandica》2003,57(1):112-135
Even though models for incomplete longitudinal data are in common use, they are surrounded with problems, largely due to the untestable nature of the assumptions one has to make regarding the missingness mechanism. Two extreme views on how to deal with this problem are (1) to avoid incomplete data altogether and (2) to construct ever more complicated joint models for the measurement and missingness processes. In this paper, it is argued that a more versatile approach is to embed the treatment of incomplete data within a sensitivity analysis. Several such sensitivity analysis routes are presented and applied to a case study, the milk protein trial analyzed before by Diggle and Kenward (1994). Apart from the use of local influence methods, some emphasis is put on pattern-mixture modeling. In the latter case, it is shown how multiple-imputation ideas can be used to define a practically feasible modeling strategy.