Similar Literature
20 similar records found
1.
Withdrawal from a longitudinal investigation is a common problem in epidemiological research. This paper describes a nonparametric method, based on a bootstrap approach, for assessing whether dropouts are missing at random. The basic idea is to compare the scores of dropouts and non-dropouts at different assessments using a weighted nonparametric test statistic. A Monte Carlo investigation evaluates the comparative power of the test under violations of population normality, using three commonly occurring distributions. The proposed test is more powerful than its parametric counterpart under distributions with extreme skew. The method is applied to a longitudinal community-based study of mental disorders. Dropouts did not differ from the other subjects with respect to two psychological variables, although chi-square tests suggested otherwise.
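The abstract's weighted statistic is not reproduced here, but the general recipe can be sketched as follows: compare dropout and completer scores with a rank-based statistic and approximate its null distribution by bootstrap resampling from the pooled sample. The function name and the mean-rank-difference statistic are illustrative choices, not the paper's.

```python
import numpy as np

def bootstrap_rank_test(dropouts, completers, n_boot=2000, seed=0):
    """Compare two groups with a mean-rank-difference statistic; the
    null distribution is approximated by bootstrap resampling from the
    pooled sample (illustrative sketch, not the paper's exact test)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(dropouts, dtype=float)
    y = np.asarray(completers, dtype=float)
    pooled = np.concatenate([x, y])
    n = len(x)

    def mean_rank_diff(sample):
        # assign rank i+1 to the element at sorted position i
        ranks = np.empty(len(sample))
        ranks[np.argsort(sample, kind="stable")] = np.arange(1, len(sample) + 1)
        return ranks[:n].mean() - ranks[n:].mean()

    observed = mean_rank_diff(pooled)
    # resample from the pooled sample under H0: both groups share one distribution
    null = np.array([
        mean_rank_diff(rng.choice(pooled, size=len(pooled), replace=True))
        for _ in range(n_boot)
    ])
    p_value = (np.abs(null) >= abs(observed)).mean()
    return observed, p_value
```

Being rank-based, the statistic is insensitive to the extreme skews under which the paper reports a power advantage over the parametric counterpart.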

2.
Understanding the transitions between disease states is often the goal in studying chronic disease. These studies, however, are typically subject to a large amount of missingness, due either to patient dropout or to intermittent missed visits. The missing data are often informative, since missingness and dropout are usually related either to an individual's underlying disease process or to the actual value of the missed observation. Our motivating example is a study of opiate addiction that examined the effect of a new treatment on thrice-weekly binary urine tests to assess opiate use over follow-up. The interest in this opiate addiction clinical trial was to characterize the transition pattern of opiate use (in each treatment arm) as well as to compare both the marginal probability of a positive urine test over follow-up and the time until the first positive urine test between the treatment arms. We develop a shared random effects model that links together the propensity of transition between states and the probability of either an intermittent missed observation or dropout. This approach allows for heterogeneous transition and missing data patterns between individuals as well as incorporating informative intermittent missing data and dropout. We compare this new approach with other approaches proposed for the analysis of longitudinal binary data with informative missingness.

3.
This paper provides a review of common statistical disclosure control (SDC) methods implemented at statistical agencies for standard tabular outputs containing whole population counts from a census (either enumerated or based on a register). These methods include record swapping on the microdata prior to its tabulation and rounding of entries in the tables after they are produced. The approach for assessing SDC methods is based on a disclosure risk–data utility framework and the need to find a balance between managing disclosure risk while maximizing the amount of information that can be released to users and ensuring high quality outputs. To carry out the analysis, quantitative measures of disclosure risk and data utility are defined and methods compared. Conclusions from the analysis show that record swapping as a sole SDC method leaves high probabilities of disclosure risk. Targeted record swapping lowers the disclosure risk, but there is more distortion of distributions. Small cell adjustments (rounding) give protection to census tables by eliminating small cells but only one set of variables and geographies can be disseminated in order to avoid disclosure by differencing nested tables. Full random rounding offers more protection against disclosure by differencing, but margins are typically rounded separately from the internal cells and tables are not additive. Rounding procedures protect against the perception of disclosure risk compared to record swapping since no small cells appear in the tables. Combining rounding with record swapping raises the level of protection but increases the loss of utility to census tabular outputs. For some statistical analysis, the combination of record swapping and rounding balances to some degree opposing effects that the methods have on the utility of the tables.
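As an illustrative sketch of one of the methods reviewed (not any agency's production code), unbiased full random rounding replaces each cell count with an adjacent multiple of the base, rounding up with probability proportional to the residue so that the expected value is preserved and no small cells of 1 or 2 remain:

```python
import numpy as np

def random_round(counts, base=3, seed=0):
    """Unbiased random rounding: replace each count with a neighbouring
    multiple of `base`, rounding up with probability residue/base so
    that E[rounded] equals the original count."""
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts)
    lower = (counts // base) * base          # multiple of base just below
    residue = counts - lower                 # 0, 1, ..., base-1
    round_up = rng.random(counts.shape) < residue / base
    return lower + round_up * base
```

Because margins would typically be rounded independently of the internal cells by such a routine, the resulting tables are not additive, which is the trade-off the review notes for full random rounding.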

4.
Longitudinal data sets with the structure T (time points) × N (subjects) are often incomplete because of data missing for certain subjects at certain time points. The EM algorithm is applied in conjunction with the Kalman smoother for computing maximum likelihood estimates of longitudinal LISREL models from varying missing data patterns. The iterative procedure uses the LISREL program in the M-step and the Kalman smoother in the E-step. The application of the method is illustrated by simulating missing data on a data set from educational research.

5.
Hierarchically structured data are common in many areas of scientific research. Such data are characterized by nested membership relations among the units of observation. Multilevel analysis is a class of methods that explicitly takes the hierarchical structure into account. Repeated measures data can be considered as having a hierarchical structure as well: measurements are nested within, for instance, individuals. In this paper, an overview is given of the multilevel analysis approach to repeated measures data. A simple application to growth curves is provided as an illustration. It is argued that multilevel analysis of repeated measures data is a powerful and attractive approach for several reasons, such as its flexibility and its emphasis on individual development.

6.
Since the work of Little and Rubin (1987), few substantial advances have been achieved in the analysis of explanatory regression models for incomplete data that are missing not at random, mainly due to the difficulty of verifying the randomness of the unknown data. In practice, nonrandom missing data are analysed with techniques designed for datasets with random or completely random missing data, such as complete-case analysis, mean imputation, regression imputation, maximum likelihood or multiple imputation. However, the data conditions required to minimize the bias derived from an incorrect analysis have not been fully determined. In the present work, several Monte Carlo simulations were carried out to establish the best strategy for analysing random missing data that is applicable to datasets with nonrandom missing data. The factors involved in the simulations are sample size, percentage of missing data, predictive power of the imputation model, and existence of interaction between predictors. The results show that the smallest bias is obtained with maximum likelihood and multiple imputation techniques, although with low percentages of missing data, absence of interaction, and high predictive power of the imputation model (data structures frequent in research on child and adolescent psychopathology), acceptable results are obtained with the simpler regression imputation.
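A toy Monte Carlo in the spirit of this comparison (the data-generating model and all names are invented for illustration): when missingness depends on an observed predictor, complete-case estimation of a mean is biased, while regression imputation with a well-fitting imputation model is approximately unbiased.

```python
import numpy as np

def simulate(n=20000, seed=0):
    """Compare complete-case analysis with regression imputation for
    estimating E[Y] = 0 when missingness of Y depends on observed X."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 2 * x + rng.normal(size=n)           # true E[Y] = 0
    p_miss = 1 / (1 + np.exp(-2 * x))        # missingness driven by x only
    miss = rng.random(n) < p_miss
    obs = ~miss

    cc_mean = y[obs].mean()                  # complete-case estimate (biased)

    # regression imputation: fit y ~ x on observed cases, impute the rest
    b1, b0 = np.polyfit(x[obs], y[obs], 1)
    y_imp = y.copy()
    y_imp[miss] = b0 + b1 * x[miss]
    ri_mean = y_imp.mean()                   # approximately unbiased here
    return cc_mean, ri_mean
```

With a strongly predictive imputation model, as in this setup, the simple regression imputation performs well, mirroring the paper's finding for favourable data structures.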

7.
Estimation with longitudinal Y having nonignorable dropout is considered when the joint distribution of Y and covariate X is nonparametric and the dropout propensity conditional on (Y,X) is parametric. We apply the generalised method of moments to estimate the parameters in the nonignorable dropout propensity based on estimating equations constructed using an instrument Z, which is part of X related to Y but unrelated to the dropout propensity conditioned on Y and other covariates. Population means and other parameters in the nonparametric distribution of Y can be estimated based on inverse propensity weighting with estimated propensity. To improve efficiency, we derive a model‐assisted regression estimator making use of information provided by the covariates and previously observed Y‐values in the longitudinal setting. The model‐assisted regression estimator is protected from model misspecification and is asymptotically normal and more efficient when the working models are correct and some other conditions are satisfied. The finite‐sample performance of the estimators is studied through simulation, and an application to the HIV‐CD4 data set is also presented as illustration.

8.
The growth of non‐response rates for social science surveys has led to increased concern about the risk of non‐response bias. Unfortunately, the non‐response rate is a poor indicator of when non‐response bias is likely to occur. We consider in this paper a set of alternative indicators. A large‐scale simulation study is used to explore how each of these indicators performs in a variety of circumstances. Although, as expected, none of the indicators fully depicts the impact of non‐response on survey estimates, we discuss how they can be used when creating a plausible account of the risks of non‐response bias for a survey. We also describe an interesting characteristic of the fraction of missing information that may be helpful in diagnosing not‐missing‐at‐random mechanisms in certain situations.

9.
Repeated measures data can be modelled as a two-level model where occasions (level one units) are grouped by individuals (level two units). Goldstein et al. (1994) proposed a multilevel time series model for the case where the response variable follows a Normal distribution and the measurements are taken with unequal time intervals. This paper extends the methodology to discrete response variables. The models are applied to British Election Study data consisting of repeated measures of voting intention.

10.
This paper discusses the importance of managing data quality in academic research in relation to satisfying the customer. The focus is on the data completeness objective dimension of data quality, in relation to recent advances in methods for analysing incomplete multivariate data. An overview and comparison of the traditional techniques with the recent advancements are provided. Multiple imputation is also discussed as a method of analysing incomplete multivariate data which can potentially reduce some of the biases that arise from the traditional techniques. Despite these advances, evidence is presented that researchers across a variety of academic disciplines are not using these techniques to manage the data quality of their current research. An analysis is then provided as to why these techniques have not been adopted, along with suggestions to improve the frequency of their use in the future. Source-Reference: the ideas for this paper originated from research work on David J. Fogarty's Ph.D. dissertation. The subject area is the use of advanced techniques for the imputation of incomplete multivariate data on corporate data warehouses.
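Multiple imputation combines the analyses of the m completed datasets using Rubin's rules; a minimal pooling helper (names are illustrative) for a scalar estimate looks like:

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Combine m completed-data analyses by Rubin's rules: the pooled
    point estimate is the mean of the m estimates, and the total
    variance adds the between-imputation component to the average
    within-imputation variance."""
    q = np.asarray(estimates, dtype=float)   # m point estimates
    u = np.asarray(variances, dtype=float)   # m within-imputation variances
    m = len(q)
    qbar = q.mean()                          # pooled estimate
    ubar = u.mean()                          # within-imputation variance
    b = q.var(ddof=1)                        # between-imputation variance
    total = ubar + (1 + 1 / m) * b           # Rubin's total variance
    return qbar, total
```

The between-imputation term is what lets multiple imputation propagate the uncertainty due to missingness, the property that single-imputation methods lack.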

11.
This article unifies and extends ideas from nonparametric production analysis and DEA for testing organizational efficiency. We show how the admissible price set can be restricted to account for prior information on prices. These restrictions may relate prices to input and output quantities in order to test noncompetitive behavior of the evaluated decision making unit. While the resulting efficiency tests cannot always be cast as linear programming problems, we discuss various solution strategies for the tests and consider when local optimality of a solution guarantees global optimality. We also show how the decision maker's preferences, for example ranking information, can be incorporated into DEA models in a simple manner. Finally, the approach with price restrictions is illustrated with an application testing noncompetitive behavior of the pulp and paper industries in Finland.

12.
This paper outlines a strategy to validate multiple imputation methods. Rubin's criteria for proper multiple imputation are the point of departure. We describe a simulation method that yields insight into various aspects of bias and efficiency of the imputation process. We propose a new method for creating incomplete data under a general Missing At Random (MAR) mechanism. Software implementing the validation strategy is available as a SAS/IML module. The method is applied to investigate the behavior of polytomous regression imputation for categorical data.
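A minimal sketch of creating incomplete data under MAR (this is not the paper's proposed general mechanism, and the names are illustrative): delete values of one variable with a probability that depends only on an always-observed driver variable, rescaled so the expected overall missingness hits a target rate.

```python
import numpy as np

def make_mar(data, target_col, driver_col, rate=0.3, seed=0):
    """Punch MAR holes in `target_col` of a numeric 2-D array: the
    deletion probability depends only on the always-observed
    `driver_col`, scaled so the expected missingness equals `rate`."""
    rng = np.random.default_rng(seed)
    out = data.astype(float)                      # copy; NaN marks missing
    driver = out[:, driver_col]
    score = np.exp(0.5 * (driver - driver.mean()))  # higher driver -> more missing
    p = rate * score / score.mean()               # expected proportion == rate
    p = np.clip(p, 0.0, 1.0)
    out[rng.random(len(out)) < p, target_col] = np.nan
    return out
```

Because the deletion probability never looks at the values being deleted, the mechanism is MAR by construction, which is exactly what a validation simulation needs to control.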

13.
In most surveys, one is confronted with missing or, more generally, coarse data. Traditional methods dealing with these data require strong, untestable and often doubtful assumptions, for example, coarsening at random. But due to the resulting, potentially severe bias, there is a growing interest in approaches that only include tenable knowledge about the coarsening process, leading to imprecise but reliable results. In this spirit, we study regression analysis with a coarse categorical‐dependent variable and precisely observed categorical covariates. Our (profile) likelihood‐based approach can incorporate weak knowledge about the coarsening process and thus offers a synthesis of traditional methods and cautious strategies refraining from any coarsening assumptions. This also allows a discussion of the uncertainty about the coarsening process, besides sampling uncertainty and model uncertainty. Our procedure is illustrated with data of the panel study 'Labour market and social security' conducted by the Institute for Employment Research, whose questionnaire design produces coarse data.

14.
王少波 《价值工程》(Value Engineering), 2014, (34): 30-31
At present, electricity theft remains rampant despite repeated prohibitions; the methods used are increasingly varied and technically sophisticated, and the theft is ever more covert, making prevention and investigation harder. The state and power supply enterprises must therefore fully recognize the severity and importance of current anti-theft work and treat it as a systematic undertaking: constrain it through law, guard against it in day-to-day operations, and rigorously regulate and crack down on it in society, laying a solid foundation for the efficient, high-quality supply of electric power.

15.
We combine the k‐Nearest Neighbors (kNN) method with the local linear estimation (LLE) approach to construct a new estimator (LLE‐kNN) of the regression operator when the regressor is of functional type and the response variable is a scalar observed with some missing at random (MAR) observations. The resulting estimator inherits many of the advantages of both the kNN and LLE approaches. This is confirmed by the established asymptotic results, in terms of pointwise and uniform almost complete consistency and precise convergence rates. In addition, a numerical study was conducted, first (i) on simulated data and then (ii) on a real dataset concerning sugar quality measured by fluorescence data. This practical study clearly shows the feasibility and the superiority of the LLE‐kNN estimator compared to competing estimators.
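The LLE-kNN estimator itself handles functional regressors; as a scalar toy illustrating the combination of kNN neighbourhoods with a degree-1 local fit under MAR responses (all names here are illustrative, and this is not the paper's estimator):

```python
import numpy as np

def knn_local_linear(x_train, y_train, x0, k=25):
    """Local linear regression at x0: fit a degree-1 polynomial over
    the k nearest neighbours among complete cases (pairs whose
    response is observed; NaN marks a missing response)."""
    keep = ~np.isnan(y_train)                  # MAR: drop missing responses
    x, y = x_train[keep], y_train[keep]
    idx = np.argsort(np.abs(x - x0))[:k]       # k nearest neighbours of x0
    b1, b0 = np.polyfit(x[idx], y[idx], 1)     # local linear (degree-1) fit
    return b0 + b1 * x0
```

The kNN bandwidth adapts to the local density of complete cases, while the linear (rather than constant) local fit removes the leading boundary and curvature bias, which is the motivation for combining the two methods.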

16.
林茂 《价值工程》(Value Engineering), 2011, 30(19): 158-159
In virtual reality projects, processing massive volumes of data is, for various reasons, an arduous and complex task. This paper discusses why massive data are difficult to process and proposes methods for handling them.

17.
戴宝印 《物流科技》(Logistics Science and Technology), 2009, 32(5): 132-136
As a traditional manufacturing industry, shipbuilding has felt the effects of the financial crisis ever more clearly. Using a SWOT analysis, this article examines the strengths, weaknesses, opportunities and threats facing China's shipbuilding industry against the background of the financial crisis, discusses countermeasures for the industry under these conditions, and offers some suggestions.

18.
Small area estimation is a widely used indirect estimation technique for micro‐level geographic profiling. Three unit level small area estimation techniques—the ELL or World Bank method, empirical best prediction (EBP) and M‐quantile (MQ) — can estimate micro‐level Foster, Greer, & Thorbecke (FGT) indicators: poverty incidence, gap and severity using both unit level survey and census data. However, they use different assumptions. The effects of using model‐based unit level census data reconstructed from cross‐tabulations and having no cluster level contextual variables for models are discussed, as are effects of small area and cluster level heterogeneity. A simulation‐based comparison of ELL, EBP and MQ uses a model‐based reconstruction of 2000/2001 data from Bangladesh and compares bias and mean square error. A three‐level ELL method is applied for comparison with the standard two‐level ELL that lacks a small area level component. An important finding is that the larger number of small areas for which ELL has been able to produce sufficiently accurate estimates in comparison with EBP and MQ has been driven more by the type of census data available or utilised than by the model per se.

19.
In spite of the abundance of clustering techniques and algorithms, clustering mixed interval (continuous) and categorical (nominal and/or ordinal) scale data remains a challenging problem. In order to identify the most effective approaches for clustering mixed‐type data, we use both theoretical and empirical analyses to present a critical review of the strengths and weaknesses of the methods identified in the literature. Guidelines on approaches to use under different scenarios are provided, along with potential directions for future research.

20.
With the development of science and technology, original water-level observations can achieve a true "three-pass check" (三遍手) without additional manpower or major investment, guaranteeing the quality of water-level observation results.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号