1.

Methods for Handling Dropouts in Longitudinal Clinical Trials

Garrett M. Fitzmaurice 《Statistica Neerlandica》2003,57(1):75-99

This paper focuses on the monotone missing data patterns produced by dropouts and presents a review of the statistical literature on approaches for handling dropouts in longitudinal clinical trials. A variety of ad hoc procedures for handling dropouts are widely used. The rationale for many of these procedures is not well-founded and they can result in biased estimates of treatment comparisons. A fundamentally difficult problem arises when the probability of dropout is thought to be related to the specific value that in principle should have been obtained; this is often referred to as informative or non-ignorable dropout. Joint models for the longitudinal outcomes and the dropout times have been proposed in order to make corrections for non-ignorable dropouts. Two broad classes of joint models are reviewed: selection models and pattern-mixture models. Finally, when there are dropouts in a longitudinal clinical trial the goals of the analysis need to be clearly specified. In this paper we review the main distinctions between a 'pragmatic' and an 'explanatory' analysis. We note that many of the procedures for handling dropouts that are widely used in practice come closest to producing an explanatory rather than a pragmatic analysis.

2.

Handling Missing Data by Re-approaching Non-respondents

Huisman Mark Krol Boudien Van Sonderen Eric 《Quality and Quantity》1998,32(1):77-91

When handling missing data, a researcher should be aware of the mechanism underlying the missingness. In the presence of non-randomly missing data, a model of the missing data mechanism should be included in the analyses to prevent the analyses based on the data from becoming biased. Modeling the missing data mechanism, however, is a difficult task. One way in which knowledge about the missing data mechanism may be obtained is by collecting additional data from non-respondents. In this paper the method of re-approaching respondents who did not answer all questions of a questionnaire is described. New answers were obtained from a sample of these non-respondents and the reasons for skipping questions were probed. The additional data resulted in a larger sample and were used to investigate the differences between respondents and non-respondents, whereas probing for the causes of missingness resulted in more knowledge about the nature of the missing data patterns.

3.

Missing Data: A Unified Taxonomy Guided by Conditional Independence

Marco Doretti Sara Geneletti Elena Stanghellini 《Revue internationale de statistique》2018,86(2):189-204

Recent work (Seaman et al., 2013; Mealli & Rubin, 2015) attempts to clarify the not always well‐understood difference between realised and everywhere definitions of missing at random (MAR) and missing completely at random. Another branch of the literature (Mohan et al., 2013; Pearl & Mohan, 2013) exploits always‐observed covariates to give variable‐based definitions of MAR and missing completely at random. In this paper, we develop a unified taxonomy encompassing all approaches. In this taxonomy, the new concept of ‘complementary MAR’ is introduced, and its relationship with the concept of data observed at random is discussed. All relationships among these definitions are analysed and represented graphically. Conditional independence, both at the random variable and at the event level, is the formal language we adopt to connect all these definitions. Our paper covers both the univariate and the multivariate case, where attention is paid to monotone missingness and to the concept of sequential MAR. Specifically, for monotone missingness, we propose a sequential MAR definition that might be more appropriate than both everywhere and variable‐based MAR to model dropout in certain contexts.

4.

Monotone missing data and pattern-mixture models

G. Molenberghs B. Michiels M. G. Kenward & P. J. Diggle 《Statistica Neerlandica》1998,52(2):153-161

It is shown that the classical taxonomy of missing data models, namely missing completely at random, missing at random and informative missingness, which has been developed almost exclusively within a selection modelling framework, can also be applied to pattern-mixture models. In particular, intuitively appealing identifying restrictions are proposed for a pattern-mixture MAR mechanism.
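The three categories of the classical taxonomy can be illustrated with a small simulation. The sketch below is not taken from the paper: the variable names, the logistic response probabilities and the sample size are assumptions chosen only to show how the mean of the observed outcomes behaves when missingness is unrelated to the data (MCAR), depends on an always-observed covariate `x` (MAR), or depends on the outcome `y` itself (informative, labelled MNAR here).

```python
import numpy as np


def simulate_observed_means(n=200_000, seed=0):
    """Mean of y among observed cases under three illustrative mechanisms.

    The true mean of y is 0, so any systematic departure is bias.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)          # always-observed covariate
    y = x + rng.normal(size=n)      # outcome that may go missing

    def expit(z):
        return 1.0 / (1.0 + np.exp(-z))

    observed = {
        "MCAR": rng.random(n) < 0.5,        # unrelated to x and y
        "MAR": rng.random(n) < expit(x),    # depends on observed x only
        "MNAR": rng.random(n) < expit(y),   # depends on y itself
    }
    return {name: float(y[mask].mean()) for name, mask in observed.items()}


if __name__ == "__main__":
    for name, mean in simulate_observed_means().items():
        print(f"{name}: observed-case mean of y = {mean:.3f}")
```

In this setup the naive observed-case mean is also biased under MAR, because `x` is correlated with `y`; the difference is that under MAR the bias can in principle be removed by conditioning on `x`, whereas under informative missingness it cannot be removed without further assumptions.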

5.

Estimating Transition Probabilities from a Time Series of Independent Cross Sections

Ben Pelzer Rob Eisinga & Philip Hans Franses 《Statistica Neerlandica》2001,55(2):249-262

This paper considers the implementation of a nonstationary, heterogeneous Markov model for the analysis of a binary dependent variable in a time series of independent cross sections. The model, previously considered by Moffitt (1993), offers the opportunity to estimate entry and exit transition probabilities and to examine the effects of time-constant and time-varying covariates on the hazards. We show how ML estimates of the parameters can be obtained by Fisher's method-of-scoring and how to estimate both fixed and time-varying covariate effects. The model is exemplified with an analysis of the labor force participation decision of Dutch women using data from the Socio-economic Panel (SEP) study conducted in the Netherlands between 1986 and 1995. We treat the panel data as independent cross sections and compare the employment status sequences predicted by the model with the observed sequences in the panel. Some open problems concerning the application of the model are also discussed.
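Entry and exit probabilities link the marginal probabilities of a binary state across periods through the law of total probability: P(y_t = 1) = entry_t * (1 - P(y_{t-1} = 1)) + (1 - exit_t) * P(y_{t-1} = 1). The sketch below only propagates this identity forward for given probabilities; it does not reproduce the authors' estimation procedure, and the function and argument names are hypothetical.

```python
def marginal_probabilities(p0, entry, exit_):
    """Propagate P(y_t = 1) through a two-state Markov chain.

    p0     -- P(y_0 = 1) in the initial period
    entry  -- per-period probabilities of moving from state 0 to state 1
    exit_  -- per-period probabilities of moving from state 1 to state 0
    """
    if len(entry) != len(exit_):
        raise ValueError("entry and exit_ must have the same length")
    probs = [p0]
    for mu, lam in zip(entry, exit_):
        prev = probs[-1]
        # Enter from state 0, or stay in state 1.
        probs.append(mu * (1 - prev) + (1 - lam) * prev)
    return probs


if __name__ == "__main__":
    print(marginal_probabilities(0.5, [0.2, 0.2], [0.1, 0.1]))
```

Because only the marginal probabilities appear on the left-hand side, the relation can be evaluated from cross-sectional data alone, which is why repeated cross sections carry information about transitions.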

6.

Confidence Intervals for the Area Under the Receiver Operating Characteristic Curve in the Presence of Ignorable Missing Data

Hunyong Cho Gregory J. Matthews Ofer Harel 《Revue internationale de statistique》2019,87(1):152-177

Receiver operating characteristic curves are widely used as a measure of accuracy of diagnostic tests and can be summarised using the area under the receiver operating characteristic curve (AUC). Often, it is useful to construct a confidence interval for the AUC; however, because there are a number of different proposed methods to measure variance of the AUC, there are thus many different resulting methods for constructing these intervals. In this article, we compare different methods of constructing Wald‐type confidence intervals in the presence of missing data where the missingness mechanism is ignorable. We find that constructing confidence intervals using multiple imputation based on logistic regression gives the most robust coverage probability and the choice of confidence interval method is less important. However, when the missingness rate is less severe (e.g. less than 70%), we recommend using Newcombe's Wald method for constructing confidence intervals along with multiple imputation using predictive mean matching.
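As an illustration of what a Wald‐type interval for the AUC looks like on complete data, the sketch below combines the Mann-Whitney estimate of the AUC with the Hanley-McNeil variance approximation. This is one of several variance estimators and is not Newcombe's method recommended in the abstract; the function name is hypothetical, and neither missing data nor multiple imputation is handled.

```python
import math


def auc_wald_ci(pos_scores, neg_scores, z=1.96):
    """AUC with a Wald-type interval using the Hanley-McNeil variance.

    pos_scores -- test scores of diseased subjects
    neg_scores -- test scores of non-diseased subjects
    Returns (auc, lower, upper), with the limits truncated to [0, 1].
    """
    n1, n0 = len(pos_scores), len(neg_scores)
    # Mann-Whitney estimate: P(pos > neg) + 0.5 * P(pos == neg)
    wins = sum((p > q) + 0.5 * (p == q) for p in pos_scores for q in neg_scores)
    auc = wins / (n1 * n0)
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    var = (
        auc * (1 - auc)
        + (n1 - 1) * (q1 - auc ** 2)
        + (n0 - 1) * (q2 - auc ** 2)
    ) / (n1 * n0)
    half_width = z * math.sqrt(max(var, 0.0))
    return auc, max(0.0, auc - half_width), min(1.0, auc + half_width)


if __name__ == "__main__":
    print(auc_wald_ci([0.9, 0.8, 0.7, 0.4], [0.6, 0.5, 0.3, 0.2]))
```

With multiply imputed data, an estimate and variance of this kind would be computed per imputed dataset and then pooled; that pooling step is outside this sketch.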

7.

Mixed-effects models for health care longitudinal data with an informative visiting process: A Monte Carlo simulation study

Alessandro Gasparini Keith R. Abrams Jessica K. Barrett Rupert W. Major Michael J. Sweeting Nigel J. Brunskill Michael J. Crowther 《Statistica Neerlandica》2020,74(1):5-23

Electronic health records are being increasingly used in medical research to answer more relevant and detailed clinical questions; however, they pose new and significant methodological challenges. For instance, observation times are likely correlated with the underlying disease severity: patients with worse conditions utilise health care more and may have worse biomarker values recorded. Traditional methods for analysing longitudinal data assume independence between observation times and disease severity; yet, with health care data, such assumptions are unlikely to hold. Through Monte Carlo simulation, we compare different analytical approaches proposed to account for an informative visiting process to assess whether they lead to unbiased results. Furthermore, we formalise a joint model for the observation process and the longitudinal outcome within an extended joint modelling framework. We illustrate our results using data from a pragmatic trial on enhanced care for individuals with chronic kidney disease, and we introduce user-friendly software that can be used to fit the joint model for the observation process and a longitudinal outcome.

8.

Predictive mean matching imputation of semicontinuous variables

Gerko Vink Laurence E. Frank Jeroen Pannekoek Stef van Buuren 《Statistica Neerlandica》2014,68(1):61-90

Multiple imputation methods properly account for the uncertainty of missing data. One of those methods for creating multiple imputations is predictive mean matching (PMM), a general purpose method. Little is known about the performance of PMM in imputing non‐normal semicontinuous data (skewed data with a point mass at a certain value and otherwise continuously distributed). We investigate the performance of PMM as well as dedicated methods for imputing semicontinuous data by performing simulation studies under univariate and multivariate missingness mechanisms. We also investigate the performance on real‐life datasets. We conclude that PMM performance is at least as good as the investigated dedicated methods for imputing semicontinuous data and, in contrast to other methods, is the only method that yields plausible imputations and preserves the original data distributions.
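The core idea of predictive mean matching can be sketched in a few lines: fit a regression on the observed cases, and for each missing case borrow the observed value of a donor whose predicted mean is close. The version below is a simplified single-imputation sketch with one predictor, not the authors' implementation; in particular it omits the random draw of regression parameters that proper multiple imputation uses, and the function name and the default of `k=5` donors are assumptions.

```python
import numpy as np


def pmm_impute(x, y, k=5, seed=0):
    """Impute missing y (NaN) by predictive mean matching on a linear fit.

    x -- fully observed predictor
    y -- outcome with NaN marking missing entries
    k -- number of closest donors to sample from
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    miss = np.isnan(y)

    # Least-squares fit of y on (1, x) using observed cases only.
    design = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(design[~miss], y[~miss], rcond=None)
    pred = design @ beta

    donor_pred, donor_y = pred[~miss], y[~miss]
    out = y.copy()
    for i in np.flatnonzero(miss):
        # The k donors whose predicted means are closest to this case.
        nearest = np.argsort(np.abs(donor_pred - pred[i]))[:k]
        out[i] = donor_y[rng.choice(nearest)]
    return out


if __name__ == "__main__":
    x = np.arange(10.0)
    y = np.array([0, 0, 1.5, np.nan, 2.0, 0, np.nan, 3.1, 4.0, np.nan])
    print(pmm_impute(x, y, k=3))
```

Because every imputed value is an actually observed value, a point mass such as a spike at zero can reappear among the imputations, which is consistent with the abstract's remark that PMM preserves the original data distribution.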

9.

Generalized linear mixed models with informative dropouts and missing covariates

Kunling Wu Lang Wu 《Metrika》2007,66(1):1-18

Generalized linear mixed models (GLMM) are useful in many longitudinal data analyses. In the presence of informative dropouts and missing covariates, however, standard complete-data methods may not be applicable. In this article, we consider a likelihood method and an approximate method for GLMM with informative dropouts and missing covariates. The methods are implemented by Monte Carlo EM algorithms combined with a Gibbs sampler. The approximate method may lead to inconsistent estimators but is computationally more efficient than the likelihood method. The two methods are evaluated via a simulation study for longitudinal binary data, and appear to perform reasonably well. A dataset on mental distress is analyzed in detail.

10.

Sensitivity Analysis of Continuous Incomplete Longitudinal Outcomes (total citations: 1; self-citations: 0; citations by others: 1)

Geert Molenberghs Herbert Thijs Michael G. Kenward Geert Verbeke 《Statistica Neerlandica》2003,57(1):112-135

Even though models for incomplete longitudinal data are in common use, they are surrounded with problems, largely due to the untestable nature of the assumptions one has to make regarding the missingness mechanism. Two extreme views on how to deal with this problem are (1) to avoid incomplete data altogether and (2) to construct ever more complicated joint models for the measurement and missingness processes. In this paper, it is argued that a more versatile approach is to embed the treatment of incomplete data within a sensitivity analysis. Several such sensitivity analysis routes are presented and applied to a case study, the milk protein trial analyzed before by Diggle and Kenward (1994). Apart from the use of local influence methods, some emphasis is put on pattern-mixture modeling. In the latter case, it is shown how multiple-imputation ideas can be used to define a practically feasible modeling strategy.

11.

Methods for Generating Longitudinally Correlated Binary Data

Patrick J. Farrell Katrina Rogers-Stewart 《Revue internationale de statistique》2008,76(1):28-38

The analysis of longitudinally correlated binary data has attracted considerable attention of late. Since the estimation of parameters in models for such data is based on asymptotic theory, it is necessary to investigate the small‐sample properties of estimators by simulation. In this paper, we review the mechanisms that have been proposed for generating longitudinally correlated binary data. We compare and contrast these models with regard to various features, including computational efficiency, flexibility and the range restrictions that they impose on the longitudinal association parameters. Some extensions to the data generation mechanism originally suggested by Kanter (1975) are proposed.

12.

Measuring the impact of nonignorability in panel data with non‐monotone nonresponse

Hui Xie Yi Qian 《Journal of Applied Econometrics》2012,27(1):129-159

The analysis of panel data with non‐monotone nonresponse often relies on the critical and untestable assumption of ignorable missingness. It is important to assess the consequences of departures from the ignorability assumption. Non‐monotone nonresponse, however, can often make such sensitivity analysis infeasible because the likelihood functions for alternative models involve high‐dimensional and difficult‐to‐evaluate integrals with respect to missing outcomes. We develop an extension of the local sensitivity method that overcomes computational difficulty and completely avoids fitting alternative models and evaluating these high‐dimensional integrals. The proposed method is applicable to a wide range of panel outcomes. We apply the method to a Smoking Trend dataset where we relax the standard ignorability assumption and evaluate how smoking‐trend estimates in different groups of US young adults are affected by alternative assumptions about the missing‐data mechanism. The main finding is that the standard estimate in the black male group is sensitive to nonignorable missingness but those in other groups are reasonably robust. Copyright © 2010 John Wiley & Sons, Ltd.

13.

A Bootstrap Method To Test If Study Dropouts Are Missing Randomly

Schmitz Norbert Franz Matthias 《Quality and Quantity》2002,36(1):1-16

Withdrawing from a longitudinal investigation is a common problem in epidemiological research. This paper describes a nonparametric method, based on a bootstrap approach, for assessing whether dropouts are missing at random. The basic idea is to compare scores of dropouts and non-dropouts at different assessments using a weighted nonparametric test statistic. A Monte Carlo investigation evaluates the comparative power of the test under violations of population normality, using three commonly occurring distributions. The test proposed here is more powerful than the parametric counterpart under distributions with extreme skews. The method is applied to a longitudinal community-based study investigating mental disorders. It is found that dropouts did not differ from the other subjects with respect to two psychological variables, although chi-square tests gave a somewhat different impression.

14.

“Panelizing” Repeated Cross Sections

Ben Pelzer Rob Eisinga Philip Hans Franses 《Quality and Quantity》2005,39(2):155-174

This paper considers the implementation of a non-stationary, heterogeneous Markov model for the analysis of binary dependent variables in a time series of repeated cross-sectional (RCS) surveys. The model offers the opportunity to estimate entry and exit transition probabilities and to examine the effects of time-constant and time-varying covariates on the hazards. We show how maximum likelihood estimates of the parameters can be obtained by Fisher's method-of-scoring and how to estimate both fixed and time-varying covariate effects. The model is exemplified with an analysis of the labor force participation decision of Dutch and West German women using ISSP (and other) data from 10 annual Dutch surveys conducted between 1987 and 1996 and 7 annual West German surveys conducted between 1988 and 1994. Some open problems concerning the application of the model are discussed.

15.

Dropout in secondary education: an application of a multilevel discrete-time hazard model accounting for school changes

Carl Lamote Jan Van Damme Wim Van Den Noortgate Sara Speybroeck Tinneke Boonen Jerissa de Bilde 《Quality and Quantity》2013,47(5):2425-2446

For several decades, researchers have focused on dropout in search of an explanation and prevention of this phenomenon. However, past research is characterized by methodological shortcomings. Most of this research was conducted without considering the hierarchical structure of educational data and ignored the longitudinal path towards dropout. Moreover, research that did take these shortcomings into account did not correct for student mobility between schools, despite the strong correlation with dropout (South et al. 2007). In this study, we attempt to address these shortcomings by implementing a multilevel discrete-time hazard model and exploring the effect of different school classifications on the school effects. Partially analogous to Grady and Beretvas (2010), we compare models with estimated school effects based on the first and on the last school attended and compare these models with multiple membership models and cross-classified models. The results of this comparison indicate that ignoring student mobility can have strong implications for the predictors of dropout. Not only do models which take this mobility into account yield better model fits, models ignoring this mobility tend to miss the effect of school level variables. With respect to the conclusions on dropout research, our models provide evidence for the often cited student characteristics predicting dropout and indicate stronger school effects than generally assumed.
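Discrete-time hazard models are commonly fitted as binary regressions on "person-period" data, in which each student contributes one row per period at risk and the event indicator is 1 only in the period where dropout occurs. The sketch below shows just that data expansion; the record layout and names are hypothetical, and the multilevel part (school effects, multiple membership or cross-classification) is not shown.

```python
def person_period(records):
    """Expand one record per student into one row per period at risk.

    records -- iterable of (student_id, last_period, dropped_out), where
               last_period is the last period observed (1-based) and
               dropped_out says whether dropout happened in that period.
    Returns a list of (student_id, period, event) rows.
    """
    rows = []
    for student_id, last_period, dropped_out in records:
        for period in range(1, last_period + 1):
            # The event occurs only in the final observed period;
            # censored students never receive event = 1.
            event = int(dropped_out and period == last_period)
            rows.append((student_id, period, event))
    return rows


if __name__ == "__main__":
    for row in person_period([("a", 3, True), ("b", 2, False)]):
        print(row)
```

Time-varying covariates, such as the school attended in each period, would be added as extra columns on these rows, which is where accounting for school changes becomes possible.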

16.

Multilevel Analysis of Repeated Measures Data

Van Der Leeden Rien 《Quality and Quantity》1998,32(1):15-29

Hierarchically structured data are common in many areas of scientific research. Such data are characterized by nested membership relations among the units of observation. Multilevel analysis is a class of methods that explicitly takes the hierarchical structure into account. Repeated measures data can be considered as having a hierarchical structure as well: measurements are nested within, for instance, individuals. In this paper, an overview is given of the multilevel analysis approach to repeated measures data. A simple application to growth curves is provided as an illustration. It is argued that multilevel analysis of repeated measures data is a powerful and attractive approach for several reasons, such as flexibility, and the emphasis on individual development.

17.

An empirical study on credit card loan delinquency

Hyeongjun Kim Hoon Cho Doojin Ryu 《Economic Systems》2018,42(3):437-449

Following the Basel II convention, consumer credit default is commonly defined as delinquency beyond a period of 90 days. In this study, rather than considering default as a binary variable, we dissect delinquency states further to investigate default behavior in greater detail. As such, we define three states (no delinquency, delinquency and serious delinquency) and estimate the probabilities of the transitions between states using extensive panel data from Korea, covering a wide range of behavioral information. Our findings have several economic implications. First, the factors that affect delinquency risk can differ from those that affect the transition from delinquency to serious delinquency. Second, the recent increase in the number of seriously delinquent accounts can be attributed to changes in the borrower age distribution. Third, macroeconomic conditions, especially differences in gross domestic product and consumption growth, have led to the recent increase in delinquent accounts. Fourth, the debt-to-income (DTI) ratio has a profound effect on transitions between delinquency states and thus affects both recovery and delinquency. Furthermore, this result is robust to controls for demographic and macroeconomic factors.
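A raw, covariate-free version of the transition probabilities between the three states can be obtained by counting consecutive state pairs in each account's history. The sketch below does only that; it is not the authors' model, which relates transitions to borrower and macroeconomic factors, and the state labels and function name are hypothetical.

```python
import numpy as np

STATES = ("none", "delinquent", "serious")


def transition_matrix(sequences, states=STATES):
    """Row-normalised counts of one-step transitions between states.

    sequences -- iterable of per-account state sequences in time order
    Rows with no observed transitions are left as zeros.
    """
    index = {state: i for i, state in enumerate(states)}
    counts = np.zeros((len(states), len(states)))
    for seq in sequences:
        for current, following in zip(seq[:-1], seq[1:]):
            counts[index[current], index[following]] += 1
    totals = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)


if __name__ == "__main__":
    histories = [
        ["none", "none", "delinquent"],
        ["delinquent", "serious"],
        ["delinquent", "none"],
    ]
    print(transition_matrix(histories))
```

Row `i`, column `j` of the result estimates the probability of moving from state `i` to state `j` in one period, so the "delinquent" row separates recovery (back to "none") from deterioration (on to "serious").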

18.

Storage and Management of Dam Visualization Data

何小苑《中国工程师》2014,(5):61-62

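Dam visualization processing involves large volumes of monitoring and analysis data, and the relationships among the results, attributes and graphics data used for observation analysis are complex. To address this, a storage and management scheme for large-scale observation data is proposed. It adopts a data management method based on a time-series index table of measuring points, which largely resolves the slow retrieval speed of data visualization, supports the flexible access required for data processing and plotting, and enables fast visualization.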

19.

Inferential Implications of Over‐Parametrization: A Case Study in Incomplete Categorical Data

Frederico Z. Poleto Carlos D. Paulino Geert Molenberghs Julio M. Singer 《Revue internationale de statistique》2011,79(1):92-113

In the context of either Bayesian or classical sensitivity analyses of over‐parametrized models for incomplete categorical data, it is well known that prior‐dependence of posterior inferences on nonidentifiable parameters, or too parsimonious over‐parametrized models, may lead to erroneous conclusions. Nevertheless, some authors either pay no attention to which parameters are nonidentifiable or do not appropriately account for possible prior‐dependence. We review the literature on this topic and consider simple examples to emphasize that in both inferential frameworks, the subjective components can influence results in nontrivial ways, irrespective of the sample size. Specifically, we show that prior distributions commonly regarded as slightly informative or noninformative may actually be too informative for nonidentifiable parameters, and that the choice of over‐parametrized models may drastically impact the results, suggesting that a careful examination of their effects should be considered before drawing conclusions.

20.

Binary response panel data models with sample selection and self‐selection

Anastasia Semykina Jeffrey M. Wooldridge 《Journal of Applied Econometrics》2018,33(2):179-197

We consider estimating binary response models on an unbalanced panel, where the outcome of the dependent variable may be missing due to nonrandom selection, or there is self‐selection into a treatment. In the present paper, we first consider estimation of sample selection models and treatment effects using a fully parametric approach, where the error distribution is assumed to be normal in both primary and selection equations. Arbitrary time dependence in errors is permitted. Estimation of both coefficients and partial effects, as well as tests for selection bias, are discussed. Furthermore, we consider a semiparametric estimator of binary response panel data models with sample selection that is robust to a variety of error distributions. The estimator employs a control function approach to account for endogenous selection and permits consistent estimation of scaled coefficients and relative effects.