20 similar documents retrieved (search time: 0 ms)
1.
Efficiency of infinite dimensional M-estimators (Cited by: 2; self-citations: 0; other citations: 2)
It is well-known that maximum likelihood estimators are asymptotically normal with covariance equal to the inverse Fisher information in smooth, finite dimensional parametric models. Thus they are asymptotically efficient. A similar phenomenon has been observed for certain infinite dimensional parameter spaces. We give a simple proof of efficiency, starting from a theorem on asymptotic normality of infinite dimensional M-estimators. The proof avoids the explicit calculation of the Fisher information. We also address Hadamard differentiability of the corresponding M-functionals.
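For reference, the finite-dimensional statement that this paper generalises is the usual asymptotic normality and efficiency of the MLE under standard regularity conditions (a reminder of the classical result, not a result of the paper):

```latex
\sqrt{n}\,\bigl(\hat\theta_n - \theta_0\bigr) \;\stackrel{d}{\longrightarrow}\; N\bigl(0,\; I_{\theta_0}^{-1}\bigr),
```

where I_{theta_0} denotes the Fisher information. The abstract's point is that the same efficiency conclusion can be reached for infinite dimensional parameters directly from asymptotic normality of M-estimators, without computing the information explicitly.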
2.
Since the work of Little and Rubin (1987), no substantial advances in the analysis of explanatory regression models for incomplete data with missing not at random have been achieved, mainly due to the difficulty of verifying the randomness of the unknown data. In practice, the analysis of nonrandom missing data is done with techniques designed for datasets with random or completely random missing data, such as complete case analysis, mean imputation, regression imputation, maximum likelihood or multiple imputation. However, the data conditions required to minimize the bias derived from an incorrect analysis have not been fully determined. In the present work, several Monte Carlo simulations have been carried out to establish the best strategy of analysis for random missing data applicable in datasets with nonrandom missing data. The factors involved in the simulations are sample size, percentage of missing data, predictive power of the imputation model and existence of interaction between predictors. The results show that the smallest bias is obtained with maximum likelihood and multiple imputation techniques, although with low percentages of missing data, absence of interaction and high predictive power of the imputation model (frequent data structures in research on child and adolescent psychopathology), acceptable results are obtained with the simplest regression imputation.
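As a rough illustration of the kind of comparison described above, the following sketch (a hypothetical setup in Python, not the paper's Monte Carlo design) contrasts complete-case analysis, mean imputation and deterministic regression imputation of a covariate that is missing at random, judging each by the slope estimate it yields:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(scale=0.6, size=n)           # the imputation model has predictive power
y = 1.0 + 0.5 * x1 + 1.5 * x2 + rng.normal(size=n)      # true slope on x2 is 1.5

# make x2 missing at random: the missingness probability depends only on the observed x1
miss = rng.uniform(size=n) < 1 / (1 + np.exp(-1.5 * x1))
x2_obs = np.where(miss, np.nan, x2)

def slope_x2(x1v, x2v, yv):
    """OLS slope on x2 in a regression of y on (1, x1, x2)."""
    X = np.column_stack([np.ones_like(x1v), x1v, x2v])
    return np.linalg.lstsq(X, yv, rcond=None)[0][2]

keep = ~miss
b_cc = slope_x2(x1[keep], x2[keep], y[keep])            # 1) complete-case analysis

x2_mean = np.where(miss, np.nanmean(x2_obs), x2_obs)    # 2) mean imputation
b_mean = slope_x2(x1, x2_mean, y)

A = np.column_stack([np.ones(keep.sum()), x1[keep]])    # 3) regression imputation of x2 from x1
a0, a1 = np.linalg.lstsq(A, x2[keep], rcond=None)[0]
x2_reg = np.where(miss, a0 + a1 * x1, x2_obs)
b_reg = slope_x2(x1, x2_reg, y)

print(f"slope on x2 (true 1.5): complete case {b_cc:.3f}, "
      f"mean imputation {b_mean:.3f}, regression imputation {b_reg:.3f}")
```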
3.
For contingency tables with extensive missing data, the unrestricted MLE under the saturated model, computed by the EM algorithm, is generally unsatisfactory. In this case, it may be better to fit a simpler model by imposing some restrictions on the parameter space. Perlman and Wu (1999) propose lattice conditional independence (LCI) models for contingency tables with arbitrary missing data patterns. When this LCI model fits well, the restricted MLE under the LCI model is more accurate than the unrestricted MLE under the saturated model, but not in general. Here we propose certain empirical Bayes (EB) estimators that adaptively combine the best features of the restricted and unrestricted MLEs. These EB estimators appear to be especially useful when the observed data is sparse, even in cases where the suitability of the LCI model is uncertain. We also study a restricted EM algorithm (called the ER algorithm) with similar desirable features.
Received: July 1999
4.
A local maximum likelihood estimator based on Poisson regression is presented, together with its bias, variance and asymptotic distribution. This semiparametric estimator is intended as an alternative to the Poisson, negative binomial and zero-inflated Poisson regression models that does not depend on regularity conditions and model specification accuracy. Some simulation results are presented. The use of the local maximum likelihood procedure is illustrated on one example from the literature. This procedure is found to perform well.
This research was partially supported by the Calouste Gulbenkian Foundation and PRODEP III.
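A minimal sketch of one common form of local maximum likelihood for counts, a kernel-weighted local-linear Poisson log-likelihood maximised around a target point x0 (the paper's exact estimator, bandwidth choice and asymptotics may differ):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0.0, 3.0, n)
y = rng.poisson(np.exp(0.5 + np.sin(x)))            # true conditional mean is nonlinear in x

def local_poisson_mean(x0, h=0.4):
    """Maximise a kernel-weighted local-linear Poisson log-likelihood around x0."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)          # Gaussian kernel weights
    z = x - x0                                      # local-linear expansion of the log-mean
    def nll(beta):
        eta = beta[0] + beta[1] * z
        return -np.sum(w * (y * eta - np.exp(eta))) # weighted Poisson log-likelihood (up to a constant)
    beta_hat = minimize(nll, np.zeros(2), method="BFGS").x
    return np.exp(beta_hat[0])                      # local estimate of E[Y | X = x0]

for x0 in np.linspace(0.5, 2.5, 5):
    print(f"x0={x0:3.1f}  mu_hat={local_poisson_mean(x0):5.2f}  true={np.exp(0.5 + np.sin(x0)):5.2f}")
```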
5.
Longitudinal data sets with the structure T (time points) × N (subjects) are often incomplete because of data missing for certain subjects at certain time points. The EM algorithm is applied in conjunction with the Kalman smoother for computing maximum likelihood estimates of longitudinal LISREL models from varying missing data patterns. The iterative procedure uses the LISREL program in the M-step and the Kalman smoother in the E-step. The application of the method is illustrated by simulating missing data on a data set from educational research.
6.
Extensions of the Cox proportional hazards model for survival data are studied where allowance is made for unobserved heterogeneity and for correlation between the life times of several individuals. The extended models are frailty models inspired by Yashin et al. (1995). Estimation is carried out using the EM algorithm. Inference is discussed and potential applications are outlined, in particular to statistical research in human genetics using twin data or adoption data, aimed at separating the effects of genetic and environmental factors on mortality.
7.
A frequently occurring problem is to find the maximum likelihood estimate (MLE) of p subject to p ∈ C, where C ⊂ P, the set of probability vectors in R^k. The problem has been discussed by many authors, who mainly focused on the case where p is restricted by linear constraints or log-linear constraints. In this paper, we establish the relationship between maximum likelihood estimation of p restricted by p ∈ C and the EM algorithm, and demonstrate that the maximum likelihood estimator can be computed through the EM algorithm (Dempster et al. in J R Stat Soc Ser B 39:1–38, 1977). Several examples are analyzed by the proposed method.
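The flavour of computing a constrained multinomial MLE via EM can be seen in the classic example of Dempster et al. (1977), in which the cell-probability vector is restricted to a one-parameter family; this is only an illustration of the general idea, not the linear or log-linear constraint setting of the paper:

```python
# Cell probabilities constrained to (1/2 + t/4, (1 - t)/4, (1 - t)/4, t/4).
y = [125, 18, 20, 34]                  # observed cell counts
theta = 0.5                            # starting value
for _ in range(25):
    # E-step: split the first cell into its two latent sub-cells with probabilities 1/2 and t/4
    x2 = y[0] * (theta / 4) / (0.5 + theta / 4)
    # M-step: complete-data (binomial) MLE of t
    theta = (x2 + y[3]) / (x2 + y[1] + y[2] + y[3])
print(round(theta, 6))                 # converges to about 0.626821
```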
8.
On the analysis of multivariate growth curves (Cited by: 1; self-citations: 0; other citations: 1)
Growth curve data arise when repeated measurements are observed on a number of individuals with an ordered dimension for occasions. Such data appear frequently in almost all fields in which statistical models are used, for instance in medicine, agriculture and engineering. In medicine, for example, more than one variable is often measured on each occasion. However, analyses are usually based on exploration of repeated measurements of only one variable. The consequence is that the information contained in the between-variables correlation structure will be discarded. In this study we propose a multivariate model based on the random coefficient regression model for the analysis of growth curve data. Closed-form expressions for the model parameters are derived under the maximum likelihood (ML) and the restricted maximum likelihood (REML) framework. It is shown that in certain situations estimated variances of growth curve parameters are greater for REML. Also a method is proposed for testing general linear hypotheses. One numerical example is provided to illustrate the methods discussed.
Received: 22 February 1999
9.
Iain L. MacDonald, Revue internationale de statistique, 2014, 82(2): 296-308
There is by now a long tradition of using the EM algorithm to find maximum‐likelihood estimates (MLEs) when the data are incomplete in any of a wide range of ways, even when the observed‐data likelihood can easily be evaluated and numerical maximisation of that likelihood is available as a conceptually simple route to the MLEs. It is rare in the literature to see numerical maximisation employed if EM is possible. But with excellent general‐purpose numerical optimisers now available free, there is no longer any reason, as a matter of course, to avoid direct numerical maximisation of likelihood. In this tutorial, I present seven examples of models in which numerical maximisation of likelihood appears to have some advantages over the use of EM as a route to MLEs. The mathematical and coding effort is minimal, as there is no need to derive and code the E and M steps, only a likelihood evaluator. In all the examples, the unconstrained optimiser nlm available in R is used, and transformations are used to impose constraints on parameters. I suggest therefore that the following question be asked of proposed new applications of EM: Can the MLEs be found more simply and directly by using a general‐purpose numerical optimiser?
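The same point can be made outside R. Below is a sketch in Python with scipy (standing in for R's nlm, which the tutorial uses) that fits a two-parameter Weibull by direct numerical minimisation of the negative log-likelihood, with a log transformation imposing the positivity constraints; only a likelihood evaluator is needed, no E or M steps:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import weibull_min

rng = np.random.default_rng(1)
data = weibull_min.rvs(c=1.5, scale=2.0, size=300, random_state=rng)  # simulated sample

def nll(log_params):
    shape, scale = np.exp(log_params)        # optimise on the log scale: both parameters > 0
    return -np.sum(weibull_min.logpdf(data, c=shape, scale=scale))

fit = minimize(nll, x0=np.zeros(2), method="BFGS")   # general-purpose unconstrained optimiser
print(np.exp(fit.x))                                 # MLEs of (shape, scale)
```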
10.
Zhiwei Zhang, Revue internationale de statistique, 2010, 78(1): 102-116
According to the law of likelihood, statistical evidence is represented by likelihood functions and its strength measured by likelihood ratios. This point of view has led to a likelihood paradigm for interpreting statistical evidence, which carefully distinguishes evidence about a parameter from error probabilities and personal belief. Like other paradigms of statistics, the likelihood paradigm faces challenges when data are observed incompletely, due to non-response or censoring, for instance. Standard methods to generate likelihood functions in such circumstances generally require assumptions about the mechanism that governs the incomplete observation of data, assumptions that usually rely on external information and cannot be validated with the observed data. Without reliable external information, the use of untestable assumptions driven by convenience could potentially compromise the interpretability of the resulting likelihood as an objective representation of the observed evidence. This paper proposes a profile likelihood approach for representing and interpreting statistical evidence with incomplete data without imposing untestable assumptions. The proposed approach is based on partial identification and is illustrated with several statistical problems involving missing data or censored data. Numerical examples based on real data are presented to demonstrate the feasibility of the approach.
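A toy version of the idea (my simplification, not the paper's construction) for a single proportion under nonresponse: leave P(Y=1 | missing) completely unrestricted, report the resulting identification region, and profile the observed-data likelihood over the nuisance parameters; the profile is flat across the identification region and falls off outside it:

```python
import numpy as np

n1, n0, m = 40, 35, 25                    # observed successes, observed failures, nonrespondents
p1_hat = n1 / (n1 + n0)                   # estimated P(Y=1 | observed)
r_hat = (n1 + n0) / (n1 + n0 + m)         # estimated response rate
lo, hi = p1_hat * r_hat, p1_hat * r_hat + (1 - r_hat)
print(f"identification region for P(Y=1): [{lo:.3f}, {hi:.3f}]")

def loglik(p1, r):                        # observed-data log-likelihood; P(Y=1|missing) drops out
    return n1 * np.log(p1) + n0 * np.log(1 - p1) + (n1 + n0) * np.log(r) + m * np.log(1 - r)

p1g, rg = np.meshgrid(np.linspace(0.001, 0.999, 400), np.linspace(0.001, 0.999, 400))
ll = loglik(p1g, rg)
for t in np.linspace(0.05, 0.95, 10):
    # profile over all (p1, r) that can yield P(Y=1) = t for some P(Y=1|missing) in [0, 1]
    feasible = (p1g * rg <= t) & (t <= p1g * rg + 1 - rg)
    print(f"theta={t:.2f}  profile log-lik={ll[feasible].max():9.3f}")
```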
11.
Eugene Demidenko, Revue internationale de statistique, 2007, 75(1): 96-113
We compare five methods for parameter estimation of a Poisson regression model for clustered data: (1) ordinary (naive) Poisson regression (OP), which ignores intracluster correlation, (2) Poisson regression with fixed cluster-specific intercepts (FI), (3) a generalized estimating equations (GEE) approach with an equi-correlation matrix, (4) an exact generalized estimating equations (EGEE) approach with an exact covariance matrix, and (5) maximum likelihood (ML). Special attention is given to the simplest case of Poisson regression with a cluster-specific random intercept, for which the asymptotic covariance matrix is obtained in closed form. We prove that methods 1–5, except GEE, produce the same estimates of slope coefficients for balanced data (an equal number of observations in each cluster and the same vectors of covariates). All five methods lead to consistent estimates of slopes but have different efficiency for unbalanced data design. It is shown that the FI approach can be derived as a limiting case of maximum likelihood when the cluster variance increases to infinity. Exact asymptotic covariance matrices are derived for each method. In terms of asymptotic efficiency, the methods split into two groups: OP & GEE and EGEE & FI & ML. Thus, contrary to the existing practice, there is no advantage in using GEE because it is substantially outperformed by EGEE and FI. In particular, EGEE does not require integration and is easy to compute, with the asymptotic variances of the slope estimates close to those of the ML.
12.
Typically, a Poisson model is assumed for count data. In many cases, however, there are many zeros in the dependent variable, so its mean is not equal to its variance; the Poisson model is therefore no longer suitable for this kind of data. Thus, we suggest using a hurdle-generalized Poisson regression model. Furthermore, the response variable in such cases is right censored because some values are very large. In this paper, a censored hurdle-generalized Poisson regression model is introduced for count data with many zeros. The estimation of the regression parameters using the maximum likelihood method is discussed and the goodness-of-fit of the regression model is examined. An example and a simulation are used to illustrate the effects of right censoring on the parameter estimates and their standard errors.
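A stripped-down sketch of maximum likelihood for a plain hurdle Poisson model (without the generalized-Poisson component, the right censoring or the covariates treated in the paper): the zero probability is estimated by the sample proportion of zeros, and the positive part by a zero-truncated Poisson likelihood:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(2)

def rtrunc_poisson(lam, size, rng):
    """Draw from a zero-truncated Poisson by simple rejection sampling."""
    out = np.zeros(size, dtype=int)
    for i in range(size):
        d = 0
        while d == 0:
            d = rng.poisson(lam)
        out[i] = d
    return out

# simulate a hurdle Poisson sample: zeros with probability pi, else zero-truncated Poisson(lam)
n, pi_true, lam_true = 1000, 0.4, 2.5
n_zero = rng.binomial(n, pi_true)
y = np.concatenate([np.zeros(n_zero, dtype=int), rtrunc_poisson(lam_true, n - n_zero, rng)])

pi_hat = np.mean(y == 0)                  # hurdle part: MLE of the zero probability
y_pos = y[y > 0]

def nll(lam):                             # zero-truncated Poisson negative log-likelihood
    return -(y_pos.sum() * np.log(lam) - y_pos.size * (lam + np.log1p(-np.exp(-lam))))

lam_hat = minimize_scalar(nll, bounds=(1e-6, 50), method="bounded").x
print(f"pi_hat={pi_hat:.3f} (true {pi_true}),  lambda_hat={lam_hat:.3f} (true {lam_true})")
```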
13.
In this article, we propose a mean linear regression model where the response variable is inverse gamma distributed, using a new parameterization of this distribution that is indexed by mean and precision parameters. The main advantage of our new parametrization is the straightforward interpretation of the regression coefficients in terms of the expectation of the positive response variable, as usual in the context of generalized linear models. The variance function of the proposed model has a quadratic form. The inverse gamma distribution is a member of the exponential family of distributions and has some distributions commonly used for parametric models in survival analysis as special cases. We compare the proposed model to several alternatives and illustrate its advantages and usefulness. With a generalized linear model approach that takes advantage of exponential family properties, we discuss model estimation (by maximum likelihood), further inferential quantities and diagnostic tools. A Monte Carlo experiment is conducted to evaluate the performance of these estimators in finite samples, with a discussion of the obtained results. A real application using a minerals data set collected by the Department of Mines of the University of Atacama, Chile, is considered to demonstrate the practical potential of the proposed model.
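One possible mean-precision parameterisation with the quadratic variance function described above (the paper's exact indexing may differ) takes shape phi + 2 and scale mu(phi + 1) in the usual inverse gamma density:

```latex
f(y;\mu,\phi) \;=\; \frac{[\mu(\phi+1)]^{\phi+2}}{\Gamma(\phi+2)}\; y^{-(\phi+3)}
\exp\!\left(-\frac{\mu(\phi+1)}{y}\right), \qquad y>0,\ \mu>0,\ \phi>0,
```

so that E(Y) = mu and Var(Y) = mu^2 / phi; a regression structure is then obtained by linking mu_i to covariates, for example g(mu_i) = x_i' beta, as in generalized linear models.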
14.
Tobias Rydén, Metrika, 1998, 47(1): 119-145
For a recursive maximum-likelihood estimator with step lengths decaying as 1/n, an adaptive matrix needs to be incorporated to obtain asymptotic efficiency. Ideally, this matrix should be chosen as the inverse Fisher information matrix, which is usually very difficult to compute for incomplete data models. In this paper we give conditions under which the observed information can be incorporated into the recursive procedure to yield an efficient estimator, and we also investigate the finite sample properties of these estimators by simulation.
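Schematically, such a recursion updates the estimate one observation at a time (generic form only; the paper's contribution lies in the precise conditions and the choice of the adaptive matrix):

```latex
\hat\theta_{n+1} \;=\; \hat\theta_n \;+\; \frac{1}{n+1}\, A_n\, s\bigl(X_{n+1};\hat\theta_n\bigr),
```

where s is the score of a single observation and A_n is the adaptive matrix, ideally the inverse Fisher information; the paper gives conditions under which the observed information can take this role for incomplete data models.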
15.
In recent years, we have seen an increased interest in the penalized likelihood methodology, which can be efficiently used for shrinkage and selection purposes. This strategy can also result in unbiased, sparse, and continuous estimators. However, the performance of the penalized likelihood approach depends on the proper choice of the regularization parameter, so it is important to select it appropriately. To this end, the generalized cross-validation method is commonly used. In this article, we first propose new estimates of the norm of the error in the generalized linear models framework, through the use of Kantorovich inequalities. These estimates are then used to derive a tuning parameter selector in penalized generalized linear models. The proposed method does not depend on resampling, as the standard methods do, and therefore results in a considerable gain in computational time while producing improved results. A thorough simulation study is conducted to support the theoretical findings, and a comparison of the penalized methods with the L1, the hard thresholding, and the smoothly clipped absolute deviation penalty functions is performed for the cases of penalized logistic regression and penalized Poisson regression. A real data example is analyzed, and a discussion follows.
16.
Spatial Economic Analysis, 2013, 8(4): 467-483
This paper discusses the maximum likelihood estimator of a general unbalanced spatial random effects model with normal disturbances, assuming that some observations are missing at random. Monte Carlo simulations show that the maximum likelihood estimator for unbalanced panels performs well and that missing observations affect mainly the root mean square error. As expected, these estimates are less efficient than those based on the unobserved balanced model, especially if the share of missing observations is large or spatial autocorrelation in the error terms is pronounced.
17.
The methodologies that have been used in existing research to assess the efficiency with which organic farms are operating are generally based either on the stochastic frontier methodology or on a deterministic non-parametric approach. Recently, Kumbhakar et al. (J Econom 137:1–27, 2007) proposed a new nonparametric, stochastic method based on the local maximum likelihood principle. We use this methodology to compare the efficiency ratings of organic and conventional arable crop farms in the Spanish region of Andalucía. Nonparametrically encompassing the stochastic frontier model is especially useful when comparing the performance of two groups that are likely to be characterized by different production technologies.
18.
The Invariant Quadratic Estimators, the Maximum Likelihood Estimator (MLE) and the Restricted Maximum Likelihood Estimator (REML) of variances in an orthogonal Finite Discrete Spectrum Linear Regression Model (FDSLRM) are derived, and the problems of unbiasedness and consistency of these estimators are investigated.
Acknowledgement: The research was supported by the grants 1/0272/03, 1/0264/03 and 2/4026/04 of the Slovak Scientific Grant Agency VEGA.
19.
Julia Plass, Marco E.G.V. Cattaneo, Thomas Augustin, Georg Schollmeyer, Christian Heumann, Revue internationale de statistique, 2019, 87(3): 580-603
In most surveys, one is confronted with missing or, more generally, coarse data. Traditional methods for dealing with these data require strong, untestable and often doubtful assumptions, for example, coarsening at random. But due to the resulting, potentially severe bias, there is a growing interest in approaches that only include tenable knowledge about the coarsening process, leading to imprecise but reliable results. In this spirit, we study regression analysis with a coarse categorical dependent variable and precisely observed categorical covariates. Our (profile) likelihood-based approach can incorporate weak knowledge about the coarsening process and thus offers a synthesis of traditional methods and cautious strategies refraining from any coarsening assumptions. This also allows a discussion of the uncertainty about the coarsening process, besides sampling uncertainty and model uncertainty. Our procedure is illustrated with data from the panel study 'Labour market and social security' conducted by the Institute for Employment Research, whose questionnaire design produces coarse data.
20.
Mixture regression models have been widely used in business, marketing and social sciences to model mixed regression relationships arising from a clustered and thus heterogeneous population. The unknown mixture regression parameters are usually estimated by maximum likelihood estimators using the expectation–maximisation algorithm based on the normality assumption of component error density. However, it is well known that the normality-based maximum likelihood estimation is very sensitive to outliers or heavy-tailed error distributions. This paper aims to give a selective overview of the recently proposed robust mixture regression methods and compare their performance using simulation studies.
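For context, here is a minimal sketch of the standard normality-based EM for a two-component mixture of linear regressions, i.e. the baseline estimator whose sensitivity to outliers and heavy tails motivates the robust alternatives the paper surveys (the simulated data and starting values are arbitrary choices for the illustration):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
n = 400
x = rng.uniform(-2, 2, n)
z = rng.uniform(size=n) < 0.5                             # latent component membership
y = np.where(z, 1.0 + 2.0 * x, -1.0 - 1.5 * x) + rng.normal(scale=0.5, size=n)
X = np.column_stack([np.ones(n), x])

pi = np.array([0.5, 0.5])                                 # mixing proportions
beta = np.array([[0.5, 1.0], [-0.5, -1.0]])               # rows: (intercept, slope) per component
sigma = np.array([1.0, 1.0])                              # component error standard deviations

for _ in range(200):
    # E-step: posterior probabilities of component membership
    dens = np.stack([pi[k] * norm.pdf(y, X @ beta[k], sigma[k]) for k in range(2)], axis=1)
    r = dens / dens.sum(axis=1, keepdims=True)
    # M-step: weighted least squares and weighted variance per component
    for k in range(2):
        W = r[:, k]
        XtW = X.T * W
        beta[k] = np.linalg.solve(XtW @ X, XtW @ y)
        resid = y - X @ beta[k]
        sigma[k] = np.sqrt(np.sum(W * resid ** 2) / W.sum())
    pi = r.mean(axis=0)

print("pi:", pi.round(2), " beta:", beta.round(2), " sigma:", sigma.round(2))
```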