Similar articles
Found 20 similar articles (search time: 46 ms)
1.
This paper develops a semi-parametric estimation method for hurdle (two-part) count regression models. The approach in each stage is based on a Laguerre series expansion for the unknown density of the unobserved heterogeneity. The semi-parametric hurdle model nests Poisson and negative binomial hurdle models, which have been used in recent applied literature. The empirical part of the paper evaluates the impact of managed care programmes for Medicaid eligibles on utilization of health-care services using a key utilization variable, the number of doctor and health centre visits. Health status measures and age seem to be more important in determining health-care utilization than other socio-economic and enrollment variables. The semi-parametric approach is particularly useful for the analysis of overdispersed individual level data characterized by a large proportion of non-users and a highly skewed distribution of counts for users. © 1997 John Wiley & Sons, Ltd.
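
As a rough illustration of the two-part structure described above, the sketch below evaluates the probability mass function of a plain parametric Poisson hurdle model, the simpler special case that the abstract says the semi-parametric model nests. The function name and the parameters `p_positive` and `lam` are hypothetical, and the Laguerre series expansion itself is not shown.

```python
import math


def hurdle_poisson_pmf(y, p_positive, lam):
    """P(Y = y) under a parametric Poisson hurdle model (illustrative only).

    p_positive: probability of crossing the hurdle, i.e. P(Y > 0) (stage one).
    lam: rate of the zero-truncated Poisson for positive counts (stage two).
    """
    if y < 0:
        return 0.0
    if y == 0:
        return 1.0 - p_positive
    # Zero-truncated Poisson: the Poisson pmf rescaled by 1 - exp(-lam).
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    return p_positive * poisson / (1.0 - math.exp(-lam))
```

Because the zero probability is set by the first stage alone, a large share of non-users can be matched without distorting the distribution of positive counts.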

2.
For Poisson inverse Gaussian regression models, it is very complicated to obtain influence measures with the traditional method, because the associated likelihood function involves intractable expressions such as the modified Bessel function. In this paper, the EM algorithm is employed as a basis to derive diagnostic measures for the models by treating them as a mixed Poisson regression with weights from the inverse Gaussian distribution. Several diagnostic measures are obtained for both the case-deletion model and local influence analysis, based on the conditional expectation of the complete-data log-likelihood function in the EM algorithm. Two numerical examples are given to illustrate the results.

3.
This paper demonstrates that the unobserved heterogeneity commonly assumed to be the source of overdispersion in count data models has predictable implications for the probability structure of such mixture models. In particular, the common observation of excess zeros is a strict implication of unobserved heterogeneity. This result has important implications for using count model estimates for predicting certain interesting parameters. Test statistics to detect such heterogeneity-related departures from the null model are proposed and applied in a health-care utilization example, suggesting that a null Poisson model should be rejected in favour of a mixed alternative. © 1997 John Wiley & Sons, Ltd.
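
The excess-zeros implication can be checked numerically in one familiar special case. The sketch below assumes gamma-distributed heterogeneity (which gives a negative binomial count); the abstract does not specify a mixing distribution, so that choice and the function names are assumptions made for illustration.

```python
import math


def poisson_zero_prob(mean):
    """P(Y = 0) for a Poisson count with the given mean."""
    return math.exp(-mean)


def gamma_mixed_zero_prob(mean, shape):
    """P(Y = 0) when the Poisson rate is Gamma(shape, scale=mean/shape).

    The resulting count is negative binomial with the same mean.
    """
    return (shape / (shape + mean)) ** shape


# Same mean, but the mixture always puts more mass on zero.
for mean in (0.5, 2.0, 10.0):
    for shape in (0.5, 1.0, 5.0):
        assert gamma_mixed_zero_prob(mean, shape) > poisson_zero_prob(mean)
```

The inequality follows from Jensen's inequality applied to the convex function exp(-rate), so it is not specific to the gamma case.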

4.
A local maximum likelihood estimator based on Poisson regression is presented as well as its bias, variance and asymptotic distribution. This semiparametric estimator is intended to be an alternative to the Poisson, negative binomial and zero-inflated Poisson regression models that does not depend on regularity conditions and model specification accuracy. Some simulation results are presented. The use of the local maximum likelihood procedure is illustrated on one example from the literature. This procedure is found to perform well. This research was partially supported by Calouste Gulbenkian Foundation and PRODEP III.

5.
We review generalized dynamic models for time series of count data. Usually temporal counts are modelled as following a Poisson distribution, and a transformation of the mean depends on parameters which evolve smoothly with time. We generalize the usual dynamic Poisson model by considering continuous mixtures of the Poisson distribution. We consider Poisson‐gamma and Poisson‐log‐normal mixture models. These models have a parameter for each time t which captures possible extra‐variation present in the data. If the time interval between observations is short, many observed zeros might result. We also propose zero inflated versions of the models mentioned above. In epidemiology, when a count is equal to zero, one does not know if the disease is present or not. Our model has a parameter which provides the probability of presence of the disease given no cases were observed. We rely on the Bayesian paradigm to obtain estimates of the parameters of interest, and discuss numerical methods to obtain samples from the resultant posterior distribution. We fit the proposed models to artificial data sets and also to a weekly time series of registered number of cases of dengue fever in a district of the city of Rio de Janeiro, Brazil, during 2001 and 2002.
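
To make the "presence given no observed cases" idea concrete, the sketch below applies Bayes' rule to a static zero-inflated Poisson model. This is a simplified reading rather than the authors' dynamic model: the parameter names `p_absent` (probability of a structural zero) and `lam` (Poisson rate when the disease is present) are hypothetical.

```python
import math


def prob_present_given_zero(p_absent, lam):
    """P(disease present | zero cases observed) in a zero-inflated Poisson.

    p_absent: probability the disease is absent (a structural zero).
    lam: Poisson rate of cases when the disease is present.
    """
    zero_if_present = math.exp(-lam)  # a zero can still occur by chance
    numerator = (1.0 - p_absent) * zero_if_present
    return numerator / (p_absent + numerator)
```

With a high rate `lam`, an observed zero is strong evidence of absence; with a low rate, it says little either way.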

6.
《Socio》1999,33(1):39-59
This paper aims to demonstrate the importance of mixture distribution modelling in analysing the characteristics of inpatient length of stay (LOS), which has direct implications for health planning and the formation of payment policy. It is found that mixture distribution analysis can confirm the homogeneity of certain Diagnosis Related Groups (DRGs). It can also reveal the heterogeneous patterns of other DRGs. For those DRGs exhibiting heterogeneity in LOS, related socio-economic factors influencing LOS are compared and contrasted between components by Poisson mixture regressions. Such an analysis provides an integrated framework to link funding with relevant influencing factors of LOS. A Poisson mixture regression model can give useful insights for state health institutions to initiate efficient casemix payments. It also helps hospital managers and clinicians to manage LOS more effectively.

7.
8.
Fifty years have passed since the publication of the first regression tree algorithm. New techniques have added capabilities that far surpass those of the early methods. Modern classification trees can partition the data with linear splits on subsets of variables and fit nearest neighbor, kernel density, and other models in the partitions. Regression trees can fit almost every kind of traditional statistical model, including least‐squares, quantile, logistic, Poisson, and proportional hazards models, as well as models for longitudinal and multiresponse data. Greater availability and affordability of software (much of which is free) have played a significant role in helping the techniques gain acceptance and popularity in the broader scientific community. This article surveys the developments and briefly reviews the key ideas behind some of the major algorithms.

9.
In this paper, we introduce a new Poisson mixture model for count panel data where the underlying Poisson process intensity is determined endogenously by consumer latent utility maximization over a set of choice alternatives. This formulation accommodates the choice and count in a single random utility framework with desirable theoretical properties. Individual heterogeneity is introduced through a random coefficient scheme with a flexible semiparametric distribution. We deal with the analytical intractability of the resulting mixture by recasting the model as an embedding of infinite sequences of scaled moments of the mixing distribution, and newly derive their cumulant representations along with bounds on their rate of numerical convergence. We further develop an efficient recursive algorithm for fast evaluation of the model likelihood within a Bayesian Gibbs sampling scheme. We apply our model to a recent household panel of supermarket visit counts. We estimate the nonparametric density of three key variables of interest (price, driving distance, and their interaction) while controlling for a range of consumer demographic characteristics. We use this econometric framework to assess the opportunity cost of time and analyze the interaction between store choice, trip frequency, search intensity, and household and store characteristics. We also conduct a counterfactual welfare experiment and compute the compensating variation for a 10%-30% increase in Walmart prices.

10.
Count data are typically modelled with a Poisson distribution. In many cases, however, the dependent variable contains many zeros, so its mean is not equal to its variance, and the Poisson model is no longer suitable. We therefore suggest using a hurdle‐generalized Poisson regression model. Furthermore, the response variable in such cases may be censored because some values are very large. This paper introduces a censored hurdle‐generalized Poisson regression model for count data with many zeros. The estimation of regression parameters using the maximum likelihood method is discussed and the goodness‐of‐fit for the regression model is examined. An example and a simulation are used to illustrate the effects of right censoring on the parameter estimates and their standard errors.

11.
A statistical test for the degree of overdispersion of count data time series based on the empirical version of the (Poisson) index of dispersion is considered. The test design relies on asymptotic properties of this index of dispersion, which in turn have been analyzed for time series stemming from a compound Poisson (Poisson‐stopped sum) INAR(1) model. This approach is extended to the popular Poisson INARCH(1) model, which exhibits unconditional overdispersion but has an (equidispersed) conditional Poisson distribution. The asymptotic distribution of the index of dispersion when applied to time series stemming from such a model is derived. These results allow us to investigate the ability of the dispersion test to discriminate between Poisson INAR(1) and INARCH(1) models. Furthermore, we consider whether the index of dispersion could be used to test the null of a Poisson INARCH(1) model against the alternative of an INARCH(1) model with additional conditional overdispersion.
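
For reference, the empirical index of dispersion is the ratio of the sample variance to the sample mean, which is close to 1 for equidispersed (Poisson-like) counts. The sketch below is a minimal version; the use of the n - 1 variance denominator is an assumption, since the abstract does not say which variant the authors use, and the asymptotic test itself is not implemented.

```python
from statistics import mean, variance


def index_of_dispersion(counts):
    """Empirical index of dispersion: sample variance divided by sample mean."""
    m = mean(counts)
    if m == 0:
        raise ValueError("index of dispersion is undefined when the mean is zero")
    # statistics.variance uses the n - 1 denominator (an assumption here).
    return variance(counts) / m
```

Values well above 1 suggest overdispersion, although for serially dependent counts the reference distribution depends on the time series model, which is the point of the paper.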

12.
This paper discusses the specification and estimation of seemingly unrelated multivariate count data models. A new model with negative binomial marginals is proposed. In contrast to a previous model based on the multivariate Poisson distribution, the new model allows for over-dispersion, a phenomenon that is frequently encountered in economic count data. Semi-parametric estimation is possible if some of the assumptions of the fully specified model are violated.

13.
A bivariate exponentiated‐exponential geometric regression model that allows negative, zero, or positive correlation is defined and studied. The model can accommodate under‐ or over‐dispersed count data. The regression model is based on the univariate exponentiated‐exponential geometric distribution, and the marginal means of the bivariate model are functions of the explanatory variables. The parameters of the bivariate regression model are estimated by using the maximum likelihood method. Some test statistics including goodness of fit are discussed. A simulation study is conducted to compare the model with the bivariate generalized Poisson regression model. One numerical data set is used to illustrate the application of the regression model.

14.
The truncated Poisson regression model is used to arrive at point and interval estimates of the size of two offender populations, i.e. drunk drivers and persons who illegally possess firearms. The dependent capture–recapture variables are constructed from Dutch police records and are counts of individual arrests for both violations. The population size estimates are derived assuming that each count is a realization of a Poisson distribution, and that the Poisson parameters are related to covariates through the truncated Poisson regression model. These assumptions are discussed in detail, and the tenability of the second assumption is assessed by evaluating the marginal residuals and performing tests for overdispersion. For the firearms example, the second assumption seems to hold well, but for the drunk drivers example there is some overdispersion. It is concluded that the method is useful, provided it is used with care.
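
The population-size idea can be sketched in its simplest form, without covariates: fit a zero-truncated Poisson rate to the observed arrest counts, then scale the number of observed individuals by the estimated probability of being observed at least once. The function names are hypothetical, and the covariate regression and interval estimates described in the abstract are omitted.

```python
import math


def fit_truncated_poisson_rate(counts, tol=1e-10):
    """MLE of the rate of a zero-truncated Poisson, found by bisection.

    Solves lam / (1 - exp(-lam)) = sample mean of the positive counts.
    """
    sample_mean = sum(counts) / len(counts)
    if sample_mean <= 1.0:
        raise ValueError("the mean of the observed counts must exceed 1")
    # The truncated mean exceeds lam, so the root lies below sample_mean.
    lo, hi = 1e-9, sample_mean
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mid / (1.0 - math.exp(-mid)) < sample_mean:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def estimate_population_size(counts):
    """Observed individuals divided by the estimated P(count > 0)."""
    lam = fit_truncated_poisson_rate(counts)
    return len(counts) / (1.0 - math.exp(-lam))
```

For example, `estimate_population_size([1, 1, 1, 2, 2, 3])` returns an estimate larger than the six observed individuals, the difference being the estimated number never arrested.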

15.
We introduce several new sports team rating models based on the gradient descent algorithm. More precisely, the models can be formulated by maximising the likelihood of match results observed using a single step of this optimisation heuristic. The proposed framework is inspired by the prominent Elo rating system, and yields an iterative version of ordinal logistic regression, as well as different variants of Poisson regression-based models. This construction makes the update equations easy to interpret, and adjusts ratings once new match results are observed. Thus, it naturally handles temporal changes in team strength. Moreover, a study of association football data indicates that the new models yield more accurate forecasts and are less computationally demanding than corresponding methods that jointly optimise the likelihood for the whole set of matches.
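
The single-step idea can be illustrated with a deliberately simplified case: a binary win/loss logistic model rather than the ordinal logistic or Poisson variants the abstract mentions. One gradient step on the log-likelihood of a single match moves each rating by the step size times (observed result minus predicted win probability), which has the same form as an Elo update. The function name and the `step` parameter are hypothetical.

```python
import math


def update_ratings(r_home, r_away, home_won, step=0.1):
    """One gradient-ascent step on the log-likelihood of a single match.

    Assumes P(home win) = sigmoid(r_home - r_away), with no draws.
    """
    expected = 1.0 / (1.0 + math.exp(-(r_home - r_away)))
    gradient = (1.0 if home_won else 0.0) - expected
    # The two ratings move by equal and opposite amounts.
    return r_home + step * gradient, r_away - step * gradient
```

Because only the two teams involved are updated after each match, the cost per result is constant, in contrast to refitting the likelihood over the whole match history.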

16.
This paper studies the determinants of repeat visiting in Uruguay, where loyal visitors are a relevant part of the total. From a statistical point of view, the number of times a visitor has been to a place constitutes count data. In this regard, available information on Uruguay presents relevant limitations. Count data are in fact reported only for those who visited the country up to five times, whereas records about the most frequent visitors are collapsed into one residual category. This implies that the classic models for count data, such as Poisson or negative binomial, cannot be applied. The paper suggests instead modelling the available part of the empirical distribution through quantile count data regression. It is a model based on measures of location rather than mean values, which allows estimating tourists’ behaviour as the number of visits increases. A set of explanatory variables related to budgetary constraints and to socioeconomic, trip-related and psychographic characteristics are used as regressors for the count data.

17.
The generalized linear mixed model (GLMM) extends classical regression analysis to non-normal, correlated response data. Because inference for GLMMs can be computationally difficult, simplifying distributional assumptions are often made. We focus on the robustness of estimators when a main component of the model, the random effects distribution, is misspecified. Results for the maximum likelihood estimators of the Poisson inverse Gaussian model are presented.

18.
Data that have a multilevel structure occur frequently across a range of disciplines, including epidemiology, health services research, public health, education and sociology. We describe three families of regression models for the analysis of multilevel survival data. First, Cox proportional hazards models with mixed effects incorporate cluster‐specific random effects that modify the baseline hazard function. Second, piecewise exponential survival models partition the duration of follow‐up into mutually exclusive intervals and fit a model that assumes that the hazard function is constant within each interval. This is equivalent to a Poisson regression model that incorporates the duration of exposure within each interval. By incorporating cluster‐specific random effects, generalised linear mixed models can be used to analyse these data. Third, after partitioning the duration of follow‐up into mutually exclusive intervals, one can use discrete time survival models that use a complementary log–log generalised linear model to model the occurrence of the outcome of interest within each interval. Random effects can be incorporated to account for within‐cluster homogeneity in outcomes. We illustrate the application of these methods using data on patients hospitalised with a heart attack and three statistical programming languages (R, SAS and Stata).
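
The data preparation behind the piecewise exponential approach can be sketched as follows: each subject's follow-up is split into one row per interval, recording the exposure time in that interval and whether the event occurred there. A Poisson regression of the event indicator on interval indicators, with log exposure as an offset, then reproduces the piecewise exponential likelihood. The function name and row layout are hypothetical, and the model fitting and random effects are not shown.

```python
def split_follow_up(time, event, cut_points):
    """Split one subject's follow-up into (interval, exposure, event) rows.

    time: observed follow-up time; event: 1 if the outcome occurred, 0 if censored.
    cut_points: ascending interval start times, beginning with 0.
    """
    rows = []
    for i, start in enumerate(cut_points):
        if time <= start:
            break  # follow-up ended before this interval began
        end = cut_points[i + 1] if i + 1 < len(cut_points) else float("inf")
        exposure = min(time, end) - start
        # The event is attributed to the interval in which follow-up ends.
        event_here = int(bool(event) and time <= end)
        rows.append((i, exposure, event_here))
    return rows
```

For a subject who has the event at time 2.5 with yearly cut points, the split yields two full event-free intervals followed by a half interval containing the event.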

19.
In this paper the application of bivariate Poisson heterogeneous models to budget data is studied. This study was motivated by inconsistencies that we encountered when univariate Poisson based models were applied to cumulative data sets. Application of a multivariate Poisson based model is a possible solution to this problem. In this paper we will study the feasibility of estimators based on these models.

20.
Many statistical problems can be formulated as discrete missing data problems (MDPs). Examples include change-point problems, capture and recapture models, sample survey with non-response, zero-inflated Poisson models, medical screening/diagnostic tests and bioassay. This paper proposes an exact non-iterative sampling algorithm to obtain independently and identically distributed (i.i.d.) samples from the posterior distribution in discrete MDPs. The new algorithm is essentially a conditional sampling, thus completely avoiding problems of convergence and slow convergence in iterative algorithms such as Markov chain Monte Carlo. Different from the general inverse Bayes formulae (IBF) sampler of Tan, Tian and Ng (Statistica Sinica, 13, 2003, 625), the implementation of the new algorithm requires neither the expectation maximization nor the sampling importance resampling algorithms. The key idea is to first utilize the sampling-wise IBF to derive the conditional distribution of the missing data given the observed data, and then to draw i.i.d. samples from the complete-data posterior distribution. We first illustrate the method with a performing example and then apply the method to contingency tables with one supplemental margin for a human immunodeficiency virus study.
