Similar literature
 20 similar documents found (search time: 343 ms)
1.
Standard model‐based small area estimates perform poorly in the presence of outliers. Sinha & Rao (2009) developed robust frequentist predictors of small area means. In this article, we present a robust Bayesian method to handle outliers in unit‐level data by extending the nested error regression model. We consider a finite mixture of normal distributions for the unit‐level error to model outliers and produce noninformative Bayes predictors of small area means. Our modelling approach generalises that of Datta & Ghosh (1991) under the normality assumption. Application of our method to a data set which is suspected to contain an outlier confirms this suspicion, correctly identifies the suspected outlier and produces robust predictors and posterior standard deviations of the small area means. Evaluation of several procedures, including the M‐quantile method of Chambers & Tzavidis (2006), via simulations shows that our proposed method is as good as other procedures in terms of bias, variability and coverage probability of confidence and credible intervals when there are no outliers. In the presence of outliers, our method and the Sinha–Rao method perform similarly, and both improve over the other methods. This superior performance of our procedure shows its dual (Bayes and frequentist) dominance, which should make it attractive to all practitioners, Bayesians and frequentists, of small area estimation.

2.
Sample surveys are widely used to obtain information about totals, means, medians and other parameters of finite populations. In many applications, similar information is desired for subpopulations such as individuals in specific geographic areas and socio‐demographic groups. When the surveys are conducted at national or similarly high levels, probability sampling can result in just a few sampling units from many subpopulations that were unplanned at the design stage. Cost considerations may also lead to low sample sizes from individual small areas. Estimating the parameters of these subpopulations with satisfactory precision and evaluating their accuracy are serious challenges for statisticians. To overcome the difficulties, statisticians resort to pooling information across the small areas via suitable model assumptions, administrative archives and census data. In this paper, we develop an array of small area quantile estimators. The novelty is the introduction of a semiparametric density ratio model for the error distribution in the unit‐level nested error regression model. In contrast, the existing methods are usually most effective when the response values are jointly normal. We also propose a resampling procedure for estimating the mean square errors of these estimators. Simulation results indicate that the new methods have superior performance when the population distributions are skewed and remain competitive otherwise.

3.
Small area estimation is a widely used indirect estimation technique for micro‐level geographic profiling. Three unit level small area estimation techniques, namely the ELL or World Bank method, empirical best prediction (EBP) and M‐quantile (MQ), can estimate micro‐level Foster, Greer, & Thorbecke (FGT) indicators (poverty incidence, gap and severity) using both unit level survey and census data. However, they use different assumptions. The effects of using model‐based unit level census data reconstructed from cross‐tabulations and having no cluster level contextual variables for models are discussed, as are effects of small area and cluster level heterogeneity. A simulation‐based comparison of ELL, EBP and MQ uses a model‐based reconstruction of 2000/2001 data from Bangladesh and compares bias and mean square error. A three‐level ELL method is applied for comparison with the standard two‐level ELL that lacks a small area level component. An important finding is that the larger number of small areas for which ELL has been able to produce sufficiently accurate estimates in comparison with EBP and MQ has been driven more by the type of census data available or utilised than by the model per se.
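
For orientation, the FGT family named in this abstract is commonly written as FGT_α = (1/N) Σ ((z − y_i)/z)^α, summed over units with y_i below the poverty line z, where α = 0, 1 and 2 give poverty incidence, gap and severity. The following is a minimal sketch of that direct computation only, assuming complete unit‐level welfare values; it is not the ELL, EBP or MQ small area procedure, and the function name `fgt_index` and the sample values are illustrative assumptions.

```python
def fgt_index(welfare, poverty_line, alpha):
    """Direct FGT index of order alpha (0: incidence, 1: gap, 2: severity).

    `welfare` is a list of unit-level income or expenditure values.
    The name and signature are hypothetical, chosen for illustration.
    """
    if poverty_line <= 0:
        raise ValueError("poverty_line must be positive")
    if not welfare:
        raise ValueError("welfare must not be empty")

    total = 0.0
    for y in welfare:
        if y < poverty_line:
            # Normalised shortfall of a poor unit, raised to the power alpha.
            total += ((poverty_line - y) / poverty_line) ** alpha
    # Non-poor units contribute zero, but still count in the denominator.
    return total / len(welfare)


# Illustrative values only: two of four units fall below a line of 100.
sample = [50.0, 80.0, 120.0, 200.0]
incidence = fgt_index(sample, 100.0, 0)
gap = fgt_index(sample, 100.0, 1)
severity = fgt_index(sample, 100.0, 2)
```

In a small area setting, the same function would be applied area by area to predicted or simulated unit‐level values rather than to the raw sample.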

4.
Small area estimation typically requires model‐based methods that depend on isolating the contribution to overall population heterogeneity associated with group (i.e. small area) membership. One way of doing this is via random effects models with latent group effects. Alternatively, one can use an M‐quantile ensemble model that assigns indices to sampled individuals characterising their contribution to overall sample heterogeneity. These indices are then aggregated to form group effects. The aim of this article is to contrast these two approaches to characterising group effects and to illustrate them in the context of small area estimation. In doing so, we consider a range of different data types, including continuous data, count data and binary response data.

5.
The effective use of spatial information in a regression‐based approach to small area estimation is an important practical issue. One approach to account for geographic information is to extend the linear mixed model to allow for spatially correlated random area effects. An alternative is to include the spatial information via a non‐parametric mixed model. Another option is geographically weighted regression, where the model coefficients vary spatially across the geography of interest. Although these approaches are useful for estimating small area means efficiently under strict parametric assumptions, they can be sensitive to outliers. In this paper, we propose robust extensions of the geographically weighted empirical best linear unbiased predictor. In particular, we introduce robust projective and predictive estimators under spatial non‐stationarity. Mean squared error estimation is performed by two analytic approaches that account for the spatial structure in the data. Model‐based simulations show that the methodology proposed often leads to more efficient estimators. Furthermore, the analytic mean squared error estimators introduced have appealing properties in terms of stability and bias. Finally, we demonstrate in the application that the new methodology is a good choice for producing estimates for average rent prices of apartments in urban planning areas in Berlin.

6.
The proportional odds model is the most widely used model when the response has ordered categories. In the case of a high‐dimensional predictor structure, the common maximum likelihood approach typically fails when all predictors are included. A boosting technique, pomBoost, is proposed to fit the model by implicitly selecting the influential predictors. The approach distinguishes between metric and categorical predictors. In the case of categorical predictors, where each predictor relates to a set of parameters, the objective is to select simultaneously all the associated parameters. In addition, the approach distinguishes between nominal and ordinal predictors. In the case of ordinal predictors, the proposed technique uses the ordering of the ordinal predictors by penalizing the difference between the parameters of adjacent categories. The technique also has a provision to consider some mandatory predictors (if any) that must be part of the final sparse model. The performance of the proposed boosting algorithm is evaluated in a simulation study and applications with respect to mean squared error and prediction error. Hit rates and false alarm rates are used to judge the performance of pomBoost for selection of the relevant predictors.

7.
A new method, called Relevant Transformation of the Inputs Network Approach, is proposed as a tool for model building. It is designed around flexibility (with nonlinear transformations of the predictors of interest), selective search within the range of possible models, out‐of‐sample forecasting ability and computational simplicity. In tests on simulated data, it shows both a high rate of successful retrieval of the data generating process, which increases with the sample size, and good performance relative to alternative procedures. A telephone service demand model is built to show how the procedure applies to real data.

8.
Without accounting for sensitive items in sample surveys, sampled units may not respond (nonignorable nonresponse) or they may respond untruthfully. There are several survey designs that address this problem and we will review some of them. In our study, we have binary data from clusters within small areas, obtained from a version of the unrelated‐question design, and the sensitive proportion is of interest for each area. A hierarchical Bayesian model is used to capture the variation in the observed binomial counts from the clusters within the small areas and to estimate the sensitive proportions for all areas. Both our example on college cheating and a simulation study show significant reductions in the posterior standard deviations of the sensitive proportions under the small‐area model as compared with an analogous individual‐area model. The simulation study also demonstrates that the estimates under the small‐area model are closer to the truth than the corresponding estimates under the individual‐area model. Finally, for small areas, we discuss many extensions to accommodate covariates, finite population sampling, multiple sensitive items and optional designs.

9.
An important application of multiple regression is predictor selection. When there are no missing values in the data, information criteria can be used to select predictors. For example, one could apply the small‐sample‐size corrected version of the Akaike information criterion (AIC), the AICC. In this article, we discuss how information criteria should be calculated when the dependent variable and/or the predictors contain missing values. In doing so, we extensively discuss and evaluate three models that can be employed to deal with the missing data, that is, to predict the missing values. The most complex model, that is, the model with all available predictors, outperforms the other models. These results also apply to more general hypotheses than predictor selection and also to structural equation modeling (SEM).
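
As a point of reference for the complete‐data case described in this abstract, the small‐sample corrected criterion is usually written as AICC = AIC + 2k(k + 1)/(n − k − 1), with AIC = 2k − 2 log L, where k is the number of estimated parameters and n the sample size. The sketch below implements only this standard complete‐data formula; it does not reproduce the article's treatment of missing values, and the function name `aicc` and the example numbers are assumptions made for illustration.

```python
def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC for a model fitted to complete data.

    log_likelihood: maximised log-likelihood of the fitted model
    k: number of estimated parameters
    n: number of observations
    The name and signature are hypothetical, chosen for illustration.
    """
    if n - k - 1 <= 0:
        raise ValueError("AICC requires n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    # The correction term vanishes as n grows relative to k.
    return aic + (2 * k * (k + 1)) / (n - k - 1)


# Illustrative comparison of two candidate predictor sets on n = 30 cases.
small_model = aicc(log_likelihood=-42.0, k=3, n=30)
large_model = aicc(log_likelihood=-41.5, k=6, n=30)
preferred = "small" if small_model < large_model else "large"
```

The candidate with the lower value is preferred; here the small gain in log‐likelihood does not offset the penalty for three extra parameters.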

10.
In this paper, we modify small area estimators, based on the unit‐level model, so that they add up to reliable higher‐level estimates of population totals. These modifications result in benchmarked small area estimators. We consider two benchmarking procedures. One is based on augmenting the unit‐level model with a suitable variable. The other one uses the calibrated weights of the direct estimators that are reliable at the higher levels. These weights are used in estimators that are based on the aggregation of the unit‐level model for each small area. The mean squared error estimators of the proposed benchmarked estimators are obtained by suitably modifying those associated with the corresponding non‐benchmarked estimators. The properties of the estimators are evaluated via simulation.

11.
This paper deals with the estimation of the mean of a spatial population. Under a design‐based approach to inference, an estimator assisted by a penalized spline regression model is proposed and studied. Proof that the estimator is design‐consistent and has a normal limiting distribution is provided. A simulation study is carried out to investigate the performance of the new estimator and its variance estimator, in terms of relative bias, efficiency, and confidence interval coverage rate. The results show that gains in efficiency over standard estimators in classical sampling theory may be impressive.

12.
Surveys usually include questions where individuals must select one of a series of possible options that can be sorted. On the other hand, multiple frame surveys are becoming a widely used method to decrease bias due to undercoverage of the target population. In this work, we propose statistical techniques for handling ordinal data coming from a multiple frame survey using complex sampling designs and auxiliary information. Our aim is to estimate proportions when the variable of interest has ordinal outcomes. Two estimators are constructed following model‐assisted generalised regression and model calibration techniques. Theoretical properties are investigated for these estimators. Simulation studies with different sampling procedures are considered to evaluate the performance of the proposed estimators in finite size samples. An application to a real survey on opinions towards immigration is also included.

13.
The authors consider the problem of estimating a conditional density by a conditional kernel density estimate when the error associated with the estimate is measured by the L1‐norm. On the basis of the combinatorial method of Devroye and Lugosi (1996), they propose a method for selecting the bandwidths adaptively and provide a theoretical justification of the approach. They use simulated data to illustrate the finite‐sample performance of their estimator.

14.
Social and economic studies are often implemented as complex survey designs. For example, multistage, unequal probability sampling designs utilised by federal statistical agencies are typically constructed to maximise the efficiency of the target domain level estimator (e.g. indexed by geographic area) within cost constraints for survey administration. Such designs may induce dependence between the sampled units, for example, when a sampling step selects geographically indexed clusters of units. A sampling‐weighted pseudo‐posterior distribution may be used to estimate the population model on the observed sample. The dependence induced between coclustered units inflates the scale of the resulting pseudo‐posterior covariance matrix, which has been shown to induce undercoverage of the credibility sets. By bridging results across Bayesian model misspecification and survey sampling, we demonstrate that the scale and shape of the asymptotic distributions differ between the pseudo‐maximum likelihood estimate (MLE), the pseudo‐posterior and the MLE under simple random sampling. Through insights from survey‐sampling variance estimation and recent advances in computational methods, we devise a correction applied as a simple and fast postprocessing step to Markov chain Monte Carlo draws of the pseudo‐posterior distribution. This adjustment projects the pseudo‐posterior covariance matrix such that the nominal coverage is approximately achieved. We make an application to the National Survey on Drug Use and Health as a motivating example and we demonstrate the efficacy of our scale and shape projection procedure on synthetic data on several common archetypes of survey designs.

15.
We review three alternative approaches to modelling survey non‐contact and refusal: multinomial, sequential, and sample selection (bivariate probit) models. We then propose a multilevel extension of the sample selection model to allow for both interviewer effects and dependency between non‐contact and refusal rates at the household and interviewer level. All methods are applied and compared in an analysis of household non‐response in the United Kingdom, using a data set with unusually rich information on both respondents and non‐respondents from six major surveys. After controlling for household characteristics, there is little evidence of residual correlation between the unobserved characteristics affecting non‐contact and refusal propensities at either the household or the interviewer level. We also find that the estimated coefficients of the multinomial and sequential models are surprisingly similar, which further investigation via a simulation study suggests is due to non‐contact and refusal having largely different predictors.

16.
This paper proposes a semiparametric method to control for ability using standardized test scores, or other item response assessments, in a regression model. The proposed method is based on a model in which the parameter of interest is invariant to monotonic transformations of ability. I show that the estimator is consistent as both the number of observations and the number of items on the test grow to infinity. I also derive conditions under which this estimator is root‐n consistent and asymptotically normal. The proposed method is easy to implement, does not impose a parametric item response model, and does not require item‐level data. I demonstrate the finite‐sample performance in a Monte Carlo study and implement the procedure for a wage regression using data from the NLSY1979.

17.
A desirable property of a forecast is that it encompasses competing predictions, in the sense that the accuracy of the preferred forecast cannot be improved through linear combination with a rival prediction. In this paper, we investigate the impact of the uncertainty associated with estimating model parameters in‐sample on the encompassing properties of out‐of‐sample forecasts. Specifically, using examples of non‐nested econometric models, we show that forecasts from the true (but estimated) data generating process (DGP) do not encompass forecasts from competing mis‐specified models in general, particularly when the number of in‐sample observations is small. Following this result, we also examine the scope for achieving gains in accuracy by combining the forecasts from the DGP and mis‐specified models.

18.
In dynamic panel regression, when the variance ratio of individual effects to disturbance is large, the system‐GMM estimator will have large asymptotic variance and poor finite sample performance. To deal with this variance ratio problem, we propose a residual‐based instrumental variables (RIV) estimator, which uses the residual from regressing Δy_{i,t−1} on as the instrument for the level equation. The RIV estimator proposed is consistent and asymptotically normal under general assumptions. More importantly, its asymptotic variance is almost unaffected by the variance ratio of individual effects to disturbance. Monte Carlo simulations show that the RIV estimator has better finite sample performance compared to alternative estimators. The RIV estimator generates less finite sample bias than difference‐GMM, system‐GMM, collapsing‐GMM and Level‐IV estimators in most cases. Under RIV estimation, the variance ratio problem is well controlled, and the empirical distribution of its t‐statistic is similar to the standard normal distribution for moderate sample sizes.

19.
We analyse the finite sample properties of maximum likelihood estimators for dynamic panel data models. In particular, we consider transformed maximum likelihood (TML) and random effects maximum likelihood (RML) estimation. We show that TML and RML estimators are solutions to a cubic first‐order condition in the autoregressive parameter. Furthermore, in finite samples both likelihood estimators might lead to a negative estimate of the variance of the individual‐specific effects. We consider different approaches taking into account the non‐negativity restriction for the variance. We show that these approaches may lead to a solution different from the unique global unconstrained maximum. In an extensive Monte Carlo study we find that this issue is non‐negligible for small values of T and that different approaches might lead to different finite sample properties. Furthermore, we find that the Likelihood Ratio statistic provides size control in small samples, albeit with low power due to the flatness of the log‐likelihood function. We illustrate these issues by modelling US state level unemployment dynamics.

20.
Parameter estimation based on the generalized method of moments (GMM) is proposed. The proposed method employs a distance between an empirical and the corresponding theoretical transform. Estimation by the empirical characteristic function (CF) is a typical example, but alternative empirical transforms are also employed, such as the empirical Laplace transform when dealing with non‐negative random variables. D‐optimal designs are discussed, whereby the arguments of the empirical transform are chosen by maximizing the determinant of the asymptotic Fisher information matrix for the resulting estimators. The methods are applied to some parametric models for which classical inference is complicated.
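
To make the idea of a distance between an empirical and a theoretical transform concrete, the sketch below computes the empirical characteristic function (1/n) Σ exp(i t x_j) at chosen arguments t and compares it with the CF of a normal model, exp(i μ t − σ² t²/2). This is only an illustration under simplifying assumptions: the normal model, the fixed grid of arguments and the unweighted squared distance are all chosen here for brevity, whereas the abstract refers to a GMM weighting and a D‐optimal choice of arguments. All function names are hypothetical.

```python
import cmath


def empirical_cf(sample, t):
    """Empirical characteristic function of `sample` evaluated at argument t."""
    if not sample:
        raise ValueError("sample must not be empty")
    return sum(cmath.exp(1j * t * x) for x in sample) / len(sample)


def normal_cf(t, mu, sigma):
    """Characteristic function of a normal distribution N(mu, sigma^2)."""
    return cmath.exp(1j * mu * t - 0.5 * (sigma * t) ** 2)


def cf_distance(sample, t_grid, mu, sigma):
    """Unweighted squared distance between empirical and model CF on t_grid.

    A GMM estimator would weight these moment conditions; identity
    weighting is an assumption made here to keep the sketch short.
    """
    return sum(
        abs(empirical_cf(sample, t) - normal_cf(t, mu, sigma)) ** 2
        for t in t_grid
    )


# Illustrative data and arguments; minimising cf_distance over (mu, sigma)
# with any numerical optimiser would give a simple CF-based estimate.
data = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.5]
grid = [0.25, 0.5, 1.0]
objective_at_guess = cf_distance(data, grid, mu=0.2, sigma=1.0)
```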
