Similar articles: 20 results found.
1.
Social and economic studies are often implemented as complex survey designs. For example, multistage, unequal probability sampling designs utilised by federal statistical agencies are typically constructed to maximise the efficiency of the target domain level estimator (e.g. indexed by geographic area) within cost constraints for survey administration. Such designs may induce dependence between the sampled units; for example, through a sampling step that selects geographically indexed clusters of units. A sampling-weighted pseudo-posterior distribution may be used to estimate the population model on the observed sample. The dependence induced between coclustered units inflates the scale of the resulting pseudo-posterior covariance matrix, which has been shown to induce undercoverage of the credibility sets. By bridging results across Bayesian model misspecification and survey sampling, we demonstrate that the scale and shape of the asymptotic distributions differ among the pseudo-maximum likelihood estimator (pseudo-MLE), the pseudo-posterior and the MLE under simple random sampling. Through insights from survey-sampling variance estimation and recent advances in computational methods, we devise a correction applied as a simple and fast postprocessing step to Markov chain Monte Carlo draws of the pseudo-posterior distribution. This adjustment projects the pseudo-posterior covariance matrix such that the nominal coverage is approximately achieved. We make an application to the National Survey on Drug Use and Health as a motivating example, and we demonstrate the efficacy of our scale and shape projection procedure on synthetic data for several common archetypes of survey designs.
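The postprocessing idea can be illustrated in one dimension: given MCMC draws from a pseudo-posterior and a target variance (e.g. a design-consistent sandwich estimate), the draws are recentred and rescaled so their spread matches the target. This is only a minimal univariate sketch of the covariance projection, with a hypothetical target value; the paper's actual adjustment operates on the full covariance matrix.

```python
import random
import statistics

def rescale_draws(draws, v_target):
    """Recentre-and-rescale MCMC draws so their variance matches a target,
    e.g. a design-consistent (sandwich) variance estimate."""
    m = statistics.fmean(draws)
    v = statistics.pvariance(draws, mu=m)
    c = (v_target / v) ** 0.5
    return [m + c * (d - m) for d in draws]

random.seed(1)
draws = [random.gauss(2.0, 1.0) for _ in range(5000)]  # pseudo-posterior draws
adj = rescale_draws(draws, v_target=2.25)              # hypothetical target variance
print(round(statistics.pvariance(adj), 2))             # → 2.25
```

By construction the adjusted draws keep the pseudo-posterior mean while their variance exactly matches the target, so credible intervals computed from them widen to approximately nominal coverage.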

2.
The effective use of spatial information in a regression-based approach to small area estimation is an important practical issue. One approach to accounting for geographic information is to extend the linear mixed model to allow for spatially correlated random area effects. An alternative is to include the spatial information via non-parametric mixed models. Another option is geographically weighted regression, where the model coefficients vary spatially across the geography of interest. Although these approaches are useful for estimating small area means efficiently under strict parametric assumptions, they can be sensitive to outliers. In this paper, we propose robust extensions of the geographically weighted empirical best linear unbiased predictor. In particular, we introduce robust projective and predictive estimators under spatial non-stationarity. Mean squared error estimation is performed by two analytic approaches that account for the spatial structure in the data. Model-based simulations show that the proposed methodology often leads to more efficient estimators. Furthermore, the analytic mean squared error estimators introduced have appealing properties in terms of stability and bias. Finally, we demonstrate in an application that the new methodology is a good choice for producing estimates of average rents for apartments in urban planning areas in Berlin.

3.
When sensitive issues are surveyed, collecting truthful data and obtaining reliable estimates of population parameters are persistent problems in many fields of applied research, notably in sociological, economic, demographic, ecological and medical studies. In this context, and starting from the so-called negative survey, we consider the problem of estimating the proportion of population units belonging to the categories of a sensitive variable when the collected data are affected by measurement errors produced by untruthful responses. An extension of the negative survey approach is proposed herein in order to allow respondents to release a true response. The proposal rests on modelling the released data as a mixture of truthful and untruthful responses, which allows researchers to obtain an estimate of the proportions, as well as of the probability of receiving a true response, by implementing the EM algorithm. We describe the estimation procedure and carry out a simulation study to assess the performance of the EM estimates vis-à-vis certain benchmark values and the estimates obtained under the traditional data-collection approach based on direct questioning, which ignores the presence of misreporting due to untruthful responding. Simulation findings provide evidence on the accuracy of the estimates and permit us to appreciate the improvements that our approach can produce in public surveys, particularly in election opinion polls, when the hidden vote problem is present.
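A mixture of truthful and untruthful responses lends itself to a standard EM scheme. The toy sketch below assumes that with probability LAM a respondent answers truthfully and otherwise gives a negative-survey answer (a uniformly chosen category they do not belong to); for identifiability in this simplified setting the truthfulness probability is treated as known, whereas the paper also estimates it. All category counts and parameter values are illustrative, not from the article.

```python
import random
from collections import Counter

K = 3                       # number of categories
TRUE_PI = [0.5, 0.3, 0.2]   # hypothetical true proportions
LAM = 0.7                   # probability of a truthful response (assumed known here)

random.seed(42)

def respond():
    true = random.choices(range(K), weights=TRUE_PI)[0]
    if random.random() < LAM:
        return true                                           # truthful answer
    return random.choice([j for j in range(K) if j != true])  # negative-survey answer

counts = Counter(respond() for _ in range(20000))

# EM over the latent pair (truthful?, true category)
pi = [1.0 / K] * K
for _ in range(300):
    new = [0.0] * K
    for j in range(K):
        qj = LAM * pi[j] + (1 - LAM) * sum(pi[i] for i in range(K) if i != j) / (K - 1)
        new[j] += counts[j] * LAM * pi[j] / qj                # truthful: true = j
        for i in range(K):
            if i != j:                                        # untruthful: true = i != j
                new[i] += counts[j] * (1 - LAM) * pi[i] / (K - 1) / qj
    total = sum(new)
    pi = [v / total for v in new]

print([round(p, 2) for p in pi])
```

With 20,000 simulated responses the EM estimates land close to the generating proportions, illustrating how the mixture model corrects the raw observed frequencies for untruthful responding.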

4.
In this paper, we modify small area estimators based on the unit-level model so that they add up to reliable higher-level estimates of population totals. These modifications result in benchmarked small area estimators. We consider two benchmarking procedures. One is based on augmenting the unit-level model with a suitable variable. The other uses the calibrated weights of the direct estimators that are reliable at the higher levels; these weights enter estimators based on the aggregation of the unit-level model for each small area. The mean squared error estimators of the proposed benchmarked estimators are obtained by suitably modifying those associated with the corresponding non-benchmarked estimators. The properties of the estimators are evaluated via simulation.
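The constraint that benchmarking enforces is easy to see with a simple ratio adjustment: scale the model-based area estimates so that their share-weighted mean reproduces the reliable direct estimate at the higher level. This is an illustrative sketch of the benchmarking constraint only, not either of the paper's two procedures, and all numbers are hypothetical.

```python
def ratio_benchmark(area_est, area_shares, direct_estimate):
    """Scale model-based small area estimates so their share-weighted mean
    matches a reliable direct estimate at the higher level."""
    agg = sum(w * e for w, e in zip(area_shares, area_est))
    factor = direct_estimate / agg
    return [factor * e for e in area_est]

est = [10.0, 12.0, 8.0]    # hypothetical model-based area means
shares = [0.5, 0.3, 0.2]   # population shares of the areas
bench = ratio_benchmark(est, shares, direct_estimate=10.0)
print(round(sum(w * b for w, b in zip(shares, bench)), 6))  # → 10.0
```

After the adjustment the areas' weighted aggregate equals the direct higher-level estimate exactly, which is what "adding up" means in the benchmarking literature.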

5.
Small area estimation is a widely used indirect estimation technique for micro-level geographic profiling. Three unit-level small area estimation techniques, namely the ELL or World Bank method, empirical best prediction (EBP) and M-quantile (MQ), can estimate micro-level Foster, Greer and Thorbecke (FGT) indicators (poverty incidence, gap and severity) using both unit-level survey and census data. However, they rely on different assumptions. The effects of using model-based unit-level census data reconstructed from cross-tabulations, and of having no cluster-level contextual variables in the models, are discussed, as are the effects of small area and cluster-level heterogeneity. A simulation-based comparison of ELL, EBP and MQ uses a model-based reconstruction of 2000/2001 data from Bangladesh and compares bias and mean squared error. A three-level ELL method is applied for comparison with the standard two-level ELL, which lacks a small area level component. An important finding is that the larger number of small areas for which ELL has been able to produce sufficiently accurate estimates, in comparison with EBP and MQ, has been driven more by the type of census data available or utilised than by the model per se.
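The FGT indicators mentioned above share a single formula: for poverty line z and incomes y, the index averages ((z - y)/z)^alpha over the poor, with alpha = 0 giving incidence, 1 the gap and 2 severity. A minimal sketch with made-up incomes:

```python
def fgt(incomes, z, alpha):
    """Foster-Greer-Thorbecke index: alpha = 0 gives poverty incidence,
    alpha = 1 the poverty gap, alpha = 2 poverty severity."""
    return sum(((z - y) / z) ** alpha for y in incomes if y < z) / len(incomes)

y = [3, 5, 6, 8, 12, 20]         # hypothetical incomes, poverty line z = 10
print(round(fgt(y, 10, 0), 3),   # incidence: 4 of 6 fall below the line
      round(fgt(y, 10, 1), 3),   # gap: mean shortfall as a share of z
      round(fgt(y, 10, 2), 3))   # severity: squared shortfalls
```

Raising alpha weights the poorest more heavily, which is why the three indicators together summarise both how many are poor and how poor they are.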

6.
Standard model-based small area estimates perform poorly in the presence of outliers. Sinha & Rao (2009) developed robust frequentist predictors of small area means. In this article, we present a robust Bayesian method to handle outliers in unit-level data by extending the nested error regression model. We consider a finite mixture of normal distributions for the unit-level error to model outliers and produce noninformative Bayes predictors of small area means. Our modelling approach generalises that of Datta & Ghosh (1991) under the normality assumption. Application of our method to a data set suspected to contain an outlier confirms this suspicion, correctly identifies the suspected outlier and produces robust predictors and posterior standard deviations of the small area means. Evaluation of several procedures, including the M-quantile method of Chambers & Tzavidis (2006), via simulations shows that our proposed method is as good as the other procedures in terms of bias, variability and coverage probability of confidence and credible intervals when there are no outliers. In the presence of outliers, our method and the Sinha-Rao method perform similarly, and both improve over the other methods. This superior performance shows the dual (Bayes and frequentist) dominance of our procedure, which should make it attractive to all practitioners of small area estimation, Bayesian and frequentist alike.

7.
Most empirical studies of individual migration choice analyse factors associated with out-migration from an origin location. In contrast, we model the migration decision within the context of potential destinations, combining British panel data over the period 1992-2008 with other data sources. Contrary to earlier micro studies, we show that differences in house price levels (but not growth) are important determinants of household migration for homeowners. Unemployed individuals respond to regional differences in expected individual wages, whereas the employed are more sensitive to employment opportunities. Our evidence is consistent with partners of heads of households being tied migrants.

8.
Deep and persistent disadvantage is an important, but statistically rare, phenomenon in the population, and sample sizes are usually not large enough to provide reliable estimates for disaggregated analysis. Survey samples are typically designed to produce estimates of population characteristics for planned areas, with sample sizes calculated so that the survey estimator for each planned area achieves a desired level of precision. However, in many instances, estimators are required for areas of the population for which the survey providing the data was unplanned. For such areas with small sample sizes, direct estimation of population characteristics based only on the data available from the particular area tends to be unreliable. This has led to the development of a class of indirect estimators that make use of information from related areas through modelling: a model links similar areas to enhance estimation for unplanned areas; in other words, the estimators borrow strength from the other areas. Doing so improves the precision of estimated characteristics, especially in areas with smaller sample sizes. Social science researchers have increasingly employed small area estimation to provide localised estimates of population characteristics from surveys. We explore how to extend this approach within the context of deep and persistent disadvantage in Australia. We find that, because of the unique circumstances of the Australian population distribution, direct estimates of disadvantage show substantial variation, but applying small area estimation yields significant improvements in the precision of the estimates.

9.
Factor analysis models are used in data dimensionality reduction problems where the variability among observed variables can be described through a smaller number of unobserved latent variables. This approach is often used to estimate the multidimensionality of well-being. We employ factor analysis models and use the multivariate empirical best linear unbiased predictor (EBLUP) under a unit-level small area estimation approach to predict a vector of means of factor scores representing well-being for small areas. We compare this approach with the standard approach, whereby small area estimation (univariate and multivariate) is used to estimate a dashboard of EBLUPs of the means of the original variables, which are then averaged. Our simulation study shows that the use of factor scores provides estimates with lower variability than weighted and simple averages of standardised multivariate EBLUPs and univariate EBLUPs. Moreover, we find that when the correlation in the observed data is taken into account before small area estimates are computed, multivariate modelling does not provide large improvements in the precision of the estimates over univariate modelling. We close with an application using the European Union Statistics on Income and Living Conditions data.

10.
Small area estimation is concerned with methodology for estimating population parameters associated with a geographic area defined by a cross-classification that may also include non-geographic dimensions. In this paper, we develop constrained estimation methods for small area problems: those requiring smoothness with respect to similarity across areas, such as geographic proximity or clustering by covariates, and benchmarking constraints, requiring weighted means of estimates to agree across levels of aggregation. We develop methods for constrained estimation decision theoretically and discuss their geometric interpretation. The constrained estimators are the solutions to tractable optimisation problems and have closed-form solutions. Mean squared errors of the constrained estimators are calculated via bootstrapping. Our approach assumes the Bayes estimator exists and is applicable to any proposed model. In addition, we give special cases of our techniques under certain distributional assumptions. We illustrate the proposed methodology using web-scraped data on Berlin rents aggregated over areas to ensure privacy.

11.
This paper presents a model for the heterogeneity and dynamics of the conditional mean and conditional variance of individual wages. A bias-corrected likelihood approach, which reduces the estimation bias to a term of order 1/T², is used for estimation and inference. The small-sample performance of the proposed estimator is investigated in a Monte Carlo study. The simulation results show that the bias of the maximum likelihood estimator is substantially corrected for designs calibrated to the data used in the empirical analysis, drawn from the PSID. The empirical results show that it is important to account for individual unobserved heterogeneity and dynamics in the variance, and that the latter is driven by job mobility. The model also explains the non-normality observed in log-wage data. Copyright © 2010 John Wiley & Sons, Ltd.

12.
We consider the recent two-step estimator of Iaryczower and Shum (American Economic Review 2012; 102: 202-237), who analyze voting decisions of US Supreme Court justices. Motivated by the underlying theoretical voting model, we suggest that where the data under consideration display variation in the common prior, estimates of the structural parameters based on their methodology should generally benefit from including interaction terms between individual and time covariates in the first stage whenever there is individual heterogeneity in expertise. We show numerically, via simulation and re-estimation of the US Supreme Court data, that the first-order interaction effects that appear in the theoretical model can have important empirical implications. Copyright © 2015 John Wiley & Sons, Ltd.

13.
In the areas of missing data and causal inference, there is great interest in doubly robust (DR) estimators that involve both an outcome regression (RG) model and a propensity score (PS) model. These DR estimators are consistent and asymptotically normal if either model is correctly specified. Despite their theoretical appeal, the practical utility of DR estimators has been disputed (e.g. Kang and Schafer, Statistical Science 2007; 22: 523-539). One major concern is the possibility of erratic estimates resulting from near-zero denominators due to extreme values of the estimated PS. In contrast, the usual RG estimator based on the RG model alone is efficient when the RG model is correct and generally more stable than the DR estimators, although it can be biased when the RG model is incorrect. In light of the unique advantages of the RG and DR estimators, we propose a class of hybrid estimators that attempt to strike a reasonable balance between the two. These hybrid estimators are motivated by heuristic arguments that coarsened PS estimates are less likely to take extreme values and are less sensitive to misspecification of the PS model than the original model-based PS estimates. The proposed estimators are compared with existing estimators in simulation studies and illustrated with real data from a large observational study on obstetric labour progression and birth outcomes.
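The coarsening heuristic can be sketched directly: replace each estimated propensity score by the mean score of its quantile bin, which keeps inverse-probability weights away from near-zero denominators. This toy example (not the authors' hybrid estimator, and with entirely simulated data) applies the coarsened scores in a weighted Hajek-type mean of treated outcomes.

```python
import random
import statistics

def coarsen(ps, n_bins=5):
    """Replace each estimated propensity score by the mean score of its
    quantile bin, damping extreme values."""
    srt = sorted(ps)
    cuts = [srt[len(srt) * k // n_bins] for k in range(1, n_bins)]
    labels = [sum(p >= c for c in cuts) for p in ps]
    means = {b: statistics.fmean([p for p, l in zip(ps, labels) if l == b])
             for b in set(labels)}
    return [means[b] for b in labels]

random.seed(3)
n = 5000
ps = [random.uniform(0.1, 0.9) for _ in range(n)]    # "estimated" scores
t = [1 if random.random() < p else 0 for p in ps]    # treatment indicator
y = [1.0 + ti + random.gauss(0, 1) for ti in t]      # outcome with E[Y(1)] = 2
pc = coarsen(ps)
est = (sum(ti * yi / p for ti, yi, p in zip(t, y, pc))
       / sum(ti / p for ti, p in zip(t, pc)))        # Hajek-type IPW mean
print(round(est, 2))
```

Because every weight uses a bin-mean score, no single unit can dominate the estimator through an extreme fitted probability, which is the stability property the hybrid estimators exploit.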

14.
Previous empirical studies of individual union status in Britain have been cross-sectional. In contrast, we use longitudinal data from the National Child Development Study to estimate the determinants of male trade union membership over the period 1981-1991. As suggested by union theories, we find that it is important to control for unobserved individual heterogeneity, and our preferred model allows for correlation of individual heterogeneity with observable variables. Our estimates reveal that the observed decline in very large workplaces, and the contraction of the public sector, explain about one third of the predicted decline in union membership over the period. Copyright © 2000 John Wiley & Sons, Ltd.

15.
Better understanding respondents' cognitions as they respond to situational judgment test (SJT) items and isolating which elements of knowledge they measure may allow psychologists to develop more predictive SJT items with greater ease. Consequently, we present a theoretical framework outlining the thought processes individuals engage in as they respond to SJT items. We review interactionist theories explaining how these models have shaped the understanding of how personality traits affect behavior and discuss the recent scholarly debate regarding the role of the situation in SJTs. We then describe our proposed tripartite model of the psychological processes test takers may engage in as they respond to SJT items. Finally, we conclude by discussing directions for future research and potential avenues for expanding the proposed model.

16.
Copulas are distributions with uniform marginals. Non-parametric copula estimates may violate the uniformity condition in finite samples. We examine whether it is possible to obtain valid piecewise linear copula densities by triangulation. The copula property imposes strict constraints on design points, making an equi-spaced grid a natural starting point. However, the mixed-integer nature of the problem makes a pure triangulation approach impractical on fine grids. As an alternative, we study ways of approximating copula densities with triangular functions that guarantee the estimator is a valid copula density. The family of resulting estimators can be viewed as a non-parametric MLE of B-spline coefficients on possibly non-equally spaced grids under simple linear constraints. As such, it can be easily computed using standard convex optimization tools and allows for a degree of localization. A simulation study shows attractive performance of the estimator in small samples and compares it with some of the leading alternatives. We demonstrate the empirical relevance of our approach in three applications. In the first, we investigate how the body mass index of children depends on that of their parents. In the second, we construct a bivariate copula underlying the Gibson paradox from macroeconomics. In the third, we show the benefit of using our approach to test the null of independence against the alternative of an arbitrary dependence pattern.

17.
Sample surveys are widely used to obtain information about totals, means, medians and other parameters of finite populations. In many applications, similar information is desired for subpopulations, such as individuals in specific geographic areas and socio-demographic groups. When surveys are conducted at national or similarly high levels, probability sampling can yield just a few sampled units from many subpopulations that were unplanned at the design stage. Cost considerations may also lead to low sample sizes from individual small areas. Estimating the parameters of these subpopulations with satisfactory precision and evaluating their accuracy are serious challenges for statisticians. To overcome these difficulties, statisticians pool information across the small areas via suitable model assumptions, administrative archives and census data. In this paper, we develop an array of small area quantile estimators. The novelty is the introduction of a semiparametric density ratio model for the error distribution in the unit-level nested error regression model. In contrast, the existing methods are usually most effective when the response values are jointly normal. We also propose a resampling procedure for estimating the mean squared errors of these estimators. Simulation results indicate that the new methods have superior performance when the population distributions are skewed and remain competitive otherwise.

18.
Comparing occurrence rates of events of interest in science, business, and medicine is an important topic. Because count data are often under-reported, we desire to account for this error in the response when constructing interval estimators. In this article, we derive a Bayesian interval for the difference of two Poisson rates when counts are potentially under-reported. The under-reporting causes a lack of identifiability. Here, we use informative priors to construct a credible interval for the difference of two Poisson rate parameters with under-reported data. We demonstrate the efficacy of our new interval estimates using a real data example. We also investigate the performance of our newly derived Bayesian approach via simulation and examine the impact of various informative priors on the new interval.
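The mechanics of such an interval are easy to sketch in a simplified case. If the reporting probabilities are treated as known constants (the paper instead places informative priors on them to handle the non-identifiability), then an observed count y ~ Poisson(lambda·p·T) with a Gamma(a, b) prior on the rate gives a Gamma posterior, and a Monte Carlo credible interval for the rate difference follows directly. All counts, exposures and prior settings below are hypothetical.

```python
import random

def rate_diff_interval(y1, t1, p1, y2, t2, p2, a=0.5, b=0.001, n=20000, seed=0):
    """Monte Carlo 95% credible interval for lambda1 - lambda2 when counts are
    under-reported with known reporting probabilities p1, p2.
    Gamma(a, b) prior on each rate; posterior is Gamma(a + y, b + p * t)."""
    rng = random.Random(seed)
    diffs = sorted(
        rng.gammavariate(a + y1, 1 / (b + p1 * t1)) -
        rng.gammavariate(a + y2, 1 / (b + p2 * t2))
        for _ in range(n)
    )
    return diffs[int(0.025 * n)], diffs[int(0.975 * n)]

lo, hi = rate_diff_interval(y1=30, t1=100, p1=0.8, y2=18, t2=100, p2=0.9)
print(round(lo, 3), round(hi, 3))
```

Dividing the observed counts by p·T (30/80 = 0.375 versus 18/90 = 0.2) suggests a rate difference around 0.175, and the simulated interval brackets that value; replacing the fixed p's with Beta priors is where the paper's informative-prior machinery comes in.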

19.
In this article, we propose a new method for estimating the randomisation (design-based) mean squared error (DMSE) of model-dependent small area predictors. Analogously to classical survey sampling theory, the DMSE considers the finite population values as fixed numbers and accounts for the MSE of small area predictors over all possible sample selections. The proposed method models the true DMSE, as computed for synthetic populations and samples drawn from them, as a function of known statistics, and then applies the model to the original sample. Several simulation studies for the linear area-level model and the unit-level mixed logistic model illustrate the performance of the proposed method and compare it with the performance of other DMSE estimators proposed in the literature.

20.
Microeconomic data often have within-cluster dependence, which affects standard error estimation and inference. When the number of clusters is small, asymptotic tests can be severely oversized. In the instrumental variables (IV) model, the potential presence of weak instruments further complicates hypothesis testing. We use wild bootstrap methods to improve inference in two empirical applications with these characteristics. Building from estimating equations and residual bootstraps, we identify variants robust to the presence of weak instruments and a small number of clusters. These variants reduce absolute size bias significantly, demonstrating that the wild bootstrap should join the standard toolkit in IV and cluster-dependent models.
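The core of a wild cluster bootstrap is simple: impose the null, then flip the sign of each cluster's entire residual block with a Rademacher draw and re-estimate. The sketch below does this for a bivariate OLS slope, comparing raw slope magnitudes rather than studentised statistics for brevity; it is a toy illustration with simulated data, not the paper's IV variants.

```python
import random

def ols_slope(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def wild_cluster_p(x, y, cluster, b_reps=999, seed=7):
    """Wild cluster bootstrap p-value for H0: slope = 0, using Rademacher
    weights drawn per cluster and residuals restricted under the null."""
    rng = random.Random(seed)
    beta = ols_slope(x, y)
    ybar = sum(y) / len(y)
    resid0 = [yi - ybar for yi in y]      # residuals with slope = 0 imposed
    groups = sorted(set(cluster))
    exceed = 0
    for _ in range(b_reps):
        w = {g: rng.choice([-1.0, 1.0]) for g in groups}
        ystar = [ybar + w[g] * e for g, e in zip(cluster, resid0)]
        if abs(ols_slope(x, ystar)) >= abs(beta):
            exceed += 1
    return (exceed + 1) / (b_reps + 1)

random.seed(7)
x = [random.gauss(0, 1) for _ in range(40)]
y = [0.1 * xi + random.gauss(0, 1) for xi in x]
g = [i // 8 for i in range(40)]           # 5 clusters of 8 observations
p = wild_cluster_p(x, y, g)
print(round(p, 3))
```

Because the sign flips are drawn at the cluster level, the bootstrap samples preserve arbitrary within-cluster dependence, which is what keeps test size close to nominal when only a handful of clusters are available.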
