Similar articles
 Found 20 similar articles; search took 265 ms
1.
In survey sampling, auxiliary information on the population is often available. The aim of this paper is to develop a method which allows one to take into account such auxiliary information at the estimation stage by means of conditional bias adjustment. The basic idea is to attempt to construct a conditionally unbiased estimator. Four estimators that have a small conditional bias with respect to a statistic are proposed. It is shown that many of the estimators used in the literature in the case of simple random sampling can be obtained by using this estimation principle. The problems of simple random sampling with replacement, poststratification, and adjustment of a 2 × 2 contingency table to marginal totals are discussed in the conditional framework. Finally it is shown that the regression estimator can be viewed as an approximation of an application of the conditional principle.

2.
We establish the inferential properties of the mean-difference estimator for the average treatment effect in randomised experiments where each unit in a population is randomised to one of two treatments and then units within treatment groups are randomly sampled. The properties of this estimator are well understood in the experimental design scenario where first units are randomly sampled and then treatment is randomly assigned but not for the aforementioned scenario where the sampling and treatment assignment stages are reversed. We find that the inferential properties of the mean-difference estimator under this experimental design scenario are identical to those under the more common sample-first-randomise-second design. This finding will bring some clarifications about sampling-based randomised designs for causal inference, particularly for settings where there is a finite super-population. Finally, we explore to what extent pre-treatment measurements can be used to improve upon the mean-difference estimator for this randomise-first-sample-second design. Unfortunately, we find that pre-treatment measurements are often unhelpful in improving the precision of average treatment effect estimators under this design, unless a large number of pre-treatment measurements that are highly associated with the post-treatment measurements can be obtained. We confirm these results using a simulation study based on a real experiment in nanomaterials.
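
As a rough illustration of the mean-difference estimator mentioned in this abstract, the following Python sketch computes the difference in group means together with a standard error. The standard error uses Neyman's conservative textbook formula, which is an assumption made here for illustration and not necessarily the variance expression derived in the paper; the function name and the data are made up.

```python
from math import sqrt
from statistics import mean, variance


def mean_difference(treated, control):
    """Return (estimate, standard error) for the difference in group means."""
    estimate = mean(treated) - mean(control)
    # Neyman's conservative variance estimator: s1^2 / n1 + s0^2 / n0,
    # where s^2 is the sample variance within each treatment group.
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return estimate, se


# Made-up outcomes for two small treatment groups.
treated = [5.1, 6.3, 5.8, 6.0]
control = [4.2, 4.9, 5.0, 4.5]
est, se = mean_difference(treated, control)
print(f"estimated effect = {est:.3f}, standard error = {se:.3f}")
```

The same function applies whether units were sampled first and then randomised or the other way round; only the interpretation of the standard error depends on the design.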

3.
A formula is presented for an unbiased estimator for the variance of an unbiased estimator of a survey population total as well as for an unbiased estimator of its variance based on sampling in two stages, following the scheme of Rao et al. (J Roy Stat Soc B 24: 482–491, 1962) in both stages, when the originally selected units in both stages cannot be fully covered in the survey but are to be randomly sub-sampled. The development is helpful in tackling non-response if it is assumed to have occurred at random in either or both stages.

4.
We study the generalized bootstrap technique under general sampling designs. We focus mainly on bootstrap variance estimation but we also investigate the empirical properties of bootstrap confidence intervals obtained using the percentile method. Generalized bootstrap consists of randomly generating bootstrap weights so that the first two (or more) design moments of the sampling error are tracked by the corresponding bootstrap moments. Most bootstrap methods in the literature can be viewed as special cases. We discuss issues such as the choice of the distribution used to generate bootstrap weights, the choice of the number of bootstrap replicates, and the potential occurrence of negative bootstrap weights. We first describe the generalized bootstrap for the linear Horvitz‐Thompson estimator and then consider non‐linear estimators such as those defined through estimating equations. We also develop two ways of bootstrapping the generalized regression estimator of a population total. We study in greater depth the case of Poisson sampling, which is often used to select samples in Price Index surveys conducted by national statistical agencies around the world. For Poisson sampling, we consider a pseudo‐population approach and show that the resulting bootstrap weights capture the first three design moments of the sampling error. A simulation study and an example with real survey data are used to illustrate the theory.
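
For orientation, the Horvitz-Thompson estimator under Poisson sampling, the setting this abstract studies in most depth, can be sketched in a few lines of Python. This shows only the point estimator and the standard analytic variance estimator for independent selections; it is not the generalized bootstrap itself, and all function names and numbers are illustrative assumptions.

```python
import random


def poisson_sample(pi, rng):
    """Select each unit i independently with inclusion probability pi[i]."""
    return [i for i, p in enumerate(pi) if rng.random() < p]


def ht_total(y, pi, sample):
    """Horvitz-Thompson estimator of the population total of y."""
    return sum(y[i] / pi[i] for i in sample)


def ht_variance_poisson(y, pi, sample):
    """Unbiased variance estimator of the HT total under Poisson sampling.

    Because selections are independent, no joint inclusion
    probabilities are needed.
    """
    return sum((1 - pi[i]) / pi[i] ** 2 * y[i] ** 2 for i in sample)


# Made-up population values and inclusion probabilities.
y = [10.0, 20.0, 15.0, 40.0]
pi = [0.5, 0.25, 0.5, 0.8]
s = poisson_sample(pi, random.Random(1))
print(s, ht_total(y, pi, s), ht_variance_poisson(y, pi, s))
```

A bootstrap variance estimate for the same design would be compared against `ht_variance_poisson`, which serves as the analytic benchmark in this simple case.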

5.
Counting the number of units is not always practical during the sampling of particulate materials: it is often much easier to sample a fixed volume or fixed mass of particles. Hence, a class of sampling designs is proposed which leads to samples that have approximately a constant mass or a constant volume. For these sampling designs, estimators were derived which are a ratio of arbitrary sample totals. A Taylor expansion was used to obtain a first-order approximation for the expected value and variance in the limit of a large batch-to-sample size ratio. Furthermore, a π-estimator for a ratio of batch totals was found by deriving expressions for the first- and second-order inclusion probabilities. Practical application of the π-estimator is limited because it requires inaccessible batch information. However, when the denominator of the estimated batch ratio is the batch size, the π-estimator becomes equal to a sample total divided by the sample size in the limit of a large sample-to-particle size ratio. As a consequence, the obtained sample ratio becomes an unbiased estimator for the corresponding batch ratio. Retaining unbiasedness, the Horvitz–Thompson estimator for the variance, which also contains inaccessible batch information, is replaced by an estimator containing sample information only. Practical application of this estimator is illustrated for the sampling of slag, produced during the production of steel.

6.
In this paper, an alternative sampling procedure that is a mixture of simple random sampling and systematic sampling is proposed. It results in uniform inclusion probabilities for all individual units and positive inclusion probabilities for all pairs of units. As a result, the proposed sampling procedure enables us to estimate the population mean unbiasedly using the ordinary sample mean, and to provide an unbiased estimator of its sampling variance. It is also found that the suggested sampling procedure performs well, especially when the size of the simple random sample is small. Received August 2001

7.
Ove Frank, Metrika, 1970, 16(1): 32-42
Summary: Statistical problems in connection with classified data, stratified sampling, cluster sampling and two-stage sampling may be formulated in terms of overlapping subpopulations instead of disjoint classes, strata, clusters or primary sampling units. The introduction in section 1 serves to unify the notation to be used and to exemplify the type of problems that are to be generalized. Samples where the units are classified into overlapping classes are studied in section 2. The applicability is illustrated with an estimation problem in connection with destructive tests. Section 3 treats sampling from overlapping strata and estimation of the sizes of the intersections. Section 4 discusses problems in conjunction with sampling of overlapping clusters. Graphs or networks representing populations with binary relationships are used to exemplify sampling of overlapping clusters. Section 5 is devoted to some examples of two-stage sampling where the primary sampling units are overlapping subsets.

8.
Calibration Estimation in Survey Sampling   Total citations: 1, self-citations: 0, citations by others: 1
Calibration estimation, where the sampling weights are adjusted to make certain estimators match known population totals, is commonly used in survey sampling. The generalized regression estimator is an example of a calibration estimator. Given the functional form of the calibration adjustment term, we establish the asymptotic equivalence between the functional-form calibration estimator and an instrumental variable calibration estimator where the instrumental variable is directly determined from the functional form in the calibration equation. Variance estimation based on linearization is discussed and applied to some recently proposed calibration estimators. The results are extended to the estimator that is a solution to the calibrated estimating equation. Results from a limited simulation study are presented.
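
The weight adjustment described in this abstract can be sketched for the simplest case: linear calibration on a single auxiliary variable with a known population total. This is a minimal illustration under that assumption, not the functional-form or instrumental-variable estimators of the paper; the function name, design weights, and totals are hypothetical.

```python
def linear_calibration_weights(d, x, total_x):
    """Adjust design weights d so the weighted sum of x equals total_x.

    Uses the linear adjustment w_i = d_i * (1 + lam * x_i) with a single
    auxiliary variable x (an assumption made to keep the sketch short).
    """
    ht_x = sum(di * xi for di, xi in zip(d, x))          # design-weighted total of x
    lam = (total_x - ht_x) / sum(di * xi * xi for di, xi in zip(d, x))
    return [di * (1 + lam * xi) for di, xi in zip(d, x)]


# Made-up design weights, auxiliary values, and known population total.
d = [10.0, 10.0, 10.0]
x = [1.0, 2.0, 3.0]
w = linear_calibration_weights(d, x, total_x=66.0)
print(w, sum(wi * xi for wi, xi in zip(w, x)))
```

The calibrated estimator of a total of a study variable y is then the sum of `w[i] * y[i]` over the sample. With several auxiliary variables the scalar `lam` becomes a vector obtained by solving a linear system.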

9.
Social and economic studies are often implemented as complex survey designs. For example, multistage, unequal probability sampling designs utilised by federal statistical agencies are typically constructed to maximise the efficiency of the target domain level estimator (e.g. indexed by geographic area) within cost constraints for survey administration. Such designs may induce dependence between the sampled units; for example, with employment of a sampling step that selects geographically indexed clusters of units. A sampling‐weighted pseudo‐posterior distribution may be used to estimate the population model on the observed sample. The dependence induced between coclustered units inflates the scale of the resulting pseudo‐posterior covariance matrix, which has been shown to induce undercoverage of the credibility sets. By bridging results across Bayesian model misspecification and survey sampling, we demonstrate that the scale and shape of the asymptotic distributions are different between each of the pseudo‐maximum likelihood estimate (MLE), the pseudo‐posterior and the MLE under simple random sampling. Through insights from survey‐sampling variance estimation and recent advances in computational methods, we devise a correction applied as a simple and fast postprocessing step to Markov chain Monte Carlo draws of the pseudo‐posterior distribution. This adjustment projects the pseudo‐posterior covariance matrix such that the nominal coverage is approximately achieved. We make an application to the National Survey on Drug Use and Health as a motivating example and we demonstrate the efficacy of our scale and shape projection procedure on synthetic data on several common archetypes of survey designs.

10.
Summary: Suppose for a homogeneous linear unbiased function of the sampled first stage unit (fsu)-values taken as an estimator of a survey population total, the sampling variance is expressed as a homogeneous quadratic function of the fsu-values. When the fsu-values are not ascertainable but unbiased estimators for them are separately available through sampling in later stages and substituted into the estimator, Raj (1968) gave a simple variance estimator formula for this multi-stage estimator of the population total. He requires that the variances of the estimated fsu-values in sampling at later stages and their unbiased estimators are available in certain 'simple forms'. For the same set-up Rao (1975) derived an alternative variance estimator when the later stage sampling variances have more 'complex forms'. Here we continue with Raj's (1968) simple forms to derive a few alternative variance and mean square error estimators when the condition of homogeneity or unbiasedness in the original estimator of the total is relaxed and the variance of the original estimator is not expressed as a quadratic form. We illustrate a particular three-stage sampling strategy and present a simulation-based numerical exercise showing the relative efficacies of two alternative variance estimators. Received: 19 February 1999

11.
A common problem in survey sampling is to compare two cross‐sectional estimates for the same study variable taken from two different waves or occasions. These cross‐sectional estimates often include imputed values to compensate for item non‐response. The estimation of the sampling variance of the estimator of change is useful to judge whether the observed change is statistically significant. Estimating the variance of a change is not straightforward because of the rotation in repeated surveys and imputation. We propose using a multivariate linear regression approach and show how it can be used to accommodate the effect of rotation and imputation. The regression approach gives a design‐consistent estimation of the variance of change when the sampling fraction is small. We illustrate the proposed approach using random hot‐deck imputation, although the proposed estimator can be implemented with other imputation techniques.

12.
Statistical Inference in Nonparametric Frontier Models: The State of the Art   Total citations: 14, self-citations: 8, citations by others: 6
Efficiency scores of firms are measured by their distance to an estimated production frontier. The economic literature proposes several nonparametric frontier estimators based on the idea of enveloping the data (FDH and DEA-type estimators). Many have claimed that FDH and DEA techniques are non-statistical, as opposed to econometric approaches where particular parametric expressions are posited to model the frontier. We can now define a statistical model allowing determination of the statistical properties of the nonparametric estimators in the multi-output and multi-input case. New results provide the asymptotic sampling distribution of the FDH estimator in a multivariate setting and of the DEA estimator in the bivariate case. Sampling distributions may also be approximated by bootstrap distributions in very general situations. Consequently, statistical inference based on DEA/FDH-type estimators is now possible. These techniques allow correction for the bias of the efficiency estimators and estimation of confidence intervals for the efficiency measures. This paper summarizes the results which are now available, and provides a brief guide to the existing literature. Emphasizing the role of hypotheses and inference, we show how the results can be used or adapted for practical purposes.

13.
A typical Business Register (BR) is mainly based on administrative data files provided by organisations that produce them as a by-product of their function. Such files do not necessarily yield a perfect Business Register. A good BR should have the following characteristics: (1) It should reflect the complex structures of businesses with multiple activities, in multiple locations or with multiple legal entities; (2) It should be free of duplication, extraneous or missing units; (3) It should be properly classified in terms of key stratification variables, including size, geography and industry; (4) It should be easily updateable to represent the "newer" business picture, and not lag too much behind it. In reality, not all these desirable features are fully satisfied, resulting in a universe that has missing units, inaccurate structures, as well as improper contact information, to name a few defects.
These defects can be compensated for by using sampling and estimation procedures. For example, coverage can be improved using multiple frame techniques, and the sample size can be increased to account for misclassification of units and deaths on the register. At the time of estimation, auxiliary information can be used in a variety of ways. It can be used to impute missing variables, to treat outliers, or to create synthetic variables obtained via modelling. Furthermore, time lags between the birth of units and the time that they are included on the register can be accounted for by appropriately inflating the design-based estimates.

14.
In many survey situations simple random sampling of units and estimation of a total of interest by the expansion estimator are attractive methods, at least at first sight. Cost considerations suggest using multiple stage sampling instead, which, in general, is cheaper but less efficient. The design effect is an adequate criterion of the decrease of efficiency. We discuss this criterion for clusters (primary units) of equal size and derive exact conditions for a decrease of efficiency. The equality condition for cluster sizes seems not to be very restrictive, because in many cases one will be interested in clusters of approximately the same size, or, if sizes differ essentially, the clusters are partitioned into strata according to their sizes and the procedures for different strata are independent, each dealing with clusters of equal size or nearly so. In the context considered the use of the Horvitz–Thompson estimator is quite general. We examine a class of estimators with the Horvitz–Thompson estimator and a straightforward modification of it as special elements. As for the design effect, all elements of the class are very similar; as for other aspects such as admissibility, there are remarkable differences.
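
To give a feel for the design effect discussed in this abstract, the sketch below uses Kish's well-known approximation for clusters of equal size, deff = 1 + (m - 1) * rho, where m is the number of sampled elements per cluster and rho is the intra-cluster correlation. This approximation is an assumption chosen for illustration; it is not the exact condition derived in the paper, and the numbers are made up.

```python
def design_effect(cluster_size, rho):
    """Kish's approximate design effect for equal-size clusters."""
    return 1 + (cluster_size - 1) * rho


def effective_sample_size(n, cluster_size, rho):
    """Size of a simple random sample with roughly the same precision."""
    return n / design_effect(cluster_size, rho)


# Made-up example: 50 clusters of 10 elements, mild positive clustering.
deff = design_effect(cluster_size=10, rho=0.05)
n_eff = effective_sample_size(n=500, cluster_size=10, rho=0.05)
print(f"deff = {deff:.2f}, effective sample size = {n_eff:.1f}")
```

Under this approximation, cluster sampling loses efficiency relative to simple random sampling whenever rho is positive and gains when rho is negative.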

15.
Early survey statisticians faced a puzzling choice between randomized sampling and purposive selection but, by the early 1950s, Neyman's design-based or randomization approach had become generally accepted as standard. It remained virtually unchallenged until the early 1970s, when Royall and his co-authors produced an alternative approach based on statistical modelling. This revived the old idea of purposive selection, under the new name of “balanced sampling”. Suppose that the sampling strategy to be used for a particular survey is required to involve both a stratified sampling design and the classical ratio estimator, but that, within each stratum, a choice is allowed between simple random sampling and simple balanced sampling; then which should the survey statistician choose? The balanced sampling strategy appears preferable in terms of robustness and efficiency, but the randomized design has certain countervailing advantages. These include the simplicity of the selection process and an established public acceptance that randomization is “fair”. It transpires that nearly all the advantages of both schemes can be secured if simple random samples are selected within each stratum and a generalized regression estimator is used instead of the classical ratio estimator.

16.
Successive sampling is a well-known technique that can be used in longitudinal surveys to estimate population parameters and measurements of difference or change of a study variable. The paper discusses the estimation of quantiles for the current occasion based on sampling on two successive occasions and using p auxiliary variables obtained from the previous occasion. A multivariate ratio estimator from the matched portion is used to provide the optimum estimate of a quantile by weighting the estimates inversely to derived optimum weights. Its properties are studied under large-sample approximation and the expressions of the variances are established. The behavior of these asymptotic variances is analyzed on the basis of data from natural populations. A simulation study is also used to measure the precision of the proposed estimator.

17.
M. P. Singh, Metrika, 1967, 11(1): 199-205
Summary: In this paper the possibility of gain in efficiency in systematic sampling as compared to simple random sampling has been considered when a ratio or product estimator is used to improve upon the conventional unbiased estimator. Expressions for the variance of the estimators are derived for a multistage design where systematic selection is used at the ultimate stage with any probability scheme at the previous stages. In particular the results for uni-stage systematic sampling and for two-stage sampling with systematic selection at the second stage have been obtained in section 3.

18.
Mean profiles are widely used as indicators of the electricity consumption habits of customers. Currently, in Électricité de France (EDF), class load profiles are estimated using point‐wise mean profiles. Unfortunately, it is well known that the mean is highly sensitive to the presence of outliers, such as one or more consumers with unusually high levels of consumption. In this paper, we propose an alternative to the mean profile: the L1-median profile, which is more robust. When dealing with large data sets of functional data (load curves for example), survey sampling approaches are useful for estimating the median profile while avoiding storing the whole data. We propose here several sampling strategies and estimators to estimate the median trajectory. A comparison between them is illustrated by means of a test population. We develop a stratification based on the linearized variable which substantially improves the accuracy of the estimator compared to simple random sampling without replacement. We also suggest an improved estimator that takes into account auxiliary information. Some potential areas for future research are also highlighted.

19.
Mixing of direct, ratio, and product method estimators   Total citations: 1, self-citations: 0, citations by others: 1
In a paper by Srivenkataramana and Tracy [4], four methods of estimating a population total Y with the use of an auxiliary variable were introduced, given a random sample without replacement from that population. These methods were "built around the idea that estimating the population total is essentially equivalent to estimating the total corresponding to the non-sample units, since that corresponding to the sample units is known once the sample is drawn and measurements are made on it."
However, in the case of small sampling fractions the nonsample units constitute most of the population and no great improvement over the traditional estimators is to be expected. Therefore the methods are compared with the existing estimators and it turns out that they are special cases of the "mixing estimators", introduced in this paper. The latter estimators can be made asymptotically equivalent to the regression estimator and are therefore asymptotically superior to all other estimators. An exact comparison is carried out on the artificial example given in [4]. The statement in this paper that "the proposed estimators are to be preferred to the regression estimator for., superiority of performance in the case of small samples" is evidently misleading. Finally a comparison is made with other "mixing-type" estimators, that can prove very useful in practice.
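
The three classical building blocks named in the title of this entry (direct, ratio, and product estimators of a population total) can be written down in a few lines. The sketch below shows only these textbook forms under simple random sampling without replacement; it does not implement the mixing estimators of the paper, and the function names and data are illustrative assumptions.

```python
from statistics import mean


def direct_total(y, N):
    """Expansion (direct) estimator: N times the sample mean of y."""
    return N * mean(y)


def ratio_total(y, x, X):
    """Ratio estimator: sample ratio of means times the known total X of x."""
    return mean(y) / mean(x) * X


def product_total(y, x, X, N):
    """Product estimator: N * ybar * xbar / Xbar, with Xbar = X / N."""
    return N * mean(y) * mean(x) / (X / N)


# Made-up sample of size 2 from a population of N = 10 units
# whose auxiliary total X = 25 is assumed known.
y = [2.0, 4.0]
x = [1.0, 3.0]
print(direct_total(y, 10), ratio_total(y, x, 25.0), product_total(y, x, 25.0, 10))
```

The ratio form tends to help when y and x are positively correlated and the product form when they are negatively correlated, which is the motivation for combining them.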

20.
P. Mukhopadhyay, Metrika, 1975, 22(1): 119-127
The problem of constructing a sampling design with the value of the sum of second order inclusion probabilities attaining its lower bound for non-integral values of the expected effective size of a sample in the design has been considered in this paper. If the values of the characteristic of interest on all the units in the population are non-negative, the design is admissible (in the sense of variance) with respect to the Horvitz-Thompson estimator in the class of designs with the same set of values of the first order inclusion probabilities of the units. Moreover, such a design is the best one for using the Horvitz-Thompson estimator of the population total, in the sense of smallest average variance of the estimator under a special superpopulation model.
