1.

Model-based variance estimation under unequal probability sampling

P. A. Patel R. D. Chaudhari 《Metrika》2008,67(2):171-187

This paper deals with model-based variance estimation for the Horvitz–Thompson (HT) estimator when auxiliary information is available. A small simulation study is carried out to illustrate and establish some of the findings.

2.

Estimators for particulate sampling derived from a multinomial distribution

B. Geelhoed H. J. Glass 《Statistica Neerlandica》2004,58(1):57-74

Counting the number of units is not always practical during the sampling of particulate materials: it is often much easier to sample a fixed volume or fixed mass of particles. Hence, a class of sampling designs is proposed which leads to samples that have approximately a constant mass or a constant volume. For these sampling designs, estimators were derived which are a ratio of arbitrary sample totals. A Taylor expansion was used to obtain a first-order approximation for the expected value and variance in the limit of a large batch-to-sample size ratio. Furthermore, a π-estimator for a ratio of batch totals was found by deriving expressions for the first- and second-order inclusion probabilities. Practical application of the π-estimator is limited because it requires inaccessible batch information. However, when the denominator of the estimated batch ratio is the batch size, the π-estimator becomes equal to a sample total divided by the sample size in the limit of a large sample-to-particle size ratio. As a consequence, the obtained sample ratio becomes an unbiased estimator for the corresponding batch ratio. Retaining unbiasedness, the Horvitz–Thompson estimator for the variance, which also contains inaccessible batch information, is replaced by an estimator containing sample information only. Practical application of this estimator is illustrated for the sampling of slag, produced during the production of steel.

3.

Sampling from overlapping subpopulations

Ove Frank 《Metrika》1970,16(1):32-42

Summary: Statistical problems in connection with classified data, stratified sampling, cluster sampling and two-stage sampling may be formulated in terms of overlapping subpopulations instead of disjoint classes, strata, clusters or primary sampling units. The introduction in section 1 serves to unify the notation to be used and to exemplify the type of problems that are to be generalized. Samples where the units are classified into overlapping classes are studied in section 2. The applicability is illustrated with an estimation problem in connection with destructive tests. Section 3 treats sampling from overlapping strata and estimation of the sizes of the intersections. Section 4 discusses problems in conjunction with sampling of overlapping clusters. Graphs or networks representing populations with binary relationships are used to exemplify sampling of overlapping clusters. Section 5 is devoted to some examples of two-stage sampling where the primary sampling units are overlapping subsets.

4.

Multinomial distributions applied to random sampling of particulate materials

B. Geelhoed & H. J. Glass 《Statistica Neerlandica》2002,56(1):58-76

When sampling a batch consisting of particulate material, the distribution of a sample estimator can be characterized using knowledge about the sample drawing process. With Bernoulli sampling, the number of particles in the sample is binomially distributed. Because this is rarely realized in practice, we propose a sampling design in which the possible samples have a nearly equal mass. Expected values and variances of the sample estimator are calculated. It is shown that the sample estimator becomes identical to the Horvitz–Thompson estimator in the case of a large batch-to-sample mass ratio and a large sample mass. Simulations and experiments were performed to test the theory. Simulations confirm that the round-off error due to the discrete nature of particles is negligible for large sample sizes. Sampling experiments were carried out with a mixture of polypropylene (PP) and polytetrafluoroethylene (PTFE) spheres suspended in a viscous medium. The measured and theoretical variations are in good agreement.

5.

Optimal allocation of the sample size to strata under box constraints (total citations: 1; self-citations: 1; citations by others: 0)

Siegfried Gabler Matthias Ganninger Ralf Münnich 《Metrika》2012,75(2):151-161

In stratified random sampling without replacement, boundary conditions have to be considered, such as the requirement that the sample sizes within strata not exceed the population sizes in the respective strata. Stenger and Gabler (Metrika, 61:137–156, 2005) have shown a solution that satisfies upper bounds on the sampling fractions within the strata. However, in modern applications one may also wish to guarantee minimal sampling fractions within strata in order to allow for reasonable separate estimates. In this paper, an optimal allocation in the Neyman-Tschuprov sense is developed which satisfies upper and lower bounds on the sample sizes within strata. Further, a stable algorithm is given which ensures optimality. The resulting sample allocation enables users to bound design weights within stratified random sampling while considering optimality in allocation.

6.

On the calibration of design weights using a displacement function

Sarjinder Singh 《Metrika》2012,75(1):85-107

In the present investigation, we propose a new method to calibrate the estimator of the general parameter of interest in survey sampling. We demonstrate that the linear regression estimator due to Hansen et al. (Sample Survey Method and Theory. Wiley, NY, 1953) is a special case of this. We reconfirm that the sum of calibrated weights has to be set equal to the sum of the design weights within a given sample, as shown in Singh (Advanced sampling theory with applications: How Michael ‘selected’ Amy, Vol. 1 and 2. Kluwer, The Netherlands, pp 1–1247, 2003; Proceedings of the American Statistical Association, Survey Method Section [CD-ROM], Toronto, Canada: American Statistical Association, pp 4382–4389, 2004; Metrika:1–18, 2006a; Presented at INTERFACE 2006, Pasadena, CA, USA, 2006b) and Stearns and Singh (Presented at Joint Statistical Meeting, MN, USA (Available on the CD), 2005; Comput Stat Data Anal 52:4253–4271, 2008). Thus, it shows that Sir R. A. Fisher’s brilliant idea of keeping the sum of observed frequencies equal to that of expected frequencies leads to an “Honest-Balance” while weighing design weights in survey sampling. The major benefit of the proposed new estimator is that it always works, unlike the pseudo empirical likelihood estimators listed in Owen (Empirical Likelihood. Chapman & Hall, London, 2001), Chen and Sitter (Stat Sin 9:385–406, 1999) and Wu (Sur Methodol 31(2):239–243, 2005). The main endeavor of this paper is to replace the existing calibration technology, which is based only on positive distance functions, with a displacement function that has the flexibility of taking a positive, negative, or zero value. Finally, the proposed technology is compared with its competitors under several kinds of linear and non-linear non-parametric models using an extensive simulation study. A couple of open questions are raised.

7.

Estimation from two-stage unequal probability sampling with missing units

Arijit Chaudhuri Amitava Saha 《Metrika》2006,63(1):33-41

A formula is presented for an unbiased estimator for the variance of an unbiased estimator of a survey population total as well as for an unbiased estimator of its variance based on sampling in two stages, following the scheme of Rao et al. (J Roy Stat Soc B 24:482–491, 1962) in both stages, when the originally selected units in both stages cannot be fully covered in the survey but are to be randomly sub-sampled. The development is helpful for tackling non-response if it is assumed to have occurred at random in either or both of the stages.

8.

Combining random sampling and census strategies - Justification of inclusion probabilities equal to 1

Horst Stenger Siegfried Gabler 《Metrika》2005,61(2):137-156

Very often values of a size variable are known for the elements of a population we want to sample. For example, the elements may be clusters, the size variable denoting the number of units in a cluster. Then, it is quite usual to base the selection of elements on inclusion probabilities which are proportionate to the size values. To estimate the total of all values of an unknown variable for the units in the population of interest (i.e. for the units contained in the clusters) we may use weights, e.g. inverse inclusion probabilities. We want to clarify these ideas by the minimax principle. Especially, we will show that the use of inclusion probabilities equal to 1 is recommendable for units with high values of the size measure. AMS Classification 2000: Primary 62D05. Secondary 62C20
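
The idea of size-proportional inclusion probabilities with some units taken with certainty can be illustrated with the common iterative capping procedure. This is only a sketch under stated assumptions: it is not the minimax derivation of the paper, and the function name pps_inclusion_probabilities and its arguments are hypothetical.

```python
def pps_inclusion_probabilities(sizes, n):
    """Inclusion probabilities proportional to size, capped at 1 and summing to n.

    Illustrative only: units whose proportional probability would reach or
    exceed 1 are included with certainty, and the remaining sample size is
    redistributed proportionally among the other units.
    """
    if not 0 < n <= len(sizes):
        raise ValueError("n must be between 1 and the population size")
    if any(x <= 0 for x in sizes):
        raise ValueError("size values must be positive")

    certain = set()  # indices of units taken with probability 1
    while True:
        remaining = [i for i in range(len(sizes)) if i not in certain]
        n_left = n - len(certain)
        total = sum(sizes[i] for i in remaining)
        # Units that would get a probability of at least 1 become certainty units.
        newly_certain = [i for i in remaining if n_left * sizes[i] / total >= 1.0]
        if not newly_certain:
            pi = [1.0] * len(sizes)
            for i in remaining:
                pi[i] = n_left * sizes[i] / total
            return pi
        certain.update(newly_certain)


# One very large cluster and four small ones, sample size 2.
probabilities = pps_inclusion_probabilities([100, 10, 10, 10, 10], 2)
print(probabilities)
```

In this example the large unit is selected with certainty and the remaining sample size of 1 is spread over the four small units.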

9.

Stratification of Skewed Populations: A review

Jane M. Horgan 《Revue internationale de statistique》2006,74(1):67-76

When Dalenius provided a set of equations for the determination of stratum boundaries of a single auxiliary variable that minimise the variance of the Horvitz–Thompson estimator of the mean or total under Neyman allocation for a fixed sample size, he pointed out that, though mathematically correct, those equations are troublesome to solve. Since then there has been a proliferation of approximations of an iterative nature, or otherwise cumbersome, tendered for this problem; many of these approximations assume a uniform distribution within strata, and, in the case of skewed populations, that all strata have the same relative variation. What seems to have been missed is that the combination of these two assumptions offers a much simpler and equally effective method of subdivision for skewed populations: take the stratum boundaries in geometric progression.
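
Boundaries in geometric progression are easy to compute. The following minimal sketch assumes a strictly positive auxiliary variable with known minimum and maximum; the function name geometric_boundaries and its parameters are hypothetical and are not taken from the review.

```python
def geometric_boundaries(x_min, x_max, n_strata):
    """Stratum boundaries k_0 < k_1 < ... < k_L in geometric progression.

    k_h = x_min * r**h with common ratio r = (x_max / x_min) ** (1 / L),
    so that k_0 = x_min and k_L = x_max.
    """
    if x_min <= 0:
        raise ValueError("the auxiliary variable must be strictly positive")
    if x_max <= x_min or n_strata < 1:
        raise ValueError("need x_max > x_min and at least one stratum")
    ratio = (x_max / x_min) ** (1.0 / n_strata)
    return [x_min * ratio ** h for h in range(n_strata + 1)]


# Four strata for a skewed variable ranging from 1 to 16.
print(geometric_boundaries(1.0, 16.0, 4))
```

Each stratum then spans the same ratio of upper to lower boundary, which reflects the equal-relative-variation assumption mentioned above.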

10.

Dispersion of growth paths of macroeconomic models in thermodynamic limits: two-parameter Poisson–Dirichlet models

Masanao Aoki 《Journal of Economic Interaction and Coordination》2008,3(1):3-13

This paper discusses dispersion of growth patterns of macroeconomic models in thermodynamic limits. More specifically, the paper shows that the coefficients of variation of the total numbers of clusters and the numbers of clusters of specific sizes of one- and two-parameter Poisson–Dirichlet models behave qualitatively differently in the thermodynamic limits. The coefficients of variation of the numbers of clusters in the former class of distributions are all self-averaging, while those in the latter class are all non-self-averaging. In other words, dispersions or variations of growth rates about the means do not vanish in the two-parameter version of the model, while they do in the one-parameter version in the thermodynamic limits. The paper ends by pointing out that other models, such as triangular urn models, may converge to Mittag–Leffler distributions which exhibit non-self-averaging behavior for certain parameter combinations. The author is grateful for the help he received from H. Yoshikawa and M. Sibuya.

11.

Using Remote Sensing for Agricultural Statistics (total citations: 7; self-citations: 0; citations by others: 7)

Elisabetta Carfagna F. Javier Gallego 《Revue internationale de statistique》2005,73(3):389-404

Remote sensing can be a valuable tool for agricultural statistics when area frames or multiple frames are used. At the design level, remote sensing typically helps in the definition of sampling units and the stratification, but can also be exploited to optimise the sample allocation and size of sampling units. At the estimator level, classified satellite images are generally used as auxiliary variables in a regression estimator or for estimators based on confusion matrices. The most often used satellite images are LANDSAT-TM and SPOT-XS. In general, classified or photo-interpreted images should not be directly used to estimate crop areas because the proportion of pixels classified into the specific crop is often strongly biased. Vegetation indexes computed from satellite images can in some cases give a good indication of the potential crop yield.

12.

Admissible unbiased variance estimation in finite population sampling under randomized response

S. Sengupta D. Kundu 《Metrika》1991,38(1):71-82

Let P be the proportion of units in a finite population possessing a sensitive attribute. We prove the admissibility of (i) an unbiased estimator of the variance of a general homogeneous linear unbiased estimator of P and (ii) an unbiased estimator of the population variance P(1−P), based on an arbitrary but fixed sampling design, under the randomized response plans due to Warner (1965) and Eriksson (1973). Admissibility of an unbiased strategy for estimating the population variance is also established.
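
For orientation, Warner's randomized response plan can be sketched in its simplest setting. The paper treats arbitrary fixed sampling designs; the sketch below assumes simple random sampling with replacement instead, and the function name warner_estimate is hypothetical. Each respondent answers the sensitive statement with probability p and its negation with probability 1 − p, so the probability of a "yes" is λ = pP + (1 − p)(1 − P).

```python
def warner_estimate(yes_count, n, p):
    """Warner (1965) estimate of a sensitive proportion P and of its variance.

    Assumes simple random sampling with replacement (an assumption of this
    sketch, not of the paper). p is the known probability that the
    randomizing device selects the sensitive statement; p must differ from 0.5.
    """
    if n < 2 or not 0 <= yes_count <= n:
        raise ValueError("need n >= 2 and 0 <= yes_count <= n")
    if not 0 < p < 1 or p == 0.5:
        raise ValueError("p must lie in (0, 1) and differ from 0.5")

    lam_hat = yes_count / n                       # observed share of 'yes' answers
    p_hat = (lam_hat - (1 - p)) / (2 * p - 1)     # solve lambda = pP + (1-p)(1-P) for P
    # Unbiased estimator of Var(p_hat) under sampling with replacement.
    var_hat = lam_hat * (1 - lam_hat) / ((n - 1) * (2 * p - 1) ** 2)
    return p_hat, var_hat


# 40 'yes' answers among 100 respondents, device probability 0.7.
print(warner_estimate(40, 100, 0.7))
```

The estimate of P can fall outside [0, 1] in small samples; the sketch returns it unchanged rather than truncating it.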

13.

Estimation of correlation for a finite universe

Dr. J. C. Koop 《Metrika》1970,15(1):105-109

Summary: The formula for the Pearsonian correlation coefficient, based on a simple random sample, is a consistent estimator of the parent correlation between two given measurable characteristics of the elements of a finite universe. However, when the universe is stratified, and the elements in each stratum are drawn without replacement and with equal probabilities at each draw, the formula for a consistent estimator is much more complex. Generally speaking, the formula for a consistent estimator of the parent correlation varies with the sampling design. The results of this paper are relevant to the analysis of sociological data obtained through sample surveys. In the literature of the theory of statistical sampling the problem of estimating the correlation between pairs of variate values of the identifiable elements constituting a universe has so far not been considered. Needless to say, the solution of this problem has an important bearing on sociological studies based on sample surveys.

14.

On kernel nonparametric regression designed for complex survey data

Torsten Harms Pierre Duchesne 《Metrika》2010,72(1):111-138

In this article, we consider nonparametric regression analysis between two variables when data are sampled through a complex survey. While nonparametric regression analysis has been widely used with data that may be assumed to be generated from independently and identically distributed (iid) random variables, the methods and asymptotic analyses established for iid data need to be extended in the framework of complex survey designs. Local polynomial regression estimators are studied, which include as particular cases design-based versions of the Nadaraya–Watson estimator and of the local linear regression estimator. In this paper, special emphasis is given to the local linear regression estimator. Our estimators incorporate both the sampling weights and the kernel weights. We derive the asymptotic mean squared error (MSE) of the kernel estimators using a combined inference framework, and as a corollary consistency of the estimators is deduced. Selection of a bandwidth is necessary for the resulting estimators; an optimal bandwidth can be determined according to the MSE criterion in the combined mode of inference. Simulation experiments are conducted to illustrate the proposed methodology and an application with the Canadian Survey of Labour and Income Dynamics is presented.
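
How sampling weights and kernel weights can be combined is sketched below. This is a generic illustration, not necessarily the exact estimators of the article: it assumes a Gaussian kernel and design weights equal to inverse inclusion probabilities, and all function names are hypothetical.

```python
import math


def _combined_weights(x, design_weights, x0, h):
    """Product of design weight and Gaussian kernel weight for each sampled unit."""
    return [d * math.exp(-0.5 * ((xi - x0) / h) ** 2)
            for xi, d in zip(x, design_weights)]


def weighted_nadaraya_watson(x, y, design_weights, x0, h):
    """Local constant fit at x0: weighted mean of y with combined weights."""
    w = _combined_weights(x, design_weights, x0, h)
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)


def weighted_local_linear(x, y, design_weights, x0, h):
    """Local linear fit at x0: intercept of a weighted least-squares line."""
    w = _combined_weights(x, design_weights, x0, h)
    s0 = sum(w)
    s1 = sum(wi * (xi - x0) for wi, xi in zip(w, x))
    s2 = sum(wi * (xi - x0) ** 2 for wi, xi in zip(w, x))
    t0 = sum(wi * yi for wi, yi in zip(w, y))
    t1 = sum(wi * (xi - x0) * yi for wi, xi, yi in zip(w, x, y))
    det = s0 * s2 - s1 ** 2
    if det == 0:
        raise ValueError("x values are not distinct enough for a local linear fit")
    return (s2 * t0 - s1 * t1) / det


# Toy sample with unequal design weights (1 / inclusion probability).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]
d = [1.0, 2.0, 1.0, 3.0, 1.0]
print(weighted_nadaraya_watson(x, y, d, x0=2.5, h=1.0))
print(weighted_local_linear(x, y, d, x0=2.5, h=1.0))
```

The bandwidth h is fixed by hand here; choosing it by an MSE criterion, as the abstract describes, is outside the scope of this sketch.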

15.

On two properties of an unequal probability sampling scheme

Dr. Arun Kumar Adhikary Dr. Arijit Chaudhuri 《Metrika》1989,36(1):161-166

Summary: For an inclusion probability proportional to size (IPPS) sampling scheme recently proposed by Saxena, Singh and Srivastava (1986), it is shown that under certain simple verifiable conditions (1) the Horvitz-Thompson (1952) estimator based on it has a smaller variance than the variance of the Hansen-Hurwitz (1943) estimator based on probability proportional to size (PPS) sampling with replacement (WR), both involving the same size-measures and the expected sample size in the former being equal to the number of draws in the latter, and (2) the Yates-Grundy (1953) estimator for the variance of the Horvitz-Thompson estimator based on this IPPS scheme is uniformly non-negative.
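
The three estimators named in this summary can be written down compactly. The sketch below uses their textbook forms and does not implement the IPPS scheme of Saxena, Singh and Srivastava; the Yates-Grundy form shown assumes a fixed-size design with positive joint inclusion probabilities, and all function names are hypothetical.

```python
def horvitz_thompson_total(y, pi):
    """HT estimator of a total: sum of y_i / pi_i over the distinct sampled units."""
    return sum(yi / p for yi, p in zip(y, pi))


def hansen_hurwitz_total(y_draws, p_draws):
    """HH estimator under PPS with replacement: mean of y_i / p_i over the draws."""
    return sum(yi / p for yi, p in zip(y_draws, p_draws)) / len(y_draws)


def yates_grundy_variance(y, pi, pi_joint):
    """Yates-Grundy estimator of the variance of the HT estimator.

    pi_joint[i][j] is the joint inclusion probability of sampled units i and j.
    Each term is non-negative whenever pi_i * pi_j >= pi_ij.
    """
    total = 0.0
    for i in range(len(y)):
        for j in range(i + 1, len(y)):
            coeff = (pi[i] * pi[j] - pi_joint[i][j]) / pi_joint[i][j]
            total += coeff * (y[i] / pi[i] - y[j] / pi[j]) ** 2
    return total


# Check against simple random sampling without replacement, N = 4, n = 2.
y = [3.0, 5.0]
pi = [0.5, 0.5]
pi_joint = [[0.5, 1 / 6], [1 / 6, 0.5]]
print(horvitz_thompson_total(y, pi))
print(yates_grundy_variance(y, pi, pi_joint))
print(hansen_hurwitz_total(y, [0.25, 0.25]))
```

For this equal-probability example the Yates-Grundy value coincides with the usual simple-random-sampling variance estimate of the expanded total, which is a convenient sanity check.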

16.

On consistency of redescending M-kernel smoothers

Martin Hillebrand Christine H. Müller 《Metrika》2006,63(1):71-90

M-estimators and M-kernel estimators with a redescending ψ-function are not in general consistent. This is often handled by means of coupling the estimator to a consistent one. Coupling the estimator to the (inconsistent) starting point improves the jump preserving properties. However, the consistency depends heavily on the shape of the density of the residuals. This paper shows inconsistency under convenient conditions as well as consistency, even at jump points, under somewhat stronger conditions. Research supported by the Friedrich Ebert Foundation and by grant Mu 1031/4-1/2 of the Deutsche Forschungsgemeinschaft.

17.

Uncertainty Estimation for Pseudo‐Bayesian Inference Under Complex Sampling

Matthew R. Williams Terrance D. Savitsky 《Revue internationale de statistique》2021,89(1):72-107

Social and economic studies are often implemented as complex survey designs. For example, multistage, unequal probability sampling designs utilised by federal statistical agencies are typically constructed to maximise the efficiency of the target domain level estimator (e.g. indexed by geographic area) within cost constraints for survey administration. Such designs may induce dependence between the sampled units; for example, with employment of a sampling step that selects geographically indexed clusters of units. A sampling‐weighted pseudo‐posterior distribution may be used to estimate the population model on the observed sample. The dependence induced between coclustered units inflates the scale of the resulting pseudo‐posterior covariance matrix, which has been shown to induce undercoverage of the credibility sets. By bridging results across Bayesian model misspecification and survey sampling, we demonstrate that the scale and shape of the asymptotic distributions are different between each of the pseudo‐maximum likelihood estimate (MLE), the pseudo‐posterior and the MLE under simple random sampling. Through insights from survey‐sampling variance estimation and recent advances in computational methods, we devise a correction applied as a simple and fast postprocessing step to Markov chain Monte Carlo draws of the pseudo‐posterior distribution. This adjustment projects the pseudo‐posterior covariance matrix such that the nominal coverage is approximately achieved. We make an application to the National Survey on Drug Use and Health as a motivating example and we demonstrate the efficacy of our scale and shape projection procedure on synthetic data on several common archetypes of survey designs.

18.

Model assisted survey sampling strategy in two phases

Chaudhuri Arijit Roy Debesh 《Metrika》1994,41(1):355-362

Postulating a super-population regression model connecting a size variable, a cheaply measurable variable and an expensively observable variable of interest, an asymptotically optimal double sampling strategy to estimate the survey population total of the third variable is specified. To render it practicable, unknown model-parameters in the optimal estimator are replaced by appropriate statistics. The resulting generalized regression estimator is then shown to have a model-cum-asymptotic design based expected square error equal to that of the asymptotically optimum estimator itself. An estimator for design variance of the estimator is also proposed.

19.

On the advantages of economically designed the Hotelling’s T2 control chart with variable sample sizes and sampling intervals

Alireza Faraz R. B. Kazemzadeh Ahmad Parsian M. B. Moghadam 《Quality and Quantity》2012,46(1):39-53

Faraz and Parsian (Statistical Papers, 47: 569–593, 2006) have shown that the double warning lines (DWL) scheme detects process shifts more quickly than the other variable ratio sampling schemes such as variable sample sizes (VSS), variable sampling intervals (VSI) and variable sample sizes and sampling intervals (VSSVSI). In this paper, the DWL T² control chart for monitoring the process mean vector is economically designed. The cost model proposed by Costa and Rahim (Journal of Applied Statistics, 28: 875–885, 2001) is used here and is minimized through a genetic algorithm (GA) approach. Then the effects of the model parameters on the chart parameters and the resulting operating loss are studied, and finally a comparison between all possible variable ratio sampling (VRS) schemes is made to choose the best option economically.

20.

Mean square error estimation in multi-stage sampling

Arijit Chaudhuri Arun Kumar Adhikary Shankar Dihidar 《Metrika》2000,52(2):115-131

Summary: Suppose for a homogeneous linear unbiased function of the sampled first stage unit (fsu)-values taken as an estimator of a survey population total, the sampling variance is expressed as a homogeneous quadratic function of the fsu-values. When the fsu-values are not ascertainable but unbiased estimators for them are separately available through sampling in later stages and substituted into the estimator, Raj (1968) gave a simple variance estimator formula for this multi-stage estimator of the population total. He requires that the variances of the estimated fsu-values in sampling at later stages and their unbiased estimators are available in certain ‘simple forms’. For the same set-up Rao (1975) derived an alternative variance estimator when the later stage sampling variances have more ‘complex forms’. Here we proceed with Raj's (1968) simple forms to derive a few alternative variance and mean square error estimators when the condition of homogeneity or unbiasedness in the original estimator of the total is relaxed and the variance of the original estimator is not expressed as a quadratic form. We illustrate a particular three-stage sampling strategy and present a simulation-based numerical exercise showing the relative efficacies of two alternative variance estimators. Received: 19 February 1999