The recently repeated assertion that in correlation analysis it makes little difference whether one variable (x2) is used instead of another one (x3), provided the coefficient of correlation (r23) between x2 and x3 is high, is scrutinized.
To that end, the ranges of the coefficients of correlation with respect to the substitute variable are expressed in formula 3. Moreover, by way of example, extreme values of the coefficients of simple correlation (r13 and r34), of multiple correlation (R1.34 and R3.14) and of regression (α13 and α14, α31 and α34) relating to the substitute variable are calculated on the basis of empirical values of the coefficients of simple correlation relating to the substituted and the remaining variables.
The outcomes of those calculations are summarized in Tables 1 and 3 and in the graph.
Table 1 presents ranges of r13 for given values of r12 and r23; Table 3 shows extreme values of the coefficients of simple and multiple correlation and of regression when an additional variable x4 is introduced and r12, r14, r24 and r23 are given. The graph shows an ellipse as the boundary of the inner closed domain of compatible values of r13 and r34.
Those results clearly indicate the need for caution in substituting one variable for another.
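The abstract's formula 3 is not reproduced on this page. As an illustration only, the sketch below uses the standard bound that follows from requiring the 3×3 correlation matrix of x1, x2, x3 to be positive semidefinite: (r13 − r12·r23)² ≤ (1 − r12²)(1 − r23²). The function name `r13_range` is hypothetical, and the bound is assumed (not confirmed) to be of the same kind as the paper's formula 3.

```python
import math


def r13_range(r12: float, r23: float) -> tuple[float, float]:
    """Return the (lowest, highest) value of r13 compatible with r12 and r23.

    Assumption: the only constraint is that the 3x3 correlation matrix
    is positive semidefinite, i.e. its determinant is non-negative.
    """
    for r in (r12, r23):
        if not -1.0 <= r <= 1.0:
            raise ValueError("correlation coefficients must lie in [-1, 1]")
    centre = r12 * r23
    half_width = math.sqrt((1.0 - r12 ** 2) * (1.0 - r23 ** 2))
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    low, high = r13_range(0.5, 0.9)
    print(f"r12 = 0.5, r23 = 0.9  ->  r13 in [{low:.2f}, {high:.2f}]")
```

Under this bound, even with r23 = 0.9 and r12 = 0.5 the coefficient r13 can lie anywhere between roughly 0.07 and 0.83, which is in line with the abstract's call for caution.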

Suppose $X_1, X_2, \ldots, X_m$ is a random sample of size $m$ from a population with probability density function $f(x)$, $x > 0$, and let $X_{1,m} < X_{2,m} < \cdots < X_{m,m}$ be the corresponding order statistics.
We assume $m$ is an integer-valued random variable with $P(m = k) = p(1-p)^{k-1}$, $k = 1, 2, \ldots$ and $0 < p < 1$. Two characterizations of the exponential distribution are given based on the distributional properties of $X_{1,m}$.

Consider an ordered sample $x_{(1)}, x_{(2)}, \ldots, x_{(2n+1)}$ of size $2n+1$ from the normal distribution with parameters $\mu$ and $\sigma$. We then have with probability one
$x_{(1)} < x_{(2)} < \cdots < x_{(2n+1)}$.
The random variable
$h_n = x_{(n+1)} / (x_{(2n+1)} - x_{(1)})$,
which can be described as the quotient of the sample median and the sample range, provides us with an estimate of $\mu/\sigma$ that is easy to calculate. Calculating the distribution of $h_n$ is quite a different matter. The distribution function of $h_1$ and the density of $h_2$ are given in section 1. Our results seem hardly promising for general $h_n$. In section 2 it is shown that $h_n$ is asymptotically normal.
In the sequel we suppose $\mu = 0$ and $\sigma = 1$, i.e. we consider only the "central" distribution. Note that $h_n$ can be used as a test statistic replacing Student's t. In that case the central $h_n$ is all that is needed.

We investigate the validity of the bootstrap method for the elementary symmetric polynomials $S_n^{(k)} = \binom{n}{k}^{-1} \sum_{1 \le i_1 < \cdots < i_k \le n} X_{i_1} \cdots X_{i_k}$ of i.i.d. random variables $X_1, \ldots, X_n$. For both fixed and increasing order $k$, as $n \to \infty$, the cases where $\mu = E X_1 \ne 0$ (the nondegenerate case) and where $\mu = E X_1 = 0$ (the degenerate case) are considered.

Let $X, X_1, \ldots, X_k$ be i.i.d. random variables, and for $k \in \mathbb{N}$ let $D_k(X) = E(X_1 \vee \cdots \vee X_{k+1}) - EX$ be the $k$th centralized maximal moment. A sharp lower bound is given for $D_1(X)$ in terms of the Lévy concentration $Q_l(X) = \sup_{x \in \mathbb{R}} P(X \in [x, x+l])$. This inequality, which is analogous to P. Lévy's concentration-variance inequality, illustrates the fact that maximal moments are a gauge of how spread out the underlying distribution is. It is also shown that the centralized maximal moments are increased under convolution.

A new unbiased, consistent, asymptotically normal estimator U k of the intensity λ of a stationary multivariate Poisson point process is exhibited. This estimator is based on a combination of the j -th nearest neighbor (possibly non-Euclidean) distances ( j = 1, ..., k ) to a single fixed site x . A simple closed form containing logarithmic terms is obtained for E ( U l k )(0< l < k ).

Consider a sequence of random points placed on the nonnegative integers with i.i.d. geometric (1/2) interpoint spacings y i . Let x i denote the number of points placed at integer i . We prove a central limit theorem for the partial sums of the sequence x 0 y 0, x 1 y 1, . . . The problem is connected with a question concerning different bootstrap procedures.

Summary  Let $x_1, \ldots, x_n$ be a sample from a distribution with infinite expectation; then for $n \to \infty$ the sample average $\bar{x}_n$ tends to $+\infty$ with probability 1 (see [4]).
Sometimes $\bar{x}_n$ contains high jumps due to large observations. In this paper we consider samples from the "absolute Cauchy" distribution. In practice, one may consider the logarithm of the observations as a sample from a normal distribution, and this is what we found in our simulation. After rejecting the log-normality assumption, one will be tempted to regard the extreme observations as outliers. It is shown that discarding the outlying observations leads to underestimation of the expectation, variance and 99th percentile of the actual distribution.

This paper gives an account of the collaboration between two mathematical statisticians and a toxicologist (the second author) interested in thin layer chromatography (TLC). A TLC "system" consists of a medium through which a solvent is transported. If a solution of some (toxic) sample is applied to the medium, then the components are carried forward by the solvent over different distances. Section 1 describes the concept of a data bank which provides standard values for the degrees of migration characteristic of each of m well-studied substances in each of n systems. Sections 2–5 are mainly devoted to the construction of the "best design(s)" $\{j_1^*, \ldots, j_k^*\}$ of k systems from the n available ones. Attention is restricted to the situation in which an unidentified sample contains exclusively one of the m substances covered by the data bank and produces the scores $x_{j_1}, \ldots, x_{j_k}$ in the systems $j_1, \ldots, j_k$ respectively. Three different approaches to the identification problem were successively considered. Each approach leads to a class of procedures and their performances. The performance of the optimum procedure can be used to define the performance of any of the $\binom{n}{k}$ designs $\{j_1, \ldots, j_k\}$. The latter performance is maximized in order to determine $\{j_1^*, \ldots, j_k^*\}$. In practice, data are usually obtained for mixtures instead of single pure substances. Section 6 gives some tentative theory for the evaluation of such data.

We deal with general mixture or hierarchical models of the form $m(x) = \int f(x \mid \theta)\, g(\theta)\, d\theta$, where $g(\theta)$ and $m(x)$ are called the mixing and the mixed (or compound) densities respectively, and $\theta$ is called the mixing parameter. The usual statistical application of these models arises when we have data $x_i$, $i = 1, \ldots, n$, with densities $f(x_i \mid \theta_i)$ for given $\theta_i$, and the $\theta_i$ are independent with common density $g(\theta)$. For a certain well-known class of densities $f(x \mid \theta)$, we present a sample-based approach to reconstructing $g(\theta)$. We first provide theoretical results and then use, in an empirical Bayes spirit, the first four moments of the data to estimate the first four moments of $g(\theta)$. By using sampling techniques we proceed in a fully Bayesian fashion to obtain any posterior summaries of interest. Simulations that investigate the operating characteristics of our proposed methodology are presented. We illustrate our approach using data from mixed Poisson and mixed exponential densities.

Assume $k$ ($k \ge 2$) independent populations $\pi_1, \pi_2, \ldots, \pi_k$ are given. The associated independent random variables $X_i$ ($i = 1, 2, \ldots, k$) are logistically distributed with unknown means $\mu_1, \mu_2, \ldots, \mu_k$ and equal variances. The goal is to select the population which has the largest mean. The procedure is to select the population which yielded the maximal sample value. Let $\mu_{(1)} \le \mu_{(2)} \le \cdots \le \mu_{(k)}$ denote the ordered means. The probability of correct selection has been determined for the Least Favourable Configuration $\mu_{(1)} = \mu_{(2)} = \cdots = \mu_{(k-1)} = \mu_{(k)} - \delta$, where $\delta > 0$. An exact formula for the probability of correct selection is given.

Abstract Let $X_1, \ldots, X_{n_1}$ and $Y_1, \ldots, Y_{n_2}$ be two independent random samples from exponential populations. The statistical problem is to test whether or not the two exponential populations are the same, based on the order statistics $X_{[1]}, \ldots, X_{[r_1]}$ and $Y_{[1]}, \ldots, Y_{[r_2]}$, where $1 \le r_1 \le n_1$ and $1 \le r_2 \le n_2$. A new test is given and an asymptotic optimum property of the test is proved.

Let $(X_n, n \ge 1)$ be an i.i.d. sequence of positive random variables with distribution function $H$. Let $\varphi_H := \{(n, X_n), n \ge 1\}$ be the associated observation process. We view $\varphi_H$ as a measure on $E := [0, \infty) \times (0, \infty]$, where $\varphi_H(A)$ is the number of points of $\varphi_H$ which lie in $A$. A family $(V_s, s > 0)$ of transformations is defined on $E$ in such a way that for suitable $H$ the distributions of $(V_s \varphi_H, s > 0)$ satisfy a large deviation principle and that a related Strassen-type law of the iterated logarithm also holds. Some consequent large deviation principles and loglog laws are derived for extreme values. Similar results are proved for $\varphi_H$ replaced by certain planar Poisson processes.

Summary A group of n persons has to decide on one out of k alternatives. To achieve this end each pair of alternatives is put to a vote. It is assumed that each person ranks the k alternatives according to an individual preference scale and that on every vote between two alternatives he will vote for the alternative that occurs higher on his scale. If n is odd and an alternative obtains a majority on each of the ( k - 1) occasions on which it is put to a vote, the group will decide on that alternative. If no such winning alternative exists, a paradox of voting is said to occur. For even values of n the definition of a paradox is slightly more complicated.
On the assumption that the preference scales of the $n$ persons are obtained by $n$ independent random drawings from the $k!$ permutations of the numbers $1, 2, \ldots, k$, we discuss the computation of the probability of a paradox of voting, $P_{k,n}$. Values of $P_{3,n}$ and $P_k = \lim_{n \to \infty} P_{k,n}$ are given.
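For small odd n the definition above can be checked by brute force. The sketch below enumerates every profile of n rankings drawn from the k! permutations and counts those with no alternative that wins all of its k − 1 pairwise votes. It is an illustration of the definition only, not the computational method of the paper, and the names `has_majority_winner` and `paradox_probability` are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product


def has_majority_winner(profile, k):
    """True if some alternative beats every other one in a pairwise majority vote."""
    n = len(profile)
    for a in range(k):
        # a wins against b when more than half of the voters rank a above b.
        if all(
            2 * sum(r.index(a) < r.index(b) for r in profile) > n
            for b in range(k)
            if b != a
        ):
            return True
    return False


def paradox_probability(k, n):
    """Exact probability of a paradox of voting for odd n, by full enumeration.

    Assumes each voter's ranking is drawn independently and uniformly
    from the k! permutations, as stated in the abstract.
    """
    if n % 2 == 0:
        raise ValueError("this sketch only covers odd n")
    rankings = list(permutations(range(k)))
    paradoxes = sum(
        1
        for profile in product(rankings, repeat=n)
        if not has_majority_winner(profile, k)
    )
    return Fraction(paradoxes, len(rankings) ** n)


if __name__ == "__main__":
    print(paradox_probability(3, 3))
```

For k = 3 and n = 3 the enumeration covers 216 profiles, of which 12 are cyclic, giving 1/18. The cost grows as (k!)^n, so this approach is only practical for very small groups.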

In this paper we investigate two-sample U -statistics in the case of clusters of repeated measurements observed on individuals from independent populations. The observations on the i -th individual in the first population are denoted by     , 1 ≤  i  ≤  m , and those on the k -th individual in the second population are denoted by     , 1 ≤  k  ≤  n . Given the kernel φ ( x ,  y ), we define the generalized two-sample U -statistic by
We derive the asymptotic distribution of $U_{m,n}$ for large sample sizes. As an application we study the generalized Mann–Whitney–Wilcoxon rank sum test for clustered data.

《Statistica Neerlandica》1963,17(3):299-317
Outlyer-ignoring estimators for measurement in duplo.
By hypothesis a measurement u is the sum of two independent random variables, a normal random variable with expectation μ and standard error σ, and a random error φ:

Basically, two independent measurements u1 and u2 of u are to give the estimate x = 1/2(u1 + u2) of μ.
However, to reduce the effect of the error φ on a final estimate of μ, one adds, according to a common practice, a third or even a fourth measurement u3, u4 whenever the basic pair differs by more than a number A. For this extended set of measurements two outlyer-ignoring estimators y and z of μ are defined and investigated against three specifications of the error φ. An outlyer-ignoring estimate of σ is also considered, and its application is illustrated by an example.

《Statistica Neerlandica》1948,2(5-6):228-234
Summary  (Sample size for a single sampling scheme).
The operating characteristic of a sampling scheme may be specified by the producer's 1-in-20 risk point (p1), at which the probability of rejecting a batch is 0.05, and the consumer's 1-in-20 risk point (p2), at which the probability of accepting a batch of that quality is also 0.05.
A nomogram is given (fig. 2) to determine, for single sampling schemes and for given values of p1 and p2, the necessary sample size (n) and the allowable number of defectives in the sample (c).
Conversely, the nomogram may be used to determine the producer's and consumer's 1-in-20 risk points for a given single sampling scheme.
The curves in this nomogram were computed from a table of percentage points of the χ² distribution. For ν > 30 Wilson and Hilferty's approximation to the χ² distribution was used.
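The nomogram itself is not available here, but the quantities it relates can be sketched numerically. The example below searches for a single sampling scheme (n, c) directly from the exact binomial acceptance probability. This is a swapped-in technique, not the χ² construction used for the nomogram, so its answers may differ slightly from the published curves. The names `accept_prob` and `single_sampling_plan` and the `max_n` cut-off are assumptions made for illustration.

```python
from math import comb


def accept_prob(n: int, c: int, p: float) -> float:
    """Probability that a sample of size n contains at most c defectives (binomial)."""
    return sum(comb(n, d) * p ** d * (1 - p) ** (n - d) for d in range(c + 1))


def single_sampling_plan(p1: float, p2: float, risk: float = 0.05, max_n: int = 5000):
    """Return (n, c) with the smallest c, then the smallest n, such that
    a batch of quality p1 is rejected with probability at most `risk` and
    a batch of quality p2 is accepted with probability at most `risk`.

    Returns None if no plan with n <= max_n is found.
    """
    if not 0 < p1 < p2 < 1:
        raise ValueError("expected 0 < p1 < p2 < 1")
    for c in range(max_n):
        # For fixed c the acceptance probability decreases as n grows, so the
        # smallest n meeting the consumer's condition is the best candidate
        # for meeting the producer's condition as well.
        n = c + 1
        while n <= max_n and accept_prob(n, c, p2) > risk:
            n += 1
        if n > max_n:
            return None
        if accept_prob(n, c, p1) >= 1 - risk:
            return n, c
    return None


if __name__ == "__main__":
    plan = single_sampling_plan(0.01, 0.08)
    print("plan (n, c):", plan)
```

Used in the other direction, `accept_prob` can be evaluated over a grid of p for a given (n, c) to locate the two risk points of an existing scheme.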

《Statistica Neerlandica》1960,22(2):103-118
Summary  A branch and bound algorithm is given to solve the following problem: To each pair of elements $(i, j)$ from a set $X = \{1, \ldots, n\}$ a number $r_{ij}$ with $r_{ij} \ge 0$, $r_{ij} = r_{ji}$ and $r_{ii} = 0$ has been assigned. Find a prescribed number of disjoint subsets $P_1, \ldots, P_m$ of $X$, such that

Experiments indicate that an optimal solution is usually found in a small number of iterations, but the verification may be rather time consuming.
The algorithm may be used to find the minimum value of m for which a partitioning of X with z = 0 exists. The algorithm appears to be efficient for finding this 'chromatic number of a graph'.

Let T1 and T2 be two tests of the same hypothesis θ = θ0 concerning the value of a parameter θ. Suppose further that both tests have significance level α and power 1 − β against the alternative hypothesis θ = θ1. If test T1 requires n1 observations and test T2 requires n2 observations, then the relative efficiency of test T1 with respect to test T2 (as tests of θ = θ0 against θ = θ1) is given by e = n2/n1. If the value of θ1 is made to converge to θ0 in a certain way as n1 increases, then in many cases, by making use of a theorem of [...] α and β. This limiting value is called the asymptotic relative efficiency (in the sense of Pitman). This article gives a survey of what is known about the asymptotic relative efficiency of a number of distribution-free tests with respect to the corresponding standard tests.
The author's conclusion is that when using distribution-free methods of high efficiency (e.g. Wilcoxon's symmetry test and two-sample test, Kruskal's test for k samples, and the method of m rankings) one can lose only very little information, and that even the use of less efficient distribution-free methods can be justified.

For a wide class of goodness-of-fit statistics based on φ-divergences between hypothetical cell probabilities and observed relative frequencies, asymptotic normality is established under the assumption $n / m_n \to \gamma \in (0, \infty)$, where $n$ denotes the sample size and $m_n$ the number of cells. Related problems concerning the asymptotic distributions of φ-divergence errors, and of φ-divergence deviations of histogram estimators from their expected values, are also considered.
