20 similar documents found (search time: 0 ms)
1.
J. J. A. Moors, R. Th. A. Wagemakers, V. M. J. Coenen, R. M. J. Heuts, M. J. B. T. Janssens 《Statistica Neerlandica》1996,50(3):417-430
Modelling an empirical distribution by means of a simple theoretical distribution is an interesting issue in applied statistics. A reasonable first step in this modelling process is to demand that measures for location, dispersion, skewness and kurtosis for the two distributions coincide. Up to now, the four measures used for this purpose have been based on moments.
In this paper measures are considered which are based on quantiles. Of course, the four values of these quantile measures do not uniquely determine the modelling distribution. They do, however, within specific systems of distributions, like Pearson's or Johnson's; they share this property with the four moment-based measures.
This opens the possibility of modelling an empirical distribution—within a specific system—by means of quantile measures. Since moment-based measures are sensitive to outliers, this approach may lead to a better fit. Further, tests of fit—e.g. a test for normality—may be constructed based on quantile measures. In view of the robustness property, these tests may achieve higher power than the classical moment-based tests.
For both applications the limiting joint distribution of the quantile measures is needed; it is derived here as well.
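As a concrete illustration of such quantile-based measures, the sketch below computes location, dispersion, skewness and kurtosis from the sample octiles. The specific definitions (median, interquartile range, Bowley's quartile skewness and Moors' octile kurtosis) are common quantile analogues and are assumptions here insofar as the abstract does not spell out the paper's exact choices.

```python
import numpy as np

def quantile_measures(x):
    """Quantile-based analogues of the four moment measures.

    E1..E7 are the octiles (quantiles at i/8). Skewness is Bowley's
    quartile coefficient; kurtosis is Moors' octile-based measure
    (approximately 1.23 for a normal distribution).
    """
    e = np.quantile(x, [i / 8 for i in range(1, 8)])   # E1..E7
    location = e[3]                                    # median (E4)
    dispersion = e[5] - e[1]                           # IQR = E6 - E2
    skewness = (e[5] + e[1] - 2 * e[3]) / (e[5] - e[1])
    kurtosis = ((e[6] - e[4]) + (e[2] - e[0])) / (e[5] - e[1])
    return location, dispersion, skewness, kurtosis

rng = np.random.default_rng(0)
print(quantile_measures(rng.normal(size=10_000)))
```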
2.
The assumption of normality has underlain much of the development of statistics, including spatial statistics, and many tests have been proposed. In this work, we focus on the multivariate setting and first review the recent advances in multivariate normality tests for i.i.d. data, with emphasis on the skewness and kurtosis approaches. We show through simulation studies that some of these tests cannot be used directly for testing normality of spatial data. We further review briefly the few existing univariate tests under dependence (time or space), and then propose a new multivariate normality test for spatial data that accounts for the spatial dependence. The new test utilises the union-intersection principle to decompose the null hypothesis into intersections of univariate normality hypotheses for projection data, and it rejects multivariate normality if any individual hypothesis is rejected. The individual tests of univariate normality use a Jarque–Bera type statistic that accounts for the spatial dependence in the data. We also show in simulation studies that the new test controls the type I error well and has high empirical power, especially for large sample sizes. We further illustrate our test on bivariate wind data over the Arabian Peninsula.
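For reference, the classical i.i.d. Jarque–Bera statistic that the projection tests build on is sketched below; the paper's version replaces the i.i.d. normalisation with one that accounts for spatial dependence, which this minimal sketch omits.

```python
import numpy as np
from scipy import stats

def jarque_bera_stat(x):
    """Classical Jarque-Bera statistic for an i.i.d. sample.

    JB = n/6 * (S^2 + (K - 3)^2 / 4); asymptotically chi-squared(2)
    under normality. The spatial version adjusts the normalisation
    for dependence in the data.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    z = (x - x.mean()) / x.std()
    s = np.mean(z ** 3)                     # sample skewness
    k = np.mean(z ** 4)                     # sample kurtosis
    jb = n / 6.0 * (s ** 2 + (k - 3.0) ** 2 / 4.0)
    pval = stats.chi2.sf(jb, df=2)
    return jb, pval

rng = np.random.default_rng(0)
print(jarque_bera_stat(rng.normal(size=1_000)))
```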
3.
Two methods are used to correct the traditional parametric approach to option pricing. The first uses the skewness and kurtosis of log stock returns to adjust the option price produced by the traditional pricing method; the second builds a GARCH model to forecast the volatility of stock returns, correcting the constant-volatility assumption of the traditional method. An empirical analysis of the Guodian CWB1 (580022) warrant shows that the corrected option prices still deviate substantially from the actual warrant prices, and an explanation for this empirical result is offered.
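A minimal sketch of the second correction, assuming daily percent log returns and the Python `arch` package: fit a GARCH(1,1), forecast the variance path to expiry, and feed the annualised volatility into the Black–Scholes formula. The synthetic data and every parameter value (horizon, strike, rate) are illustrative assumptions, not the paper's calibration.

```python
import numpy as np
import pandas as pd
from scipy.stats import norm
from arch import arch_model  # pip install arch

def bs_call(S, K, T, r, sigma):
    """Black-Scholes European call price."""
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

rng = np.random.default_rng(1)
returns = pd.Series(rng.normal(0.0, 1.5, 1000))   # synthetic % log returns

# Fit GARCH(1,1) and forecast the daily variance path to expiry.
res = arch_model(returns, vol="GARCH", p=1, q=1).fit(disp="off")
h = 60                                            # trading days to expiry
var_path = res.forecast(horizon=h).variance.values[-1]
sigma_ann = np.sqrt(252 * var_path.mean()) / 100  # annualised volatility

print(bs_call(S=10.0, K=9.5, T=h / 252, r=0.03, sigma=sigma_ann))
```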
4.
This paper is a review of many of the dozens of procedures currently available for testing a data set for goodness-of-fit to the multivariate normal distribution. A majority of the procedures can be placed into one of four basic categories. Most procedures are multivariate extensions or adaptations of procedures used for testing univariate normality. Results of several power studies are summarized, and an extensive bibliography of literature pertaining to testing for multivariate normality is provided.
5.
Jaap P.L. Brand, Stef van Buuren, Karin Groothuis-Oudshoorn, Edzard S. Gelsema† 《Statistica Neerlandica》2003,57(1):36-45
This paper outlines a strategy to validate multiple imputation methods. Rubin's criteria for proper multiple imputation are the point of departure. We describe a simulation method that yields insight into various aspects of bias and efficiency of the imputation process. We propose a new method for creating incomplete data under a general Missing At Random (MAR) mechanism. Software implementing the validation strategy is available as a SAS/IML module. The method is applied to investigate the behavior of polytomous regression imputation for categorical data.
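The core idea behind creating MAR incomplete data is that the missingness probability may depend only on observed values. A minimal sketch of that idea (not the paper's general mechanism or its SAS/IML implementation; the logistic form is an assumption):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1_000
x = rng.normal(size=n)                      # fully observed covariate
y = 2.0 + 0.5 * x + rng.normal(size=n)      # variable to receive missingness

# MAR: the probability that y is missing depends only on the observed x.
p_miss = 1.0 / (1.0 + np.exp(-(x - 0.5)))   # logistic in x (assumed form)
y_obs = np.where(rng.uniform(size=n) < p_miss, np.nan, y)

print(f"fraction missing: {np.isnan(y_obs).mean():.2f}")
```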
6.
Katarzyna Kopczewska 《Statistica Neerlandica》2014,68(4):251-266
The paper deals with the statistical modeling of convergence and cohesion over time with the use of kurtosis, skewness and L-moments. Changes in the shape of the distribution related to the spatial allocation of socio-economic phenomena are taken as evidence of global shift, divergence or convergence. Cross-sectional time-series statistical modeling of the variables of interest is intended to overcome the shortcomings of theoretical econometric models of convergence and cohesion determinants. L-moments prove much more stable and interpretable than classical measures. Empirical evidence from panel data shows that a single pure pattern (global shift, polarization or cohesion) rarely exists and joint analysis is required.
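For readers unfamiliar with L-moments, the sketch below computes the first four sample L-moments with Hosking's unbiased probability-weighted-moment estimators, together with the L-skewness and L-kurtosis ratios used in analyses like this one.

```python
import numpy as np
from math import comb

def l_moments(x):
    """First four sample L-moments plus L-skewness and L-kurtosis.

    Uses Hosking's unbiased probability-weighted-moment estimators
    b_r on the ascending order statistics.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    b = [np.mean([comb(i, r) / comb(n - 1, r) * x[i] for i in range(n)])
         for r in range(4)]
    l1 = b[0]
    l2 = 2 * b[1] - b[0]
    l3 = 6 * b[2] - 6 * b[1] + b[0]
    l4 = 20 * b[3] - 30 * b[2] + 12 * b[1] - b[0]
    return l1, l2, l3 / l2, l4 / l2   # location, scale, L-skew, L-kurt

rng = np.random.default_rng(0)
print(l_moments(rng.normal(size=5_000)))  # L-kurtosis of a normal is about 0.1226
```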
7.
List augmentation with model based multiple imputation: a case study using a mixed-outcome factor model
This study concerns list augmentation in direct marketing. List augmentation is a special case of missing data imputation. We review previous work on the mixed-outcome factor model and apply it for the purpose of list augmentation. The model deals with both discrete and continuous variables and allows us to augment the data for all subjects in a company's transaction database with soft data collected in a survey among a sample of those subjects. We propose a bootstrap-based imputation approach, which is appealing to use in combination with the factor model, since it allows one to include estimation uncertainty in the imputation procedure in a simple, yet adequate manner. We provide an empirical case study applying the approach to the transaction database of a bank.
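The bootstrap idea — refit the imputation model on a resample of the complete cases before each draw, so that estimation uncertainty propagates into the imputations — can be sketched generically. A simple linear model stands in for the mixed-outcome factor model here; all data and settings are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
n, m = 500, 5                                   # sample size, imputations
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
y[rng.uniform(size=n) < 0.3] = np.nan           # 30% missing values

obs = ~np.isnan(y)
imputations = []
for _ in range(m):
    # Bootstrap the complete cases and refit the imputation model, so
    # that estimation uncertainty propagates into the imputed values.
    idx = rng.choice(np.flatnonzero(obs), size=obs.sum(), replace=True)
    beta = np.polyfit(x[idx], y[idx], deg=1)    # stand-in imputation model
    resid_sd = np.std(y[idx] - np.polyval(beta, x[idx]))
    y_imp = y.copy()
    y_imp[~obs] = np.polyval(beta, x[~obs]) + rng.normal(0, resid_sd, (~obs).sum())
    imputations.append(y_imp)
```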
8.
The most common way for treating item non-response in surveys is to construct one or more replacement values to fill in for a missing value. This process is known as imputation. We distinguish single from multiple imputation. Single imputation consists of replacing a missing value by a single replacement value, whereas multiple imputation uses two or more replacement values. This article reviews various imputation procedures used in National Statistical Offices as well as the properties of point and variance estimators in the presence of imputed survey data. It also provides the reader with newer developments in the field.
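After multiple imputation, the completed-data analyses are combined with Rubin's rules; a minimal sketch with the standard formulas (the numbers in the usage line are illustrative):

```python
import numpy as np

def rubin_pool(estimates, variances):
    """Combine m completed-data estimates with Rubin's rules.

    estimates, variances: length-m sequences of the point estimate and
    its squared standard error from each completed (imputed) data set.
    """
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = q.size
    q_bar = q.mean()                        # pooled point estimate
    w = u.mean()                            # within-imputation variance
    b = q.var(ddof=1)                       # between-imputation variance
    t = w + (1 + 1 / m) * b                 # total variance
    return q_bar, t

print(rubin_pool([1.02, 0.97, 1.05], [0.040, 0.050, 0.045]))
```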
9.
This paper discusses the importance of managing data quality in academic research in its relation to satisfying the customer. The focus is on data completeness, an objective dimension of data quality, in relation to recent advancements in methods for analysing incomplete multivariate data. An overview and comparison of the traditional techniques with the recent advancements are provided. Multiple imputation is also discussed as a method of analysing incomplete multivariate data that can reduce some of the biases that arise from using some of the traditional techniques. Despite these recent advancements in the analysis of incomplete multivariate data, evidence is presented which shows that researchers are not using these techniques to manage the data quality of their current research across a variety of academic disciplines. An analysis is then provided as to why these techniques have not been adopted, along with suggestions to improve the frequency of their use in the future.
Source-Reference. The ideas for this paper originated from research work on David J. Fogarty's Ph.D. dissertation. The subject area is the use of advanced techniques for the imputation of incomplete multivariate data on corporate data warehouses.
10.
We examine the impact of higher order moments of changes in the exchange rate on stock returns of U.S. large-cap companies in the S&P500. We find a robust negative effect of exchange rate volatility on S&P500 company returns. The consumer discretionary and the consumer staples sectors have significant negative exposure to exchange rate volatility, suggesting that exchange rate volatility affects stock returns through the channel of international operations. In terms of industries, the household products and personal products industries have significant negative exposure as well. The impact in the financial sector suggests that derivatives and hedging activity can mitigate exposure to exchange rate volatility. We find weak evidence that exchange rate skewness has an effect on S&P500 stock returns, but find evidence that exchange rate kurtosis affects returns of companies that are more exposed to exchange rate volatility.
11.
Empirical count data are often zero-inflated and overdispersed. Currently, there is no software package that allows adequate imputation of these data. We present multiple-imputation routines for these kinds of count data based on a Bayesian regression approach or, alternatively, on a bootstrap approach, which work as add-ons for the popular multiple imputation by chained equations (mice) software in R (van Buuren and Groothuis-Oudshoorn, Journal of Statistical Software, vol. 45, 2011, p. 1). We demonstrate in a Monte Carlo simulation that our procedures are superior to currently available count data procedures. It is emphasized that thorough modeling is essential to obtain plausible imputations and that model mis-specifications can bias parameter estimates and standard errors quite noticeably. Finally, the strengths and limitations of our procedures are discussed, and fruitful avenues for future theory and software development are outlined.
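A minimal sketch of the bootstrap flavour for a single incomplete count variable, using a plain Poisson regression as the imputation model; the paper's routines extend this to zero-inflated and overdispersed models inside mice, and the data, model and settings below are illustrative assumptions.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 800
x = rng.normal(size=n)
y = rng.poisson(np.exp(0.3 + 0.6 * x)).astype(float)
y[rng.uniform(size=n) < 0.25] = np.nan       # 25% missing counts

obs = ~np.isnan(y)
X = sm.add_constant(x)
# Bootstrap step: refit the Poisson imputation model on resampled cases.
idx = rng.choice(np.flatnonzero(obs), size=obs.sum(), replace=True)
fit = sm.GLM(y[idx], X[idx], family=sm.families.Poisson()).fit()
# Draw imputations from the fitted predictive Poisson distribution.
mu_mis = fit.predict(X[~obs])
y_imp = y.copy()
y_imp[~obs] = rng.poisson(mu_mis)
```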
12.
This paper analyzes regional price differentials in Poland at the NUTS-2 and NUTS-3 levels. It applies unique raw-price data and calculates regional purchasing power parity (PPP) deflators for the 16 NUTS-2 regions. It then estimates PPP deflators for the 66 NUTS-3-level regions by applying the multiple imputation approach. Finally, it examines whether intra- or interregional price inequalities have the greater influence on the overall level of price inequality. It is found that price levels are significantly higher than the average in the better-developed regions and lower in the lagging ones. It is also found that intraregional rather than interregional differentials have the greater influence on the overall inequality level.
13.
Hot deck imputation is a method for handling missing data in which each missing value is replaced with an observed response from a "similar" unit. Despite being used extensively in practice, the theory is not as well developed as that of other imputation methods. We have found that no consensus exists as to the best way to apply the hot deck and obtain inferences from the completed data set. Here we review different forms of the hot deck and existing research on its statistical properties. We describe applications of the hot deck currently in use, including the U.S. Census Bureau's hot deck for the Current Population Survey (CPS). We also provide an extended example of variations of the hot deck applied to the third National Health and Nutrition Examination Survey (NHANES III). Some potential areas for future research are highlighted.
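A minimal sketch of a random hot deck within adjustment cells: each missing value is filled with a randomly drawn observed value (a donor) from the same cell. The data, cell variable and column names are illustrative.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(11)
df = pd.DataFrame({
    "stratum": rng.integers(0, 3, 200),          # adjustment cell / class
    "income": rng.lognormal(10, 0.5, 200),
})
df.loc[rng.uniform(size=200) < 0.2, "income"] = np.nan

def random_hot_deck(df, by, col, rng):
    """Replace each missing value in `col` with a randomly drawn observed
    value (donor) from the same `by` class -- a simple random hot deck."""
    out = df.copy()
    for _, idx in out.groupby(by).groups.items():
        block = out.loc[idx, col]
        donors = block.dropna().to_numpy()
        n_miss = block.isna().sum()
        if n_miss and donors.size:
            out.loc[block.index[block.isna()], col] = rng.choice(donors, n_miss)
    return out

completed = random_hot_deck(df, "stratum", "income", rng)
```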
14.
The missing data problem has been widely addressed in the literature. The traditional methods for handling missing data may not be suited to spatial data, which can exhibit distinctive structures of dependence and/or heterogeneity. As a possible solution to the spatial missing data problem, this paper proposes an approach that combines the Bayesian Interpolation method [Benedetti, R. & Palma, D. (1994) Markov random field-based image subsampling method, Journal of Applied Statistics, 21(5), 495–509] with a multiple imputation procedure. The method is developed in a univariate and a multivariate framework, and its performance is evaluated through an empirical illustration based on data related to labour productivity in European regions.
15.
Susanne Rässler 《Statistica Neerlandica》2003,57(1):58-74
Data fusion or statistical matching techniques merge datasets from different survey samples to achieve a complete but artificial data file which contains all variables of interest. The merging of datasets is usually done on the basis of variables common to all files, but traditional methods implicitly assume conditional independence between the variables never jointly observed given the common variables. We therefore suggest tackling the data fusion task with more flexible, model-based procedures. By means of suitable multiple imputation techniques, the identification problem inherent in statistical matching is made explicit. Here a non-iterative Bayesian version of Rubin's implicit regression model is presented and compared in a simulation study with imputations from a data augmentation algorithm as well as an iterative approach using chained equations.
16.
Geert Molenberghs, Herbert Thijs, Michael G. Kenward, Geert Verbeke 《Statistica Neerlandica》2003,57(1):112-135
Even though models for incomplete longitudinal data are in common use, they are surrounded with problems, largely due to the untestable nature of the assumptions one has to make regarding the missingness mechanism. Two extreme views on how to deal with this problem are (1) to avoid incomplete data altogether and (2) to construct ever more complicated joint models for the measurement and missingness processes. In this paper, it is argued that a more versatile approach is to embed the treatment of incomplete data within a sensitivity analysis. Several such sensitivity analysis routes are presented and applied to a case study, the milk protein trial analyzed before by Diggle and Kenward (1994). Apart from the use of local influence methods, some emphasis is put on pattern-mixture modeling. In the latter case, it is shown how multiple-imputation ideas can be used to define a practically feasible modeling strategy.
17.
Gerko Vink, Laurence E. Frank, Jeroen Pannekoek, Stef van Buuren 《Statistica Neerlandica》2014,68(1):61-90
Multiple imputation methods properly account for the uncertainty of missing data. One of those methods for creating multiple imputations is predictive mean matching (PMM), a general purpose method. Little is known about the performance of PMM in imputing non-normal semicontinuous data (skewed data with a point mass at a certain value and otherwise continuously distributed). We investigate the performance of PMM as well as dedicated methods for imputing semicontinuous data by performing simulation studies under univariate and multivariate missingness mechanisms. We also investigate the performance on real-life datasets. We conclude that PMM performance is at least as good as the investigated dedicated methods for imputing semicontinuous data and, in contrast to other methods, is the only method that yields plausible imputations and preserves the original data distributions.
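A minimal sketch of the PMM idea: fit a model on the observed cases, then for each missing case borrow the observed value of one of the k donors whose predicted means are closest. Full PMM as implemented in mice also perturbs the regression coefficients (a Bayesian or bootstrap step), which this sketch omits; all names and settings are illustrative.

```python
import numpy as np

def pmm_impute(x, y, k=5, rng=None):
    """Single predictive-mean-matching imputation of missing y given x."""
    if rng is None:
        rng = np.random.default_rng()
    obs, mis = ~np.isnan(y), np.isnan(y)
    beta = np.polyfit(x[obs], y[obs], deg=1)         # imputation model
    yhat_obs = np.polyval(beta, x[obs])
    yhat_mis = np.polyval(beta, x[mis])
    y_out, y_obs = y.copy(), y[obs]
    for j, yh in zip(np.flatnonzero(mis), yhat_mis):
        donors = np.argsort(np.abs(yhat_obs - yh))[:k]   # k nearest donors
        y_out[j] = y_obs[rng.choice(donors)]             # borrow observed value
    return y_out

rng = np.random.default_rng(5)
x = rng.normal(size=300)
y = x + rng.normal(size=300)
y[rng.uniform(size=300) < 0.3] = np.nan
y_completed = pmm_impute(x, y, k=5, rng=rng)
```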
18.
Since the work of Little and Rubin (1987), no substantial advances in the analysis of explanatory regression models for incomplete data with data missing not at random have been achieved, mainly due to the difficulty of verifying the randomness of the unknown data. In practice, the analysis of nonrandom missing data is done with techniques designed for datasets with random or completely random missing data, such as complete case analysis, mean imputation, regression imputation, maximum likelihood or multiple imputation. However, the data conditions required to minimize the bias derived from an incorrect analysis have not been fully determined. In the present work, several Monte Carlo simulations have been carried out to establish the best strategy of analysis for random missing data applicable in datasets with nonrandom missing data. The factors involved in the simulations are sample size, percentage of missing data, predictive power of the imputation model and existence of interaction between predictors. The results show that the smallest bias is obtained with maximum likelihood and multiple imputation techniques, although with low percentages of missing data, absence of interaction and high predictive power of the imputation model (frequent data structures in research on child and adolescent psychopathology) acceptable results are obtained with the simplest regression imputation.
19.
Since the introduction of the Autoregressive Conditional Heteroscedasticity (ARCH) model, the literature on modeling the time-varying second-order conditional moment has become increasingly popular in the last four decades. Its popularity is partly due to its success in capturing volatility in financial time series, which is useful for modeling and predicting risk for financial assets. A natural extension of this is to model time variation in higher-order conditional moments, such as the third and fourth moments, which are related to skewness and kurtosis (tail risk). This leads to an emerging literature on time-varying higher-order conditional moments in the last two decades. This paper outlines recent developments in modeling time-varying higher-order conditional moments in the economics and finance literature. Using the Generalized Autoregressive Conditional Heteroscedasticity (GARCH) framework as a foundation, this paper provides an overview of the two most common approaches for modeling time-varying higher-order conditional moments: autoregressive conditional density (ARCD) and autoregressive conditional moment (ARCM). The discussion covers both the theoretical and empirical aspects of the literature. This includes the identification of the associated skewness–kurtosis domain by using the solutions to the classical moment problems, the structural and statistical properties of the models used to model the higher-order conditional moments and the computational challenges in estimating these models. We also advocate the use of a maximum entropy density (MED) as an alternative method, which circumvents some of the issues prevalent in these common approaches.
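For reference, the GARCH(1,1) variance recursion that this literature builds on, together with a generic ARCM-style recursion for a higher-order conditional moment; the second equation is an illustrative form only, since the exact specifications differ across the papers reviewed.

```latex
% GARCH(1,1) conditional variance:
\sigma_t^2 = \omega + \alpha\,\varepsilon_{t-1}^2 + \beta\,\sigma_{t-1}^2,
\qquad \varepsilon_t = \sigma_t z_t,\quad z_t \sim \mathrm{i.i.d.}(0,1).

% Generic ARCM-style recursion for the conditional third moment s_t
% (illustrative form; actual specifications vary across the literature):
s_t = \gamma_0 + \gamma_1\, z_{t-1}^{3} + \gamma_2\, s_{t-1}.
```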
20.
Eric Schulte Nordholt 《Revue internationale de statistique》1998,66(2):157-180
When conducting surveys, two kinds of nonresponse may cause incomplete data files: unit nonresponse (complete nonresponse) and item nonresponse (partial nonresponse). The selectivity of the unit nonresponse is often corrected for. Various imputation techniques can be used for the missing values caused by item nonresponse. Several of these imputation techniques are discussed in this report. One is hot deck imputation. This paper describes two simulation experiments with the hot deck method. In the first study, data are randomly generated, and various percentages of missing values are then non-randomly 'added' to the data. The hot deck method is used to reconstruct the data in this Monte Carlo experiment. The performance of the method is evaluated for the means, standard deviations, and correlation coefficients and compared with the available case method. In the second study, the quality of an imputation method is studied by running a simulation experiment. A selection of the data of the Dutch Housing Demand Survey is perturbed by leaving out specific values on a variable. Again hot deck imputations are used to reconstruct the data. The imputations are then compared with the true values. In both experiments the conclusion is that the hot deck method generally performs better than the available case method. This paper also deals with the questions of which variables should be imputed and how long the imputation process takes. Finally, the theory is illustrated by the imputation approaches of the Dutch Housing Demand Survey, the European Community Household Panel Survey (ECHP) and the new Dutch Structure of Earnings Survey (SES). These examples illustrate the levels of missing data that can be experienced in such surveys and the practical problems associated with choosing an appropriate imputation strategy for key items from each survey.