Similar Articles
20 similar articles found (search time: 828 ms)
1.
Record linkage is the act of bringing together records from two files that are believed to belong to the same unit (e.g., a person or business). It is a low-cost way of increasing the set of variables available for analysis. Errors may arise in the linking process if an error-free unit identifier is not available. Two types of linking errors include an incorrect link (records belonging to two different units are linked) and a missed record (an unlinked record for which a correct link exists). Naively ignoring linkage errors may mean that analysis of the linked file is biased. This paper outlines a "weighting approach" to making correct inference about regression coefficients and population totals in the presence of such linkage errors. This approach is designed for analysts who do not have the expertise or time to use specialist software required by other approaches but who are comfortable using weights in inference. The performance of the estimator is demonstrated in a simulation study.
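
As a rough illustration of the general idea of adjusting regression for linkage error (a minimal sketch under the classical exchangeable linkage-error model, not the specific weighting estimator proposed in the paper; the correct-link probability lam and all variable names are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)
n, lam = 1000, 0.9          # sample size and assumed probability that a link is correct
beta = np.array([1.0, 2.0])

X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ beta + rng.normal(scale=0.5, size=n)

# Exchangeable linkage-error model (roughly): with probability lam a record keeps its own y,
# otherwise it is linked to another record's y at random.
wrong = rng.random(n) > lam
y_star = y.copy()
y_star[wrong] = rng.permutation(y)[wrong]

# Naive OLS ignores the linkage errors and is attenuated towards the mean.
beta_naive = np.linalg.lstsq(X, y_star, rcond=None)[0]

# Bias-adjusted least squares: under this model E[y*] = A X beta with
# A = lam*I + (1-lam)/(n-1)*(J - I), so regress y* on A X instead of X.
A_X = lam * X + (1 - lam) * (X.sum(axis=0) - X) / (n - 1)
beta_adj = np.linalg.lstsq(A_X, y_star, rcond=None)[0]

print("naive:", beta_naive, "adjusted:", beta_adj)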

2.
Linkage errors can occur when probability-based methods are used to link records from two distinct data sets corresponding to the same target population. Current approaches to modifying standard methods of regression analysis to allow for these errors only deal with the case of two linked data sets and assume that the linkage process is complete, that is, all records on the two data sets are linked. This study extends these ideas to accommodate the situation when more than two data sets are probabilistically linked and the linkage is incomplete.

3.
Computerised Record Linkage methods help us combine multiple data sets from different sources when a single data set with all necessary information is unavailable or when data collection on additional variables is time consuming and extremely costly. Linkage errors are inevitable in the linked data set because of the unavailability of error-free unique identifiers. A small amount of linkage errors can lead to substantial bias and increased variability in estimating parameters of a statistical model. In this paper, we propose a unified theory for statistical analysis with linked data. Our proposed method, unlike the ones available for secondary data analysis of linked data, exploits record linkage process data as an alternative to taking a costly sample to evaluate error rates from the record linkage procedure. A jackknife method is introduced to estimate bias, covariance matrix and mean squared error of our proposed estimators. Simulation results are presented to evaluate the performance of the proposed estimators that account for linkage errors.
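
The jackknife referred to here is the standard delete-one device; a generic sketch for estimating the bias and variance of an arbitrary scalar estimator might look as follows (illustrative only, not the paper's linkage-adjusted estimators):

import numpy as np

def jackknife(data, estimator):
    """Delete-one jackknife estimates of bias and variance for a scalar estimator."""
    data = np.asarray(data)
    n = len(data)
    theta_hat = estimator(data)
    # Leave-one-out replicates
    theta_i = np.array([estimator(np.delete(data, i)) for i in range(n)])
    theta_bar = theta_i.mean()
    bias = (n - 1) * (theta_bar - theta_hat)
    variance = (n - 1) / n * np.sum((theta_i - theta_bar) ** 2)
    return theta_hat - bias, bias, variance   # bias-corrected estimate, bias, variance

rng = np.random.default_rng(1)
sample = rng.lognormal(size=200)
print(jackknife(sample, lambda x: x.var()))   # biased plug-in variance as a toy example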

4.
Linking administrative, survey and census files to enhance dimensions such as time and breadth or depth of detail is now common. Because a unique person identifier is often not available, records belonging to two different units (e.g. people) may be incorrectly linked. Estimating the proportion of links that are correct, called Precision, is difficult because, even after clerical review, there will remain uncertainty about whether a link is in fact correct or incorrect. Measures of Precision are useful when deciding whether or not it is worthwhile linking two files, when comparing alternative linking strategies and as a quality measure for estimates based on the linked file. This paper proposes an estimator of Precision for a linked file that has been created by either deterministic (or rules-based) or probabilistic (where evidence for a link being a match is weighted against the evidence that it is not a match) linkage, both of which are widely used in practice. This paper shows that the proposed estimators perform well.
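
At its simplest, Precision is the share of produced links that are correct; a naive estimate from a clerically reviewed subsample (ignoring the residual review uncertainty the paper addresses) could be sketched as:

import numpy as np

def precision_from_review(n_links, reviewed_correct, reviewed_total):
    """Estimate Precision = (# correct links) / (# links) from a clerically reviewed subsample."""
    p_hat = reviewed_correct / reviewed_total
    se = np.sqrt(p_hat * (1 - p_hat) / reviewed_total)
    # point estimate, rough 95% interval, implied number of correct links in the file
    return p_hat, (p_hat - 1.96 * se, p_hat + 1.96 * se), p_hat * n_links

# e.g. 50,000 links produced, 1,000 reviewed, 940 judged correct (hypothetical numbers)
print(precision_from_review(50_000, 940, 1_000))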

5.
A basic concern in statistical disclosure limitation is the re-identification of individuals in anonymised microdata. Linking against a second dataset that contains identifying information can result in a breach of confidentiality. Almost all linkage approaches are based on comparing the values of variables that are common to both datasets. It is tempting to think that if datasets contain no common variables, then there can be no risk of re-identification. However, linkage has been attempted between such datasets via the extraction of structural information using ordered weighted averaging (OWA) operators. Although this approach has been shown to perform better than randomly pairing records, it is debatable whether it demonstrates a practically significant disclosure risk. This paper reviews some of the main aspects of statistical disclosure limitation. It then goes on to show that a relatively simple, supervised Bayesian approach can consistently outperform OWA linkage. Furthermore, the Bayesian approach demonstrates a significant risk of re-identification for the types of data considered in the OWA record linkage literature.
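
For readers unfamiliar with ordered weighted averaging, a minimal sketch of what an OWA operator computes (the weights are applied to the sorted values, not to fixed positions; the linkage application itself is more involved):

import numpy as np

def owa(values, weights):
    """Ordered weighted average: weights are applied to the values sorted in decreasing order."""
    values = np.sort(np.asarray(values, dtype=float))[::-1]
    weights = np.asarray(weights, dtype=float)
    assert values.shape == weights.shape and np.isclose(weights.sum(), 1.0)
    return float(values @ weights)

x = [0.2, 0.9, 0.5, 0.4]
print(owa(x, [1, 0, 0, 0]))      # weight on the largest value: the maximum
print(owa(x, [0, 0, 0, 1]))      # weight on the smallest value: the minimum
print(owa(x, [0.25] * 4))        # equal weights: the plain mean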

6.
Bootstrapping Financial Time Series
It is well known that time series of returns are characterized by volatility clustering and excess kurtosis. Therefore, when modelling the dynamic behavior of returns, inference and prediction methods based on independent and/or Gaussian observations may be inadequate. As bootstrap methods are not, in general, based on any particular assumption on the distribution of the data, they are well suited for the analysis of returns. This paper reviews the application of bootstrap procedures for inference and prediction of financial time series. In relation to inference, bootstrap techniques have been applied to obtain the sample distribution of statistics for testing, for example, autoregressive dynamics in the conditional mean and variance, unit roots in the mean, fractional integration in volatility and the predictive ability of technical trading rules. In relation to prediction, bootstrap procedures have been used to estimate the distribution of returns, which is of interest, for example, for Value at Risk (VaR) models or for forecasting purposes. Although the application of bootstrap techniques to the empirical analysis of financial time series is very broad, there are few analytical results on the statistical properties of these techniques when applied to heteroscedastic time series. Furthermore, there are quite a few papers where the bootstrap procedures used are not adequate.
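
A minimal sketch of the prediction use case mentioned here: bootstrapping the sampling distribution of a historical-simulation VaR estimate by plain IID resampling of returns (which, as the abstract cautions, ignores volatility clustering; block or residual-based schemes are used in practice; the data are placeholder draws):

import numpy as np

rng = np.random.default_rng(2)
returns = rng.standard_t(df=5, size=1000) * 0.01   # placeholder heavy-tailed daily returns

def var_estimate(r, level=0.99):
    """Historical-simulation Value at Risk: the (1 - level) quantile of returns, sign-flipped."""
    return -np.quantile(r, 1 - level)

B = 2000
boot = np.array([var_estimate(rng.choice(returns, size=returns.size, replace=True))
                 for _ in range(B)])
print("VaR point estimate:", var_estimate(returns))
print("bootstrap 95% interval:", np.quantile(boot, [0.025, 0.975]))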

7.
Here we consider record data from the two-parameter bathtub-shaped distribution. First, we develop simplified forms for the single moments, variances and covariances of records. These distributional properties are quite useful in obtaining the best linear unbiased estimators of the location and scale parameters which can be included in the model. The estimation of the unknown shape parameters and prediction of future unobserved records based on some observed ones are discussed. Frequentist and Bayesian analyses are adopted for the estimation and prediction problems. The likelihood method, a moment-based method, bootstrap methods as well as Bayesian sampling techniques are applied to the inference problems. Point predictors and credible intervals for future record values based on an informative set of records can be developed. Monte Carlo simulations are performed to compare the developed methods, and one real data set is analyzed for illustrative purposes.
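
Assuming the family meant is the Chen (2000) two-parameter bathtub-shaped distribution with cdf F(x) = 1 - exp(lam*(1 - exp(x**beta))) (an assumption; the abstract does not name the family), a small sketch of simulating such data and extracting the observed upper record values, which are the raw material for the estimation and prediction methods described:

import numpy as np

rng = np.random.default_rng(3)

def rchen(size, lam, beta):
    """Draw from the two-parameter bathtub-shaped (Chen, 2000) distribution
    F(x) = 1 - exp(lam * (1 - exp(x**beta))) by inverting the cdf."""
    u = rng.random(size)
    return (np.log(1.0 - np.log(1.0 - u) / lam)) ** (1.0 / beta)

def upper_records(x):
    """Return the sequence of upper record values observed in x."""
    records, current_max = [], -np.inf
    for value in x:
        if value > current_max:
            records.append(value)
            current_max = value
    return np.array(records)

sample = rchen(500, lam=0.5, beta=0.8)
print(upper_records(sample))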

8.
Aspects of statistical analysis in DEA-type frontier models
In Grosskopf (1995) and Banker (1995) different approaches and problems of statistical inference in DEA frontier models are presented. This paper focuses on the basic characteristics of DEA models from a statistical point of view. It arose from comments and discussions on both papers above. The framework of DEA models is deterministic (all the observed points lie on the same side of the frontier); nevertheless, a stochastic model can be constructed once a data generating process is defined. So statistical analysis may be performed and sampling properties of DEA estimators can be established. However, practical statistical inference (such as hypothesis tests and confidence intervals) still needs devices like the bootstrap to be performed. A consistent bootstrap also relies on a clear definition of the data generating process and on a consistent estimator of it: the approach of Simar and Wilson (1995) is described. Finally, some avenues are proposed for introducing stochastic noise into DEA models, in the spirit of the Kneip-Simar (1995) approach.
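
A minimal sketch of the deterministic envelopment estimator this discussion is about: input-oriented, constant-returns DEA scores computed DMU by DMU as linear programmes (illustrative data; the consistent bootstrap of Simar and Wilson additionally smooths the resampled scores, which is not shown):

import numpy as np
from scipy.optimize import linprog

def dea_input_efficiency(X, Y):
    """Input-oriented CRS (CCR) DEA scores.
    X: (n_dmu, n_inputs) inputs, Y: (n_dmu, n_outputs) outputs."""
    n, m = X.shape
    s = Y.shape[1]
    scores = np.empty(n)
    for o in range(n):
        # decision variables: [theta, lambda_1, ..., lambda_n]
        c = np.r_[1.0, np.zeros(n)]
        # inputs:  sum_j lambda_j x_ij <= theta * x_io
        A_in = np.hstack([-X[o].reshape(m, 1), X.T])
        # outputs: sum_j lambda_j y_rj >= y_ro
        A_out = np.hstack([np.zeros((s, 1)), -Y.T])
        A_ub = np.vstack([A_in, A_out])
        b_ub = np.r_[np.zeros(m), -Y[o]]
        bounds = [(0, None)] * (n + 1)
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
        scores[o] = res.x[0]
    return scores

rng = np.random.default_rng(4)
X = rng.uniform(1, 10, size=(20, 2))
Y = (X.sum(axis=1) * rng.uniform(0.5, 1.0, 20)).reshape(-1, 1)
print(dea_input_efficiency(X, Y).round(3))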

9.
In recent years, both the manufacturing and logistics industries in Henan have grown rapidly, yet their development remains unsatisfactory. Cointegration analysis indicates that the level of coordinated development between Henan's manufacturing and logistics industries is low, the root causes being poor supply-chain coordination and insufficient logistics innovation. The province should therefore improve the business environment, establish a linkage mechanism, and promote the integration and process re-engineering of manufacturers' in-house logistics and third-party logistics.

10.
A random sample drawn from a population would appear to offer an ideal opportunity to use the bootstrap in order to perform accurate inference, since the observations of the sample are IID. In this paper, Monte Carlo results suggest that bootstrapping a commonly used index of inequality leads to inference that is not accurate even in very large samples, although inference with poverty indices is satisfactory. We find that the major cause is the extreme sensitivity of many inequality indices to the exact nature of the upper tail of the income distribution. This leads us to study two non-standard bootstraps, the m out of n bootstrap, which is valid in some situations where the standard bootstrap fails, and a bootstrap in which the upper tail is modelled parametrically. Monte Carlo results suggest that accurate inference can be achieved with this last method in moderately large samples.
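
To make the two schemes concrete, a small sketch contrasting the standard bootstrap with a rescaled m out of n bootstrap for a Gini coefficient (illustrative choices of the index, the data and m; the parametric upper-tail variant studied in the paper is not shown):

import numpy as np

rng = np.random.default_rng(5)

def gini(x):
    """Gini coefficient of a non-negative sample."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    return 2.0 * np.sum(np.arange(1, n + 1) * x) / (n * x.sum()) - (n + 1) / n

income = rng.pareto(3.0, size=5000) + 1.0       # heavy-tailed toy incomes
n = income.size
g_hat = gini(income)
B = 999

# Standard bootstrap percentile interval
boot = np.array([gini(rng.choice(income, n, replace=True)) for _ in range(B)])
ci_standard = np.quantile(boot, [0.025, 0.975])

# m out of n bootstrap: resample m < n points, then rescale sqrt(m)*(G*_m - G_hat)
m = int(n ** 0.7)
root = np.array([np.sqrt(m) * (gini(rng.choice(income, m, replace=True)) - g_hat)
                 for _ in range(B)])
q_lo, q_hi = np.quantile(root, [0.025, 0.975])
ci_m_of_n = (g_hat - q_hi / np.sqrt(n), g_hat - q_lo / np.sqrt(n))

print(g_hat, ci_standard, ci_m_of_n)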

11.
The paper investigates the usefulness of bootstrap methods for small sample inference in cointegrating regression models. It discusses the standard bootstrap, the recursive bootstrap, the moving block bootstrap and the stationary bootstrap methods. Some guidelines for bootstrap data generation and test statistics to consider are provided and some simulation evidence presented suggests that the bootstrap methods, when properly implemented, can provide significant improvement over asymptotic inference.
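
As an illustration of one of the schemes listed, the resampling step of the moving block bootstrap (block length and how the resample enters the cointegrating regression statistics are exactly the choices the paper discusses; the series here is a toy example):

import numpy as np

def moving_block_bootstrap(x, block_length, rng):
    """One moving-block-bootstrap resample of a time series (overlapping blocks)."""
    x = np.asarray(x)
    n = x.size
    n_blocks = int(np.ceil(n / block_length))
    starts = rng.integers(0, n - block_length + 1, size=n_blocks)
    blocks = [x[s:s + block_length] for s in starts]
    return np.concatenate(blocks)[:n]

rng = np.random.default_rng(6)
series = np.cumsum(rng.normal(size=300))          # toy I(1) series
resample = moving_block_bootstrap(series, block_length=20, rng=rng)
print(resample[:10])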

12.
Journal of Econometrics, 2002, 108(2): 317–342
This paper proposes the use of the bootstrap for the most commonly applied procedures in inequality, mobility and poverty measurement. In addition to simple inequality index estimation the scenarios considered are inequality difference tests for correlated data, decompositions by sub-group or income source, decompositions of inequality changes, and mobility index and poverty index estimation. Besides showing the consistency of the bootstrap for these scenarios, the paper also develops simple ways to deal with longitudinal correlation and panel attrition or non-response. In principle, all the proposed procedures can be handled by the δ-method, but Monte Carlo evidence suggests that the simplest possible bootstrap procedure should be the preferred method in practice, as it achieves the same accuracy as the δ-method and takes into account the stochastic dependencies in the data without explicitly having to deal with its covariance structure. If a variance estimate is available, then the studentized version of the bootstrap may lead to an improvement in accuracy, but substantially so only for relatively small sample sizes. All results incorporate the possibility that different observations have different sampling weights.

13.
邓基刚 (Deng Jigang), 《价值工程》 (Value Engineering), 2013(33): 133–135
This paper builds a structural equation model of the factors that influence the coordinated management mechanism for abnormal cigarette sales. Taking a municipal tobacco company as the empirical subject and collecting data through a questionnaire survey, confirmatory factor analysis of the model is carried out with the AMOS software to identify the key influencing factors. Based on the empirical results, measures are proposed for improving the efficiency of the coordinated management mechanism for abnormal cigarette sales.

14.
Through Monte Carlo experiments the effects of a feedback mechanism on the accuracy in finite samples of ordinary and bootstrap inference procedures are examined in stable first- and second-order autoregressive distributed-lag models with non-stationary weakly exogenous regressors. The Monte Carlo is designed to mimic situations that are relevant when a weakly exogenous policy variable affects (and is affected by) the outcome of agents' behaviour. In the parameterizations we consider, it is found that small-sample problems undermine ordinary first-order asymptotic inference procedures irrespective of the presence and importance of a feedback mechanism. We examine several residual-based bootstrap procedures, each of them designed to reduce one or several specific types of bootstrap approximation error. Surprisingly, the bootstrap procedure which only incorporates the conditional model overcomes the small sample problems reasonably well. Often (but not always) better results are obtained if the bootstrap also resamples the marginal model for the policymakers' behaviour.
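
A sketch of the simplest of the residual-based schemes compared here: a recursive bootstrap of the conditional model only, holding the weakly exogenous regressor fixed (the ADL(1,1) specification, parameter values and names are illustrative, not the paper's design):

import numpy as np

rng = np.random.default_rng(7)

# Toy ADL(1,1): y_t = a + rho*y_{t-1} + b0*x_t + b1*x_{t-1} + u_t, with x kept fixed in the bootstrap
T = 200
x = np.cumsum(rng.normal(size=T))                          # non-stationary exogenous regressor
a, rho, b0, b1 = 0.1, 0.5, 1.0, -0.3
y = np.zeros(T)
for t in range(1, T):
    y[t] = a + rho * y[t - 1] + b0 * x[t] + b1 * x[t - 1] + rng.normal(scale=0.5)

def ols_adl(y, x):
    Z = np.column_stack([np.ones(T - 1), y[:-1], x[1:], x[:-1]])
    coef, *_ = np.linalg.lstsq(Z, y[1:], rcond=None)
    resid = y[1:] - Z @ coef
    return coef, resid

coef_hat, resid = ols_adl(y, x)

B = 499
boot_rho = np.empty(B)
for b in range(B):
    u_star = rng.choice(resid - resid.mean(), size=T - 1, replace=True)
    y_star = np.zeros(T)
    y_star[0] = y[0]
    for t in range(1, T):                                   # recursive rebuild, x held fixed
        y_star[t] = (coef_hat[0] + coef_hat[1] * y_star[t - 1]
                     + coef_hat[2] * x[t] + coef_hat[3] * x[t - 1] + u_star[t - 1])
    boot_rho[b] = ols_adl(y_star, x)[0][1]

print("rho_hat:", coef_hat[1], "bootstrap 95% interval:", np.quantile(boot_rho, [0.025, 0.975]))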

15.
Subsampling and the m out of n bootstrap have been suggested in the literature as methods for carrying out inference based on post-model selection estimators and shrinkage estimators. In this paper we consider a subsampling confidence interval (CI) that is based on an estimator that can be viewed either as a post-model selection estimator that employs a consistent model selection procedure or as a super-efficient estimator. We show that the subsampling CI (of nominal level 1 − α for any α ∈ (0, 1)) has asymptotic confidence size (defined to be the limit of finite-sample size) equal to zero in a very simple regular model. The same result holds for the m out of n bootstrap provided m²/n → 0 and the observations are i.i.d. Similar zero-asymptotic-confidence-size results hold in more complicated models that are covered by the general results given in the paper and for super-efficient and shrinkage estimators that are not post-model selection estimators. Based on these results, subsampling and the m out of n bootstrap are not recommended for obtaining inference based on post-consistent model selection or shrinkage estimators.

16.
This paper investigates through Monte Carlo experiments both size and power properties of a bootstrapped trace statistic in two prototypical DGPs. The Monte Carlo results indicate that the ordinary bootstrap has similar size and power properties to inference procedures based on asymptotic critical values. Considering empirical size, the stationary bootstrap is found to provide a uniform improvement over the ordinary bootstrap if the dynamics are underspecified. The use of the stationary bootstrap as a diagnostic tool is suggested. In two illustrative examples this seems to work, and again it appears that the bootstrap incorporates the finite-sample correction required for the asymptotic critical values to apply.
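
The stationary bootstrap referred to here resamples blocks of geometrically distributed random length, so the resampled series is itself stationary; the resampling step alone (not the bootstrapped trace test) can be sketched as follows, with a toy series:

import numpy as np

def stationary_bootstrap(x, mean_block_length, rng):
    """One stationary-bootstrap resample (Politis & Romano): geometric block lengths,
    wrapping around the end of the series."""
    x = np.asarray(x)
    n = x.size
    p = 1.0 / mean_block_length          # probability of starting a new block
    idx = np.empty(n, dtype=int)
    idx[0] = rng.integers(n)
    for t in range(1, n):
        if rng.random() < p:
            idx[t] = rng.integers(n)     # start a new block at a random position
        else:
            idx[t] = (idx[t - 1] + 1) % n
    return x[idx]

rng = np.random.default_rng(8)
series = np.cumsum(rng.normal(size=250))
print(stationary_bootstrap(series, mean_block_length=15, rng=rng)[:10])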

17.
This paper presents results from a Monte Carlo study concerning inference with spatially dependent data. We investigate the impact of location/distance measurement errors upon the accuracy of parametric and nonparametric estimators of asymptotic variances. Nonparametric estimators are quite robust to such errors, method of moments estimators perform surprisingly well, and MLE estimators are very poor. We also present and evaluate a specification test based on a parametric bootstrap that has good power properties for the types of measurement error we consider.

18.
In this paper, we propose a fixed design wild bootstrap procedure to test parameter restrictions in vector autoregressive models, which is robust in cases of conditionally heteroskedastic error terms. The wild bootstrap does not require any parametric specification of the volatility process and takes contemporaneous error correlation implicitly into account. Via a Monte Carlo investigation, empirical size and power properties of the method are illustrated for the case of white noise under the null hypothesis. We compare the bootstrap approach with standard ordinary least squares (OLS)-based, weighted least squares (WLS) and quasi-maximum likelihood (QML) approaches. In terms of empirical size, the proposed method outperforms competing approaches and achieves size-adjusted power close to WLS or QML inference. A White correction of standard OLS inference is satisfactory only in large samples. We investigate the case of Granger causality in a bivariate system of inflation expectations in France and the United Kingdom. Our evidence suggests that the former are Granger causal for the latter while for the reverse relation Granger non-causality cannot be rejected.
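
A minimal sketch of a fixed-design wild bootstrap for a bivariate VAR(1): the regressors are held fixed and each period's residual vector is multiplied by a single Rademacher draw, which is how contemporaneous error correlation can be preserved without modelling the volatility process (the data, coefficient of interest and absence of a formal Wald test are illustrative simplifications, not the paper's application):

import numpy as np

rng = np.random.default_rng(9)

# Toy bivariate VAR(1) data with a crude form of heteroskedasticity
T = 300
A = np.array([[0.5, 0.2],
              [0.0, 0.4]])
Y = np.zeros((T, 2))
for t in range(1, T):
    scale = 1.0 + 0.5 * (t > T // 2)
    Y[t] = A @ Y[t - 1] + rng.normal(scale=scale, size=2)

# OLS fit of the VAR(1); Z is the fixed design (constant plus first lag)
Z = np.column_stack([np.ones(T - 1), Y[:-1]])
B_hat, *_ = np.linalg.lstsq(Z, Y[1:], rcond=None)
U_hat = Y[1:] - Z @ B_hat

# Fixed-design wild bootstrap: Y*_t = Z_t B_hat + eta_t * u_hat_t with eta_t in {-1, +1};
# one draw multiplies the whole residual vector at time t, keeping its cross-equation correlation.
nboot = 999
stat = np.empty(nboot)
for b in range(nboot):
    eta = rng.choice([-1.0, 1.0], size=T - 1)[:, None]
    Y_star = Z @ B_hat + eta * U_hat
    B_star, *_ = np.linalg.lstsq(Z, Y_star, rcond=None)
    stat[b] = B_star[2, 0]        # coefficient of the second variable's lag in the first equation

print("estimate:", B_hat[2, 0], "wild-bootstrap 95% band:", np.quantile(stat, [0.025, 0.975]))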

19.
Unfair related-party transactions are widespread among listed companies in China and have a major impact on their performance. Building on an analysis of why such transactions arise, this paper discusses governance countermeasures in terms of improving legislation, IPO review, corporate governance structure, information disclosure, pricing methods and independent auditing.

20.
Many papers have regressed non-parametric estimates of productive efficiency on environmental variables in two-stage procedures to account for exogenous factors that might affect firms' performance. None of these have described a coherent data-generating process (DGP). Moreover, conventional approaches to inference employed in these papers are invalid due to complicated, unknown serial correlation among the estimated efficiencies. We first describe a sensible DGP for such models. We propose single and double bootstrap procedures; both permit valid inference, and the double bootstrap procedure improves statistical efficiency in the second-stage regression. We examine the statistical performance of our estimators using Monte Carlo experiments.

