Abstract: | We study the problem of parameter inference in (possibly non-linear and non-smooth) econometric models when the data are measured with error. We allow for arbitrary correlation between the true variables and the measurement errors. To solve the identification problem, we require the existence of an auxiliary data set that contains information about the conditional distribution of the true variables given the mismeasured variables. Our main assumption is that this conditional distribution is the same in the primary and auxiliary data. Our methods allow the auxiliary data to be a validation sample, where the primary and validation data are drawn from the same distribution, and, more importantly, a stratified sample, where the auxiliary data set is not drawn from the same distribution as the primary data. We also show how to combine the two data sets to obtain a more efficient estimator of the parameter of interest. We establish the large-sample properties of the sieve-based estimators under verifiable conditions. In particular, we allow the mismeasured variables to have unbounded support without employing the tedious trimming scheme typically used in kernel-based methods. We illustrate our methods by estimating a censored quantile regression of returns to schooling using the CPS/SSR 1978 exact match files, where the dependent variable is measured with error of an arbitrary kind. |