Similar literature
1.
Penalized Regression with Ordinal Predictors   Total citations: 1, self-citations: 0, citations by others: 1
Ordered categorical predictors are a common case in regression modelling. In contrast to the case of ordinal response variables, ordinal predictors have been largely neglected in the literature. In this paper, existing methods are reviewed and the use of penalized regression techniques is proposed. Based on dummy coding, two types of penalization are explicitly developed: the first imposes a difference penalty, the second is a ridge-type refitting procedure. A Bayesian motivation is also provided. The concept is generalized to the case of non-normal outcomes within the framework of generalized linear models by applying penalized likelihood estimation. Simulation studies and real-world data serve for illustration and to compare the approaches to methods often seen in practice, namely simple linear regression on the group labels and pure dummy coding. The proposed difference penalty in particular turns out to be highly competitive.
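The idea of a difference penalty on the coefficients of adjacent ordinal levels can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a single ordinal predictor, a normal outcome, one mean per level instead of dummy coding with a reference category, and a first-order squared difference penalty. The function name `fit_difference_penalty` and the parameter `lam` are hypothetical.

```python
def fit_difference_penalty(levels, y, n_levels, lam):
    """Estimate one mean per ordinal level by minimizing
    sum_i (y_i - mu[levels_i])**2 + lam * sum_k (mu[k] - mu[k-1])**2.

    Assumption: levels are integers 0..n_levels-1. The normal equations
    form a tridiagonal system, solved here with the Thomas algorithm.
    """
    counts = [0.0] * n_levels
    sums = [0.0] * n_levels
    for k, value in zip(levels, y):
        counts[k] += 1.0
        sums[k] += value

    # Main diagonal: observations in the level plus lam per adjacent level.
    diag = [counts[k] + lam * ((k > 0) + (k < n_levels - 1))
            for k in range(n_levels)]
    off = [-lam] * (n_levels - 1)  # symmetric sub- and super-diagonal

    # Forward elimination.
    c_prime = [0.0] * n_levels
    d_prime = [0.0] * n_levels
    if n_levels > 1:
        c_prime[0] = off[0] / diag[0]
    d_prime[0] = sums[0] / diag[0]
    for k in range(1, n_levels):
        denom = diag[k] - off[k - 1] * c_prime[k - 1]
        if k < n_levels - 1:
            c_prime[k] = off[k] / denom
        d_prime[k] = (sums[k] - off[k - 1] * d_prime[k - 1]) / denom

    # Back substitution.
    mu = [0.0] * n_levels
    mu[-1] = d_prime[-1]
    for k in range(n_levels - 2, -1, -1):
        mu[k] = d_prime[k] - c_prime[k] * mu[k + 1]
    return mu


levels = [0, 0, 1, 1, 2, 2]
y = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
unpenalized = fit_difference_penalty(levels, y, 3, 0.0)  # plain level means
smoothed = fit_difference_penalty(levels, y, 3, 1.0)     # neighbours pulled together
```

With `lam = 0` the sketch reproduces the plain level means (pure dummy coding in this parameterization); as `lam` grows, adjacent level estimates are pulled toward each other.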

2.
This study examined the performance of two alternative estimation approaches in structural equation modeling for ordinal data under different levels of model misspecification, score skewness, sample size, and model size. Both approaches involve analyzing a polychoric correlation matrix as well as adjusting standard error estimates and model chi-squared, but one estimates model parameters with maximum likelihood and the other with robust weighted least squares. Relative bias in parameter estimates and standard error estimates, Type I error rate, and empirical power of the model test, where appropriate, were evaluated through Monte Carlo simulations. These alternative approaches generally provided unbiased parameter estimates when the model was correctly specified. They also provided unbiased standard error estimates and adequate Type I error control in general unless sample size was small and the measured variables were moderately skewed. Differences between the methods in convergence problems and the evaluation criteria, especially under small sample and skewed variable conditions, were discussed.

3.
This paper incorporates text data from MLS listings into a hedonic pricing model. We show that the comments section of the MLS, which is populated by real estate agents who arguably have the most local market knowledge and know what homebuyers value, provides information that improves the performance of both in‐sample and out‐of‐sample pricing estimates. Text is found to decrease pricing error by more than 25%. Information from text is incorporated into a linear model using a tokenization approach. By doing so, the implicit prices for various words and phrases are estimated. The estimation focuses on simultaneous variable selection and estimation for linear models in the presence of a large number of variables using a penalized regression. The LASSO procedure and variants are shown to outperform least‐squares in out‐of‐sample testing. Copyright © 2016 John Wiley & Sons, Ltd.

4.
As a result of novel data collection technologies, it is now common to encounter data in which the number of explanatory variables collected is large, while the number of variables that actually contribute to the model remains small. Thus, a method that can identify those variables with impact on the model without inferring other noneffective ones will make analysis much more efficient. Many methods have been proposed to resolve the model selection problems under such circumstances; however, it is still unknown how large a sample size is sufficient to identify those “effective” variables. In this paper, we apply a sequential sampling method so that the effective variables can be identified efficiently, and the sampling is stopped as soon as the “effective” variables are identified and their corresponding regression coefficients are estimated with satisfactory accuracy, which is new to sequential estimation. Both fixed and adaptive designs are considered. The asymptotic properties of estimates of the number of effective variables and their coefficients are established, and the proposed sequential estimation procedure is shown to be asymptotically optimal. Simulation studies are conducted to illustrate the performance of the proposed estimation method, and a diabetes data set is used as an example.

5.
In this article, we propose new Monte Carlo methods for computing a single marginal likelihood or several marginal likelihoods for the purpose of Bayesian model comparisons. The methods are motivated by Bayesian variable selection, in which the marginal likelihoods for all subset variable models must be computed. The proposed estimates use only a single Markov chain Monte Carlo (MCMC) output from the joint posterior distribution and do not require the specific structure or the form of the MCMC sampling algorithm that is used to generate the MCMC sample to be known. The theoretical properties of the proposed method are examined in detail. The applicability and usefulness of the proposed method are demonstrated via ordinal data probit regression models. A real dataset involving ordinal outcomes is used to further illustrate the proposed methodology.

6.
Forecasting the real estate market is an important part of studying the Chinese economic market. Most existing methods have strict requirements on input variables and are complex in parameter estimation. To obtain better prediction results, a modified Holt's exponential smoothing (MHES) method was proposed to predict the housing price by using historical data. Unlike the traditional exponential smoothing models, MHES sets different weights on historical data and the smoothing parameters depend on the sample size. Meanwhile, the proposed MHES incorporates the whale optimization algorithm (WOA) to obtain the optimal parameters. Housing price data from Kunming, Changchun, Xuzhou and Handan were used to test the performance of the model. The housing price results for the four cities indicate that the proposed method has a smaller prediction error and shorter computation time than other traditional models. Therefore, WOA-MHES can be applied efficiently to housing price forecasting and can be a reliable tool for market investors and policy makers.
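For orientation, the traditional Holt linear exponential smoothing that MHES modifies can be sketched as follows. This is the classical method only, with fixed smoothing parameters; the sample-size-dependent weights of MHES and the WOA parameter search are not specified in the abstract and are not reproduced here. The function name `holt_forecast` and its initialization (first value as level, first difference as trend) are assumptions made for illustration.

```python
def holt_forecast(series, alpha, beta, horizon):
    """Classical Holt linear exponential smoothing.

    alpha smooths the level, beta smooths the trend (both in [0, 1]).
    Returns forecasts for 1..horizon steps beyond the end of the series.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]
    trend = series[1] - series[0]
    for value in series[1:]:
        previous_level = level
        # Level: blend the new observation with the previous one-step forecast.
        level = alpha * value + (1 - alpha) * (previous_level + trend)
        # Trend: blend the latest level change with the previous trend.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]


# Hypothetical price index that grows by one unit per period.
forecasts = holt_forecast([1.0, 2.0, 3.0, 4.0, 5.0], alpha=0.5, beta=0.5, horizon=2)
```

On a perfectly linear series the level and trend track the data exactly, so the forecasts simply continue the line; on real price data the choice of `alpha` and `beta` matters, which is the part the abstract delegates to an optimizer.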

7.
The classical exploratory factor analysis (EFA) finds estimates for the factor loadings matrix and the matrix of unique factor variances which give the best fit to the sample correlation matrix with respect to some goodness-of-fit criterion. Common factor scores can be obtained as a function of these estimates and the data. Alternatively to the classical EFA, the EFA model can be fitted directly to the data which yields factor loadings and common factor scores simultaneously. Recently, new algorithms were introduced for the simultaneous least squares estimation of all EFA model unknowns. The new methods are based on the numerical procedure for singular value decomposition of matrices and work equally well when the number of variables exceeds the number of observations. This paper provides an account that is intended as an expository review of methods for simultaneous parameter estimation in EFA. The methods are illustrated on Harman's five socio-economic variables data and a high-dimensional data set from genome research.

8.
In this article, we merge two strands from the recent econometric literature. The first is factor models based on large sets of macroeconomic variables, which have generally proven useful for forecasting. However, there is some disagreement in the literature as to the appropriate method. The second is forecast methods based on mixed‐frequency data sampling (MIDAS). This regression technique can take into account unbalanced datasets that emerge from publication lags of high‐ and low‐frequency indicators, a problem practitioners have to cope with in real time. In this article, we introduce Factor MIDAS, an approach for nowcasting and forecasting low‐frequency variables like gross domestic product (GDP) exploiting information in a large set of higher‐frequency indicators. We consider three alternative MIDAS approaches (basic, smoothed and unrestricted) that provide harmonized projection methods that allow for a comparison of the alternative factor estimation methods with respect to nowcasting and forecasting. Common to all the factor estimation methods employed here is that they can handle unbalanced datasets, as typically faced in real‐time forecast applications owing to publication lags. In particular, we focus on variants of static and dynamic principal components as well as Kalman filter estimates in state‐space factor models. As an empirical illustration of the technique, we use a large monthly dataset of the German economy to nowcast and forecast quarterly GDP growth. We find that the factor estimation methods do not differ substantially, whereas the most parsimonious MIDAS projection performs best overall. Finally, quarterly models are in general outperformed by the Factor MIDAS models, which confirms the usefulness of the mixed‐frequency techniques that can exploit timely information from business cycle indicators.

9.
A MODEL OF GROWTH AUGMENTED WITH INSTITUTIONS   Total citations: 1, self-citations: 0, citations by others: 1
This paper shows that the inclusion of institutional indicators in a traditional model of growth substantially improves its explanatory capacity. The results have implications for economic policy, because not all the dimensions of institutional quality influence growth to the same extent. A large sample of 165 countries and estimation methods with instrumental variables are used to solve endogeneity problems.

10.
On social surveys, don't knows are a common answer to attitudinal questions, which often have binary or ordinal response categories. Don't knows can be nonrandomly selected according to certain demographic or socioeconomic characteristics of the respondent. To model the sample selection and correct for its bias, this paper discusses two types of bivariate models: the binary probit and the ordinal probit model with sample selection. The difference between parameter estimates and predicted probabilities from the analysis modelling the sample selection bias of don't knows and those from the analysis not modelling don't knows is emphasized. Two empirical examples using the 1989 General Social Survey data demonstrate the necessity to correct for the bias in the nonrandom selection of don't knows for binary and ordinal attitudinal response variables. A replication of the analyses using the 1990 and 1991 General Social Survey data helps demonstrate the reliability of the sample selection bias of don't knows.

11.
This paper considers factor estimation from heterogeneous data, where some of the variables (the relevant ones) are informative for estimating the factors, and others (the irrelevant ones) are not. We estimate the factor model within a Bayesian framework, specifying a sparse prior distribution for the factor loadings. Based on identified posterior factor loading estimates, we provide alternative methods to identify relevant and irrelevant variables. Simulations show that both types of variables are identified quite accurately. Empirical estimates for a large multi‐country GDP dataset and a disaggregated inflation dataset for the USA show that a considerable share of variables is irrelevant for factor estimation.

12.
We consider improved estimation strategies for the parameter matrix in multivariate multiple regression under a general and natural linear constraint. In the context of two competing models where one model includes all predictors and the other restricts variable coefficients to a candidate linear subspace based on prior information, there is a need to combine two estimation techniques in an optimal way. In this scenario, we suggest some shrinkage estimators for the targeted parameter matrix. Also, we examine the relative performances of the suggested estimators in the direction of the subspace and candidate subspace restricted type estimators. We develop a large sample theory for the estimators including derivation of asymptotic bias and asymptotic distributional risk of the suggested estimators. Furthermore, we conduct Monte Carlo simulation studies to appraise the relative performance of the suggested estimators with the classical estimators. The methods are also applied to a real data set for illustrative purposes.

13.
We develop a behavioral asset pricing model in which agents trade in a market with information friction. Profit‐maximizing agents switch between trading strategies in response to dynamic market conditions. Owing to noisy private information about the fundamental value, the agents form different evaluations about heterogeneous strategies. We exploit a thin set (a small sub‐population) to point identify this nonlinear model, and estimate the structural parameters using extended method of moments. Based on the estimated parameters, the model produces return time series that emulate the moments of the real data. These results are robust across different sample periods and estimation methods.

14.
There is a plethora of time series measures of uncertainty for inflation and real output growth in empirical studies, but little is known about whether they are comparable to the uncertainty measure reported by individual forecasters in the survey of professional forecasters. Are these two measures of uncertainty inherently distinct? This paper shows that, compared with many uncertainty proxies produced by time series models, the use of real‐time data with fixed‐sample recursive estimation of an asymmetric bivariate generalized autoregressive conditional heteroskedasticity model yields inflation uncertainty estimates which resemble the survey measure. There is, however, overwhelming evidence that many of the time series measures of growth uncertainty exceed the level of uncertainty obtained from the survey measure. Our results highlight the relative merits of using different methods in modelling macroeconomic uncertainty, which are useful for empirical researchers.

15.
Surveys usually include questions where individuals must select one in a series of possible options that can be sorted. On the other hand, multiple frame surveys are becoming a widely used method to decrease bias due to undercoverage of the target population. In this work, we propose statistical techniques for handling ordinal data coming from a multiple frame survey using complex sampling designs and auxiliary information. Our aim is to estimate proportions when the variable of interest has ordinal outcomes. Two estimators are constructed following model‐assisted generalised regression and model calibration techniques. Theoretical properties are investigated for these estimators. Simulation studies with different sampling procedures are considered to evaluate the performance of the proposed estimators in finite size samples. An application to a real survey on opinions towards immigration is also included.

16.
Forecasting economic and financial variables with global VARs   Total citations: 1, self-citations: 0, citations by others: 1
This paper considers the problem of forecasting economic and financial variables across a large number of countries in the global economy. To this end a global vector autoregressive (GVAR) model, previously estimated by Dees, di Mauro, Pesaran, and Smith (2007) and Dees, Holly, Pesaran, and Smith (2007) over the period 1979Q1–2003Q4, is used to generate out-of-sample forecasts one and four quarters ahead for real output, inflation, real equity prices, exchange rates and interest rates over the period 2004Q1–2005Q4. Forecasts are obtained for 134 variables from 26 regions, which are made up of 33 countries and cover about 90% of the world output. The forecasts are compared to typical benchmarks: univariate autoregressive and random walk models. Building on the forecast combination literature, the effects of model and estimation uncertainty on forecast outcomes are examined by pooling forecasts obtained from different GVAR models estimated over alternative sample periods. Given the size of the modelling problem, and the heterogeneity of the economies considered (industrialised, emerging, and less developed countries), as well as the very real likelihood of possibly multiple structural breaks, averaging forecasts across both models and windows makes a significant difference. Indeed, the double-averaged GVAR forecasts perform better than the benchmark competitors, especially for output, inflation and real equity prices.

17.
We discuss structural equation models for non-normal variables. In this situation the maximum likelihood and the generalized least-squares estimates of the model parameters can give incorrect estimates of the standard errors and the associated goodness-of-fit chi-squared statistics. If the sample size is not large, for instance smaller than about 1000, asymptotic distribution-free estimation methods are also not applicable. This paper assumes that the observed variables are transformed to normally distributed variables. The non-normally distributed variables are transformed with a Box–Cox function. Estimation of the model parameters and the transformation parameters is done by the maximum likelihood method. Furthermore, the test statistics (i.e. standard deviations) of these parameters are derived. This makes it possible to show the importance of the transformations. Finally, an empirical example is presented.
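The Box–Cox step mentioned in this abstract can be illustrated in isolation. The sketch below is a univariate simplification, not the paper's joint estimation inside a structural equation model: it applies the standard Box–Cox transform and picks the transformation parameter on a grid by maximizing the usual profile log-likelihood under normality (constants dropped). The names `box_cox`, `box_cox_loglik` and the grid are assumptions made for illustration.

```python
import math


def box_cox(x, lam):
    """Standard Box-Cox transform of a positive value."""
    if x <= 0:
        raise ValueError("Box-Cox requires positive data")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def box_cox_loglik(data, lam):
    """Profile log-likelihood of lam, assuming the transformed data are
    normal with unknown mean and variance (additive constants omitted)."""
    n = len(data)
    z = [box_cox(x, lam) for x in data]
    mean = sum(z) / n
    variance = sum((v - mean) ** 2 for v in z) / n
    jacobian = (lam - 1.0) * sum(math.log(x) for x in data)
    return -0.5 * n * math.log(variance) + jacobian


# Hypothetical right-skewed sample whose logarithms are evenly spaced.
data = [math.exp(v) for v in (-2.0, -1.0, 0.0, 1.0, 2.0)]
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
best_lam = max(grid, key=lambda lam: box_cox_loglik(data, lam))
```

For this sample the grid search selects the log transform (`lam = 0`), which is what one would expect for data that are symmetric on the log scale. In the setting of the abstract, the transformation parameters are estimated jointly with the model parameters rather than one variable at a time.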

18.
Multilevel structural equation modeling (multilevel SEM) has become an established method to analyze multilevel multivariate data. The first useful estimation method was the pseudobalanced method. This method is approximate because it assumes that all groups have the same size, and ignores unbalance when it exists. In addition, full information maximum likelihood (ML) estimation is now available, which is often combined with robust chi‐squares and standard errors to accommodate unmodeled heterogeneity (MLR). In addition, diagonally weighted least squares (DWLS) methods have become available as estimation methods. This article compares the pseudobalanced estimation method, ML(R), and two DWLS methods by simulating a multilevel factor model with unbalanced data. The simulations included different sample sizes at the individual and group levels and different intraclass correlation (ICC). The within‐group part of the model posed no problems. In the between part of the model, the different ICC sizes had no effect. There is a clear interaction effect between number of groups and estimation method. ML reaches unbiasedness fastest, then the two DWLS methods, then MLR, and then the pseudobalanced method (which needs more than 200 groups). We conclude that both ML(R) and DWLS are genuine improvements on the pseudobalanced approximation. With small sample sizes, the robust methods are not recommended.

19.
We suggest using a factor model based backdating procedure to construct historical Euro‐area macroeconomic time series data for the pre‐Euro period. We argue that this is a useful alternative to standard contemporaneous aggregation methods. The article investigates for a number of Euro‐area variables whether forecasts based on the factor‐backdated data are more precise than those obtained with standard area‐wide data. A recursive pseudo‐out‐of‐sample forecasting experiment using quarterly data is conducted. Our results suggest that some key variables (e.g. real GDP, inflation and long‐term interest rate) can indeed be forecast more precisely with the factor‐backdated data.

20.
We present a discussion of the different dimensions of the ongoing controversy about the analysis of ordinal variables. The source of this controversy is traced to the earliest possible stage, measurement theory. Three major approaches in analyzing ordinal variables, called the non-parametric, the parametric, and the underlying variable approach, are identified and the merits and drawbacks of each of these approaches are pointed out. We show that the controversy on the exact definition of an ordinal variable causes problems with regard to defining ordinal association, and therefore to the interpretation of many recently designed models for ordinal variables, e.g., structural equation models using polychoric correlations, latent class models and ordinal response models. We conclude that the discussion with regard to ordinal variable modeling can only be fruitful if one makes a distinction between different types of ordinal variables. Five types of ordinal variables are identified. The problems concerning the analysis of these five types of ordinal variables are solved in some cases and remain a problem for others.
