Testing Alternative Regression Frameworks for Predictive Modeling of Health Care Costs |
| |
Authors: | I. Duncan, M. Loginov, M. Ludkovski |
| |
Affiliation: | Department of Statistics & Applied Probability, University of California, Santa Barbara, California |
| |
Abstract: | Predictive models of health care costs have become mainstream in much health care actuarial work. The Affordable Care Act requires the use of predictive modeling-based risk-adjuster models to transfer revenue between different health exchange participants. Although the predictive accuracy of these models has been investigated in a number of studies, the accuracy and use of models for applications other than risk adjustment have not been the subject of much investigation. We investigate predictive modeling of future health care costs using several statistical techniques. Our analysis is based on a dataset of 30,000 insureds containing claims information from two contiguous years. The dataset contains more than 100 covariates for each insured, including a detailed breakdown of past costs and causes encoded via coexisting condition flags. We discuss statistical models that relate next-year costs to medical and cost information in order to predict the mean and quantiles of future cost, rank risks, and identify the most predictive covariates. A comparison of multiple models is presented, including (in addition to the traditional linear regression model underlying risk adjusters) Lasso GLM, multivariate adaptive regression splines, random forests, decision trees, and boosted trees. A detailed performance analysis shows that the traditional regression approach does not perform well and that more accurate models are possible. |
| |
Keywords: | |
|
|