CASE STUDY: MODELING RISK IN HEALTH INSURANCE - A DATA MINING APPROACH

Inna Kolyshkina and Richard Brookes
PricewaterhouseCoopers Actuarial, Sydney

Abstract

Interest in data mining techniques has recently been increasing among actuaries and statisticians who analyse insurance data sets, which typically have a large number of both cases and variables. This paper discusses the main reasons why data mining techniques are becoming more attractive in insurance. It then presents a case study showing the application of data mining to a business problem that required modeling risk in health insurance, based on a project recently performed by PricewaterhouseCoopers (Sydney) for a large Australian health insurance company. The data mining methods discussed in the case study include Classification and Regression Trees (CART), Multivariate Adaptive Regression Splines (MARS), and hybrid models that combine CART tree models with MARS and logistic regression. Implementation issues that are not commercially sensitive are also discussed.