Interest in data mining techniques has recently been increasing among
actuaries and statisticians involved in the analysis of insurance data sets,
which typically have large numbers of both cases and variables. This paper
discusses the main reasons why data mining techniques are becoming increasingly
attractive in insurance. A case study is presented showing the application of
data mining to a business problem that required modeling risk in health insurance,
based on a project recently performed for a large Australian health insurance
company by PricewaterhouseCoopers (Sydney). The data mining methods discussed
in the case study include Classification and Regression Trees (CART), Multivariate
Adaptive Regression Splines (MARS), and hybrid models that combine CART tree
models with MARS and logistic regression. Implementation issues that are not
commercially sensitive are also discussed.