Method: We used PROC VARCLUS (SAS/STAT) to find clusters of variables defined at a geographical level and attached to a database of automobile policies. The procedure finds clusters of variables that are correlated with one another and, as far as possible, uncorrelated with variables in other clusters. Using business knowledge and the 1-R² ratio, a representative can be selected from each cluster, thus reducing the number of variables. The cluster representatives are then used as inputs to the predictive model.
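The 1-R² ratio mentioned above is (1 - R² with own cluster) / (1 - R² with the nearest other cluster); lower values suggest a better cluster representative. The following is a minimal Python sketch of that calculation, not the PROC VARCLUS implementation. It assumes the cluster assignment is already known, takes the first principal component of each cluster's standardized variables as the cluster component, and runs on synthetic data. The function name one_minus_r2_ratio and all variable names are hypothetical.

```python
import numpy as np


def one_minus_r2_ratio(X, labels):
    """Return (1 - R2_own) / (1 - R2_nearest) for each column of X.

    Assumes at least two clusters and that `labels` gives the cluster
    of each column. The cluster component is taken to be the first
    principal component of the cluster's standardized variables
    (an assumption of this sketch).
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Standardize so the components are based on correlations.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    clusters = np.unique(labels)

    # One component score vector per cluster.
    scores = {}
    for c in clusters:
        Zc = Z[:, labels == c]
        _, _, vt = np.linalg.svd(Zc, full_matrices=False)
        scores[c] = Zc @ vt[0]

    ratios = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        # Squared correlation of variable j with every cluster component.
        r2 = {c: np.corrcoef(Z[:, j], scores[c])[0, 1] ** 2 for c in clusters}
        own = r2[labels[j]]
        nearest = max(v for c, v in r2.items() if c != labels[j])
        ratios[j] = (1.0 - own) / (1.0 - nearest)
    return ratios


# Synthetic illustration: two independent latent factors, three variables each.
# The third variable is deliberately noisier than the rest of its cluster.
rng = np.random.default_rng(0)
n = 500
f1, f2 = rng.normal(size=(2, n))
X = np.column_stack([
    f1 + 0.3 * rng.normal(size=n),
    f1 + 0.3 * rng.normal(size=n),
    f1 + 1.0 * rng.normal(size=n),
    f2 + 0.3 * rng.normal(size=n),
    f2 + 0.3 * rng.normal(size=n),
    f2 + 0.3 * rng.normal(size=n),
])
labels = np.array([1, 1, 1, 2, 2, 2])
ratios = one_minus_r2_ratio(X, labels)
print(np.round(ratios, 3))
```

In this sketch the noisier variable gets the highest ratio in its cluster, so it would be the weakest candidate for cluster representative on this criterion alone; business knowledge can still override the choice.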
Conclusions: The variable clustering procedure used in the paper quickly reduces a set of numeric variables to a manageable set of variable clusters.
Availability: PROC VARCLUS from SAS/STAT was used for this study. An implementation of variable clustering is also available in R (function varclus), but we did not experiment with it.
Keywords: variable reduction, clustering, statistical method, data mining, predictive modeling.