Nonlife Insurance Risk Classification Using Categorical Embedding

Abstract

This article presents several actuarial applications of categorical embedding in the context of nonlife insurance risk classification. In nonlife insurance, many rating factors are naturally categorical, and these categorical variables often have a large number of levels. This high cardinality presents challenges for implementing traditional actuarial methods. Categorical embedding, proposed in the machine learning literature for handling categorical variables, has recently received attention in actuarial studies. The method is inspired by neural network language models for text data and maps a categorical variable into a real-valued representation in Euclidean space. Using a property insurance claims data set, we demonstrate the use of categorical embedding in three applications. The first shows how embeddings are used to construct rating classes and calculate rating relativities for a single insurance risk. The second concerns predictive modeling for multivariate insurance risks and emphasizes the effects of dependence on tail risks. The third focuses on pricing new products, where transfer learning is used to gather knowledge from existing products.

Volume
Spring
Year
2023
Keywords
actuarial applications, categorical embedding, nonlife insurance, risk classification, transfer learning
Publications
Casualty Actuarial Society E-Forum
Authors
Shi, Peng
Shi, Kun
Formerly on syllabus
Off