Actuaries and Data Scientists: How Can We Work Together and What Can We Bring to the Table?
Data science: A new toolset for the actuarial professional
Data science is increasingly pervading different parts of our lives, from recommender algorithms suggesting what we might like to watch on Netflix to ChatGPT passing MBA exams. As many of us know or have seen firsthand, data science has also changed our work as actuaries. Some of us might even be concerned that we’ll be replaced by data scientists! As an actuary who has picked up some data science skills and frequently works with data scientists, I am here to tell you that data scientists and actuaries have different skill sets and aren’t meant to be interchangeable. These are two professions that can and should work in harmony with one another. Just as actuaries have gone from doing calculations on paper to using Excel, I believe that more actuaries will come to use R, Python or other programming languages in their day-to-day work. Data science is not a threat to our profession. Rather, it’s a chance for us to learn new skills, evolve and become even more analytical.
Actuaries and data scientists: complementary skill sets
Actuaries and data scientists have different skill sets and knowledge bases. There is some overlap, but a few fundamental differences distinguish the two. Actuaries are highly analytical insurance professionals: we understand the insurance industry, the business problems that our companies face, the insurance data we work with and specific actuarial techniques. We study concepts like the principles of ratemaking and understand that creating a highly predictive pricing model (for example) isn’t our only goal. The model must also appropriately account for losses, not be unfairly discriminatory and be explainable to business partners and regulators.
Data scientists tend to be very technical. This isn’t to say that they can’t also understand business problems or learn more about the insurance industry; their work simply has a different focus. For example, a data scientist may be better positioned than an actuary to create a dashboard in R Shiny that shows loss ratios across different regions. While actuaries might focus on learning about emerging trends in the industry, a data scientist might be learning about the newest Python package that allows for visualizations or different types of modeling.
The descriptions above generalize “actuarial work” and “data science.” There is a lot of room for variation in any actuary’s or data scientist’s job, but I believe these descriptions capture the general focus of each career and the primary areas where they differ. Now that we know the differences between the two, let’s talk about how they overlap and can work together!
Let’s work together!
Actuaries and data scientists both have valuable skill sets, and it’s important to bring the two together. Let’s look at the example of creating a pricing model. (If you want to learn more about this, take a look at CAS Monograph 5, Generalized Linear Models for Insurance Rating.) We might follow a process like this:
1) Preparing, cleansing and exploring the data
Before we can start modeling, we need to have the data to create our model. This will probably involve joining data from several databases (such as joining policy and claims data). An actuary might explain to a data scientist colleague that a claim can occur several years after the policy was written, so applying the same date filter to both the policy and the claims data can silently drop valid claims. We also need to assess different fields for reasonableness, such as premium and loss values. A negative loss value might not make sense to someone not familiar with the insurance industry, but an actuary can provide guidance to a data scientist and let them know that this might be a case of subrogation.
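To make this concrete, here is a minimal Python sketch of the join described above. It is an illustration under stated assumptions, not a production data pipeline: the records are made-up toy data, and every field name (policy_id, effective, paid_date, paid_loss) and the helper attach_claims are hypothetical. The idea is to select policies by effective date, keep all of their claims regardless of when the claim was paid, and flag negative amounts for review rather than dropping them.

```python
from datetime import date

# Toy records for illustration only; every field name here is hypothetical.
policies = [
    {"policy_id": "P1", "effective": date(2019, 1, 1), "earned_premium": 1000.0},
    {"policy_id": "P2", "effective": date(2020, 6, 1), "earned_premium": 1200.0},
]
claims = [
    {"claim_id": "C1", "policy_id": "P1", "paid_date": date(2022, 3, 15), "paid_loss": 500.0},
    {"claim_id": "C2", "policy_id": "P1", "paid_date": date(2023, 1, 10), "paid_loss": -200.0},
    {"claim_id": "C3", "policy_id": "P9", "paid_date": date(2021, 5, 5), "paid_loss": 300.0},
]


def attach_claims(policies, claims, start, end):
    """Pick policies by effective date, then keep ALL of their claims.

    Claims are deliberately not filtered by their own date, because a
    claim can be paid years after the policy was written.
    """
    known_ids = {p["policy_id"] for p in policies}
    in_scope = {p["policy_id"] for p in policies if start <= p["effective"] <= end}
    matched, orphans = [], []
    for c in claims:
        if c["policy_id"] not in known_ids:
            # No matching policy at all: a data quality question to raise.
            orphans.append(c)
        elif c["policy_id"] in in_scope:
            # Negative amounts are flagged for review, not dropped:
            # they may be legitimate recoveries such as subrogation.
            matched.append({**c, "needs_review": c["paid_loss"] < 0})
    return matched, orphans


matched, orphans = attach_claims(policies, claims, date(2019, 1, 1), date(2019, 12, 31))
for row in matched:
    print(row["claim_id"], row["paid_loss"], row["needs_review"])
print("claims with no policy:", [c["claim_id"] for c in orphans])
```

In this toy data, both claims on the 2019 policy are kept even though they were paid in 2022 and 2023, and the negative payment is flagged for a conversation with an actuary instead of being deleted.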
After cleansing the data, we’ll want to explore it through charts and graphs. Our data scientist friends can most likely create a variety of graphs for us to assess together. For example, are the loss ratios consistent across years? If not, maybe there has been a change in the business, and we want to exclude certain years from our analysis.
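Before anyone draws a chart, the year-by-year consistency check can be sketched in a few lines of Python. This is only a rough illustration: the numbers are invented, the field names (year, earned_premium, incurred_loss) are hypothetical, and the rule of flagging any year more than 0.15 away from the median loss ratio is an arbitrary assumption, not an actuarial standard.

```python
from collections import defaultdict
from statistics import median

# Invented numbers for illustration; field names are hypothetical.
rows = [
    {"year": 2019, "earned_premium": 1000.0, "incurred_loss": 600.0},
    {"year": 2020, "earned_premium": 1000.0, "incurred_loss": 620.0},
    {"year": 2021, "earned_premium": 1000.0, "incurred_loss": 950.0},
    {"year": 2022, "earned_premium": 1000.0, "incurred_loss": 610.0},
]


def loss_ratio_by_year(rows):
    """Sum premium and loss by year, then divide loss by premium."""
    premium = defaultdict(float)
    loss = defaultdict(float)
    for r in rows:
        premium[r["year"]] += r["earned_premium"]
        loss[r["year"]] += r["incurred_loss"]
    # Skip years with no premium to avoid dividing by zero.
    return {y: loss[y] / premium[y] for y in sorted(premium) if premium[y] > 0}


def years_to_investigate(ratios, tolerance=0.15):
    """Return years whose loss ratio is far from the median (assumed rule)."""
    mid = median(ratios.values())
    return [y for y, lr in ratios.items() if abs(lr - mid) > tolerance]


ratios = loss_ratio_by_year(rows)
flagged = years_to_investigate(ratios)
for year, lr in ratios.items():
    print(year, round(lr, 3))
print("years to investigate:", flagged)
```

A flagged year is a prompt for discussion, not an automatic exclusion. Whether it reflects a change in the business or just a bad year is exactly the kind of judgment an actuary can bring.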
2) Feature engineering, model creation and model validation
Once we’ve finished cleansing the data, we can start modeling. We can think of useful variables that we might want to include, and a data scientist can calculate them. Actuaries and data scientists should work closely together during this process, which is highly iterative. Actuaries can also provide guidance to data scientists about whether a variable should be included at all. For example, a variable might be highly predictive but not considered ethical to include. To validate the model, a data scientist can create different diagnostics, like a Gini index, and review them with an actuary to evaluate whether the model is “good” or needs to return to the model creation phase for further improvements.
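As one example of such a diagnostic, here is a hedged Python sketch of a Gini index built from a Lorenz curve: records are sorted from lowest to highest predicted loss cost, and we measure how far the cumulative share of actual losses falls below the cumulative share of exposure. The function name gini_index and its arguments are hypothetical, and this is one common formulation rather than the only one. It assumes predicted holds a loss cost per unit of exposure, actual holds total losses per record, and every record has one unit of exposure unless exposures are supplied.

```python
def gini_index(actual, predicted, exposure=None):
    """Gini index from a Lorenz curve ordered by predicted loss cost.

    Returns 1 - 2 * (area under the Lorenz curve), using the trapezoid
    rule. A model that cannot rank risks scores near 0; a higher value
    means losses are more concentrated in the risks ranked as worst.
    """
    n = len(actual)
    if exposure is None:
        exposure = [1.0] * n  # assumption: one unit of exposure per record
    total_exposure = sum(exposure)
    total_loss = sum(actual)
    if total_exposure <= 0 or total_loss <= 0:
        raise ValueError("total exposure and total loss must be positive")

    # Sort record positions from best predicted risk to worst.
    order = sorted(range(n), key=lambda i: predicted[i])

    cum_exposure = cum_loss = 0.0
    area = 0.0
    for i in order:
        prev_exposure, prev_loss = cum_exposure, cum_loss
        cum_exposure += exposure[i] / total_exposure
        cum_loss += actual[i] / total_loss
        # Trapezoid between consecutive points on the Lorenz curve.
        area += (cum_exposure - prev_exposure) * (cum_loss + prev_loss) / 2
    return 1 - 2 * area


# Toy check: all losses fall on the risk the model ranked as worst.
print(gini_index(actual=[0, 0, 0, 100], predicted=[1, 2, 3, 4]))
```

A single number like this is easier to interpret next to the Lorenz curve itself and alongside other diagnostics, which is where reviewing the results together pays off.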
3) Discussions with stakeholders and filing
Finally, it might be necessary to file the model with regulators or discuss the results with underwriters. An actuary can help guide these conversations, since actuaries share an industry background with these stakeholders and may be more familiar with what each one is looking for. An underwriter might be looking for explainability and logic (for example, it’s easy to explain and understand why younger drivers tend to receive higher premiums), while a regulator might want to know if a variable is fair to the consumer. Actuaries might be able to frame these conversations in an analytical way for their data scientist colleagues and work with them to create the analysis needed to answer these questions.
The future of actuarial science
Hopefully this article mirrors what you see in your own work or has given you a new perspective on working with data scientists. There are many opportunities for us to work together and produce new and interesting analyses that we might not be able to do on our own. Data science is not a threat to actuarial science, but rather an enhancement.