Re/insurers engage in a difficult style of trading: a capped upside (the premium) and a potentially unlimited downside (the loss) deliver asymmetric risk/returns. Machine learning creates optionality that helps rebalance that asymmetry.
Machine learning can speed individual underwriting and pricing decisions, and help manage portfolio cycles by identifying patterns, trends, and correlations in historical data sets. Done well, the output can proactively anticipate potential shifts in market exposures and pricing direction.
To do so effectively, machine learning requires quality data (train a model on bad data and it is garbage in, garbage out), systems that can talk to one another (to enable model building and reduce latency and transposition errors), and output that can be explained (to build user trust and regulator confidence).
Data quality is the most important success factor in all machine learning applications. Unfortunately, re/insurers have data quality issues. The industry has no data standards or normalized data schema. Data arrives in multiple formats, more or less complete, that can be hard to extract (e.g., PDF), and even harder to transform and standardize.
And yet paradoxically, the hardest data to clean (because it is more bespoke/heterogeneous) is potentially the most valuable in terms of contributing to your alpha. Traders have known this since forever – we spend an inordinate amount of time finding, aggregating, and curating datasets to beat the market and competitors. The data set becomes the alpha.
How do re/insurers change? I see three broad paths to drive better data:
A shared necessity of data science projects (not just machine learning) is the interoperability of legacy systems. Re/insurers change hands, buy other companies and exchange books of business. They may have separate systems for underwriting, policy issuance, claims, finance and accounting, possibly with prior fixes (more akin to “glueware” than middleware) in place. This problem isn’t unique to re/insurers – the United States’ own Internal Revenue Service (IRS) has been dealing with the ramifications of its own legacy systems for years, and the lack of interoperability has caused glitches affecting millions of users.
There is no quick and easy solution to legacy system issues. New MGAs and insurtechs have the advantage of starting with a clean, tabula rasa system - an advantage they often use to promote their products. For those with legacy systems, the answer lies in middleware solutions, data transformation strategies and incremental modernization. None are ideal; all are necessary.
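As a minimal sketch of one such data transformation strategy, the adapter below maps records from two hypothetical legacy systems into a single canonical policy schema. The system names, field names and schema are illustrative assumptions, not any re/insurer's actual systems.

```python
# Sketch: adapters that translate records from two hypothetical legacy systems
# into one canonical schema, so downstream models see a single, consistent format.
from dataclasses import dataclass
from datetime import date, datetime

@dataclass
class CanonicalPolicy:
    policy_id: str
    insured_name: str
    inception: date
    premium_usd: float

def from_underwriting_system(rec: dict) -> CanonicalPolicy:
    # Hypothetical underwriting system: dates stored as "MM/DD/YYYY" strings.
    return CanonicalPolicy(
        policy_id=rec["POL_NUM"].strip(),
        insured_name=rec["INSURED"].title(),
        inception=datetime.strptime(rec["EFF_DT"], "%m/%d/%Y").date(),
        premium_usd=float(rec["WRITTEN_PREM"]),
    )

def from_claims_system(rec: dict) -> CanonicalPolicy:
    # Hypothetical claims platform: ISO dates, premium stored in cents.
    return CanonicalPolicy(
        policy_id=rec["policy_ref"].strip(),
        insured_name=rec["insured_name"].title(),
        inception=date.fromisoformat(rec["effective_date"]),
        premium_usd=rec["premium_cents"] / 100.0,
    )

# Usage: each source system gets its own adapter; the model-building pipeline
# only ever consumes CanonicalPolicy records.
uw_record = {"POL_NUM": " A-1001 ", "INSURED": "ACME MARINE",
             "EFF_DT": "01/15/2024", "WRITTEN_PREM": "250000"}
cl_record = {"policy_ref": "A-1001", "insured_name": "acme marine",
             "effective_date": "2024-01-15", "premium_cents": 25000000}
print(from_underwriting_system(uw_record))
print(from_claims_system(cl_record))
```

Adapters of this kind are not glamorous, but they let modernization proceed incrementally: each legacy source can be migrated or retired without changing what the models downstream consume.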
As machine learning models become more prevalent, so too does the need to explain their output. A set of methods, known as Explainable AI or ‘XAI’, helps provide clear and interpretable explanations of model results.
Some model architectures are inherently more explainable than others. For example, decision trees can be explained by the binary decisions at each branch. Model-agnostic methods such as Shapley Values and Local Interpretable Model-Agnostic Explanations (LIME, phew!) can help to disambiguate many types of black box models. Shapley Values use elements of cooperative game theory to describe the importance of different features in a model (‘features’ are machine learning model inputs that may contain relevant information to help the model make predictions on unseen data), while LIME approximates a complex model around a single prediction with a simpler, more easily interpreted model, such as a linear regression.
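To make the game-theory intuition concrete, here is a minimal, self-contained sketch that computes exact Shapley values for a toy pricing function by enumerating feature coalitions. The model, features and numbers are invented for illustration; real applications typically rely on a library such as shap rather than brute-force enumeration.

```python
# Sketch: exact Shapley value attribution for a toy expected-loss model.
from itertools import combinations
from math import factorial

def predict(features):
    # Toy "model": expected loss cost driven by three binary risk features.
    base = 100.0
    return (base
            + 40.0 * features.get("coastal", 0)
            + 25.0 * features.get("old_roof", 0)
            + 10.0 * features.get("coastal", 0) * features.get("old_roof", 0)
            - 15.0 * features.get("sprinklers", 0))

def shapley_values(model_fn, instance, baseline):
    """Average marginal contribution of each feature over all coalitions,
    with absent features held at their baseline value."""
    names = list(instance)
    n = len(names)
    values = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for k in range(len(others) + 1):
            for subset in combinations(others, k):
                # Coalition with and without the feature of interest.
                with_f, without_f = dict(baseline), dict(baseline)
                for f in subset:
                    with_f[f] = instance[f]
                    without_f[f] = instance[f]
                with_f[name] = instance[name]
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (model_fn(with_f) - model_fn(without_f))
        values[name] = total
    return values

instance = {"coastal": 1, "old_roof": 1, "sprinklers": 1}
baseline = {"coastal": 0, "old_roof": 0, "sprinklers": 0}
attributions = shapley_values(predict, instance, baseline)
print(attributions)                   # per-feature contributions to this prediction
print(sum(attributions.values()))     # equals predict(instance) - predict(baseline)
```

The attributions sum exactly to the difference between the prediction and the baseline, which is what makes Shapley-style explanations easy to reconcile with a price or loss estimate. The enumeration is exponential in the number of features, which is why practical libraries use sampling or model-specific shortcuts.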
Some model architectures, such as neural nets, have a reputation for being more opaque, but methods to improve their explanations continue to be developed. Network Pruning removes unnecessary neurons from neural nets to make the model structure simpler and more explainable, while Activation Maximization iteratively adjusts an input to find the pattern that most strongly excites a specific neuron, which (1) shows which inputs drive strong activations in target neurons, and (2) helps explain why those neurons fire for a given prediction.
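As a sketch of Activation Maximization (assuming PyTorch, which is not specified here), gradient ascent on the input finds the pattern that most strongly excites one hidden unit of a small, untrained network standing in for a real underwriting model.

```python
# Sketch: activation maximization by gradient ascent on the model input.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative network; in practice this would be a trained model.
model = nn.Sequential(
    nn.Linear(8, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
)
model.eval()

target_neuron = 3                                  # hidden unit we want to interpret
x = (0.1 * torch.randn(1, 8)).requires_grad_()     # optimize the input, not the weights
optimizer = torch.optim.Adam([x], lr=0.1)

for step in range(200):
    optimizer.zero_grad()
    pre_act = model[0](x)                          # pre-ReLU activations of the hidden layer
    loss = -pre_act[0, target_neuron]              # maximize activation = minimize its negative
    loss = loss + 0.01 * (x ** 2).sum()            # small penalty keeps the input bounded
    loss.backward()
    optimizer.step()

print(x.detach())   # the input pattern that most strongly excites the chosen neuron
```

Reading the optimized input alongside the feature definitions gives a rough picture of what the neuron has learned to detect, which is the kind of evidence underwriters and regulators can actually interrogate.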
Re/insurers are slowly adopting machine learning. Progress is hampered by issues of data quality, systems interoperability, and model explainability. This creates an outsized opportunity for those willing to establish processes that fortify these three machine learning adoption pillars.
Otakar G. Hubschmann leads TransRe’s Applied Data team, which researches, develops and deploys artificial intelligence and machine learning applications and datasets to help underwriters, actuaries and claims teams expand their understanding of client information and improve client service. Contact Otakar with questions, comments, or to discuss ways the Applied Data team may help you.