Re/insurers engage in a difficult style of trading: a capped upside (the premium) and a potentially unlimited downside (the loss) deliver asymmetric risk/returns. Machine learning creates optionality that helps rebalance that asymmetry.
Machine learning can speed individual underwriting and pricing decisions, and help manage portfolio cycles by identifying patterns, trends, and correlations in historical data sets. Done well, the output can proactively anticipate potential shifts in market exposures and pricing direction.
To do so effectively, machine learning requires quality data (train a model on bad data and it is garbage in, garbage out), systems that can talk to one another (to enable model building and reduce latency and transposition errors), and output that can be explained (to build user trust and regulator confidence).
Data quality is the most important success factor in all machine learning applications. Unfortunately, re/insurers have data quality issues. The industry has no data standards or normalized data schema. Data arrives in multiple formats, more or less complete, that can be hard to extract (e.g., PDF), and even harder to transform and standardize.
And yet paradoxically, the hardest data to clean (because it is more bespoke/heterogeneous) is potentially the most valuable in terms of contributing to your alpha. Traders have known this since forever – we spend an inordinate amount of time finding, aggregating, and curating datasets to beat the market and competitors. The data set becomes the alpha.
How do re/insurers change? I see three broad paths to drive better data:
A shared necessity of data science projects (not just machine learning) is the interoperability of legacy systems. Re/insurers change hands, buy other companies and exchange books of business. They may have separate systems for underwriting, policy issuance, claims, finance and accounting, possibly with prior fixes (more akin to “glueware” than middleware) in place. This problem isn’t unique to re/insurers – the United States’ own Internal Revenue Service (IRS) has been dealing with the ramifications of its own legacy systems for years, and the lack of interoperability has caused glitches affecting millions of users.
There is no quick and easy solution to legacy system issues. New MGAs and insurtechs have the advantage of starting with a clean, tabula rasa system - an advantage they often use to promote their products. For those with legacy systems, the answer lies in middleware solutions, data transformation strategies and incremental modernization. None are ideal; all are necessary.
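As a minimal sketch of one such data transformation strategy, the adapter below maps records from two hypothetical legacy systems into a single canonical policy schema. The system names, field names and schema are illustrative assumptions, not any re/insurer's actual systems.

```python
# Sketch: adapters that translate records from two hypothetical legacy systems
# into one canonical schema, so downstream models see a single, consistent format.
from dataclasses import dataclass
from datetime import date, datetime

@dataclass
class CanonicalPolicy:
    policy_id: str
    insured_name: str
    inception: date
    premium_usd: float

def from_underwriting_system(rec: dict) -> CanonicalPolicy:
    # Hypothetical underwriting system: dates stored as "MM/DD/YYYY" strings.
    return CanonicalPolicy(
        policy_id=rec["POL_NUM"].strip(),
        insured_name=rec["INSURED"].title(),
        inception=datetime.strptime(rec["EFF_DT"], "%m/%d/%Y").date(),
        premium_usd=float(rec["WRITTEN_PREM"]),
    )

def from_claims_system(rec: dict) -> CanonicalPolicy:
    # Hypothetical claims platform: ISO dates, premium stored in cents.
    return CanonicalPolicy(
        policy_id=rec["policy_ref"].strip(),
        insured_name=rec["insured_name"].title(),
        inception=date.fromisoformat(rec["effective_date"]),
        premium_usd=rec["premium_cents"] / 100.0,
    )

# Usage: each source system gets its own adapter; the model-building pipeline
# only ever consumes CanonicalPolicy records.
uw_record = {"POL_NUM": " A-1001 ", "INSURED": "ACME MARINE",
             "EFF_DT": "01/15/2024", "WRITTEN_PREM": "250000"}
cl_record = {"policy_ref": "A-1001", "insured_name": "acme marine",
             "effective_date": "2024-01-15", "premium_cents": 25000000}
print(from_underwriting_system(uw_record))
print(from_claims_system(cl_record))
```

Adapters of this kind are not glamorous, but they let modernization proceed incrementally: each legacy source can be migrated or retired without changing what the models downstream consume.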
As machine learning models become more prevalent, so too does the need to explain their output. A set of methods, known as Explainable AI or ‘XAI’, helps provide clear and interpretable explanations of model results.
Some model architectures are inherently more explainable than others. For example, decision trees can be explained by the binary decisions at each branch. Model-agnostic methods such as Shapley Values and Local Interpretable Model-Agnostic Explanations (LIME, phew!) can help to disambiguate many types of black box models. Shapley Values use elements of cooperative game theory to describe the importance of different features in a model (‘features’ are machine learning model inputs that may contain relevant information to help the model make predictions on unseen data), while LIME approximates a complex model around a single prediction with a simpler, more easily interpreted model, such as a linear regression.
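To make the game-theory intuition concrete, here is a minimal, self-contained sketch that computes exact Shapley values for a toy pricing function by enumerating feature coalitions. The model, features and numbers are invented for illustration; real applications typically rely on a library such as shap rather than brute-force enumeration.

```python
# Sketch: exact Shapley value attribution for a toy expected-loss model.
from itertools import combinations
from math import factorial

def predict(features):
    # Toy "model": expected loss cost driven by three binary risk features.
    base = 100.0
    return (base
            + 40.0 * features.get("coastal", 0)
            + 25.0 * features.get("old_roof", 0)
            + 10.0 * features.get("coastal", 0) * features.get("old_roof", 0)
            - 15.0 * features.get("sprinklers", 0))

def shapley_values(model_fn, instance, baseline):
    """Average marginal contribution of each feature over all coalitions,
    with absent features held at their baseline value."""
    names = list(instance)
    n = len(names)
    values = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for k in range(len(others) + 1):
            for subset in combinations(others, k):
                # Coalition with and without the feature of interest.
                with_f, without_f = dict(baseline), dict(baseline)
                for f in subset:
                    with_f[f] = instance[f]
                    without_f[f] = instance[f]
                with_f[name] = instance[name]
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (model_fn(with_f) - model_fn(without_f))
        values[name] = total
    return values

instance = {"coastal": 1, "old_roof": 1, "sprinklers": 1}
baseline = {"coastal": 0, "old_roof": 0, "sprinklers": 0}
attributions = shapley_values(predict, instance, baseline)
print(attributions)                   # per-feature contributions to this prediction
print(sum(attributions.values()))     # equals predict(instance) - predict(baseline)
```

The attributions sum exactly to the difference between the prediction and the baseline, which is what makes Shapley-style explanations easy to reconcile with a price or loss estimate. The enumeration is exponential in the number of features, which is why practical libraries use sampling or model-specific shortcuts.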
Some model architectures, such as neural nets, have a reputation for being more opaque, but methods to improve their explanations continue to be developed. Network Pruning removes unnecessary neurons from neural nets to make the model structure simpler and more explainable, while Activation Maximization iteratively adjusts an input to find the pattern that most strongly excites a specific neuron, which (1) shows which inputs drive strong activations in target neurons, and (2) helps explain why those neurons fire for a given prediction.
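As a sketch of Activation Maximization (assuming PyTorch, which is not specified here), gradient ascent on the input finds the pattern that most strongly excites one hidden unit of a small, untrained network standing in for a real underwriting model.

```python
# Sketch: activation maximization by gradient ascent on the model input.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative network; in practice this would be a trained model.
model = nn.Sequential(
    nn.Linear(8, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
)
model.eval()

target_neuron = 3                                  # hidden unit we want to interpret
x = (0.1 * torch.randn(1, 8)).requires_grad_()     # optimize the input, not the weights
optimizer = torch.optim.Adam([x], lr=0.1)

for step in range(200):
    optimizer.zero_grad()
    pre_act = model[0](x)                          # pre-ReLU activations of the hidden layer
    loss = -pre_act[0, target_neuron]              # maximize activation = minimize its negative
    loss = loss + 0.01 * (x ** 2).sum()            # small penalty keeps the input bounded
    loss.backward()
    optimizer.step()

print(x.detach())   # the input pattern that most strongly excites the chosen neuron
```

Reading the optimized input alongside the feature definitions gives a rough picture of what the neuron has learned to detect, which is the kind of evidence underwriters and regulators can actually interrogate.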
Re/insurers are slowly adopting machine learning. Progress is hampered by issues of data quality, systems interoperability, and model explainability. This creates an outsized opportunity for those willing to establish processes that fortify these three machine learning adoption pillars.
Otakar G. Hubschmann leads TransRe’s Applied Data team, which researches, develops and deploys artificial intelligence and machine learning applications and datasets to help underwriters, actuaries and claims teams expand their understanding of client information and improve client service. Contact Otakar with questions, comments, or to discuss ways the Applied Data team may help you.