Part 4: Simplifying the Underlying Technology Behind Unstructured Data Analysis

This is PART 4 of a four-part series.

For the most part, insurance companies have been sitting on mountains of unexplored data. They don’t know it exists, they don’t understand its value, and as a result they are missing insights that are right before their eyes. This area of data management is a blind spot that yields only a partial picture of claims management, leaving insurance companies vulnerable and resulting in increased payouts, longer claims lifecycles, and delayed attorney intervention.

Data today offers enormous potential for closing gaps and solving challenges. Across industries, almost 90% of this unstructured data is currently being studied to understand what can be derived from it. This revolution is happening at an unprecedented scale.

Given this impact, it is critical to understand some of the technology that unravels unstructured data, and its deeper implications for the insurance business.

In the insurance industry, the major impact of big data is being seen in claims departments. From text mining to semantics, data analytics is reshaping the entire claims process, especially claims triaging, process efficiency, attorney intervention, and fraud detection.

Unfortunately, reading and understanding this unstructured data is within the purview of only a few technology-enabled businesses that can read it, map its patterns, and analyze them. In claims, police notes, claimants’ statements, witness accounts, and adjuster notes need to be processed and cleaned before any insights can be derived from them, a process that requires tailored technology engines before the data can be put to actionable use.


Text mining is a relatively simple way of uncovering the hidden value in mountains of unstructured data. It extracts useful information from unstructured text by scanning large amounts of data for phrases or keywords, much as a Google search does. But unlike a search engine, an AI engine such as Charlee™ uses Natural Language Processing (NLP) to go beyond simply finding information: it analyzes the media for patterns and signatures that point to critical facts and relationships within that data.

AI-based engines therefore have the enhanced capability of scanning reports across media, interpreting adjuster notes, and discovering the sentiments of the involved parties, from their opinion of the product to their feelings after filing a claim, all just by reading through tons of material.


Among insurance processes, the claims department has the most to gain from text mining tools. Analyzing material such as claim-filing call transcripts picks up specific tones, keywords, and references, revealing insights that help better structure call center operations. Keywords that indicate a potentially complicated claim, or any move towards litigation, can prompt a closer look or get an experienced adjuster involved earlier in the process.
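To make the keyword idea concrete, here is a minimal sketch in Python of how a call transcript might be scanned for litigation-related terms. It is an illustration only, not how Charlee™ or any production engine works: the watch-list `LITIGATION_TERMS` and the function names `flag_litigation_terms` and `needs_senior_adjuster` are hypothetical, and a real list of terms would come from claims experts.

```python
import re

# Hypothetical watch-list for illustration; a real one would be curated
# by claims subject matter experts.
LITIGATION_TERMS = ("attorney", "lawyer", "lawsuit", "sue", "legal action")


def flag_litigation_terms(transcript: str) -> list[str]:
    """Return the watch-list terms found as whole words or phrases."""
    text = transcript.lower()
    hits = []
    for term in LITIGATION_TERMS:
        # \b keeps "sue" from matching inside words such as "issue".
        if re.search(r"\b" + re.escape(term) + r"\b", text):
            hits.append(term)
    return hits


def needs_senior_adjuster(transcript: str, threshold: int = 1) -> bool:
    """Suggest early escalation once enough distinct terms are found."""
    return len(flag_litigation_terms(transcript)) >= threshold
```

A transcript that mentions an attorney would be routed for a closer look, while one that only mentions an "issue" with a windshield would not. Plain keyword matching like this is the simplest form of text mining; it does not capture tone or context, which is where NLP-based engines go further.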

Text mining of such claims also sharpens the insurance company’s processes, helping it develop new products and plug gaps in existing coverage that may be hurting customer retention and renewals. It also surfaces insights on the demographics of claim sets, locations prone to certain risks, and even the claim areas where intervention is needed before escalation sets in.


Identifying the fraud potential of claim sets is perhaps the biggest benefit of text mining for insurance companies. Big data techniques can identify claimants likely to commit fraud at both the underwriting stage and the claims phase, surfacing red flags that claims adjusters miss when working manually. For example, an algorithm can flag multiple people using the same language when filing claims, a pattern that is easily missed when several people are handling those claims. The same physicians or witnesses appearing across various unrelated claims is another fraud red flag that such technology can pick up.
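One simple way to picture how an algorithm might spot multiple people using the same language is to compare claim statements by their overlapping word sequences. The sketch below uses Jaccard similarity over three-word "shingles", a common general-purpose technique for near-duplicate text; it is an assumption chosen for illustration, not a description of any particular vendor's method. The function names, the claim IDs, and the 0.5 threshold are all hypothetical.

```python
import re
from itertools import combinations


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Break a statement into overlapping runs of `size` lowercase words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: set, b: set) -> float:
    """Shared shingles divided by all distinct shingles (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_statement_pairs(statements: dict[str, str], threshold: float = 0.5):
    """Return (claim_a, claim_b, score) for pairs at or above the threshold."""
    sets = {claim_id: shingles(text) for claim_id, text in statements.items()}
    pairs = []
    for (id_a, set_a), (id_b, set_b) in combinations(sets.items(), 2):
        score = jaccard(set_a, set_b)
        if score >= threshold:
            pairs.append((id_a, id_b, score))
    return pairs


# Hypothetical statements from three unrelated claims.
claims = {
    "A": "I was stopped at the red light when the other car hit me from behind",
    "B": "I was stopped at the red light when the other car struck me from behind",
    "C": "A tree branch fell on my parked vehicle during the storm",
}
suspicious = similar_statement_pairs(claims)
```

Here claims A and B differ by a single word and would be paired for review, while claim C shares no phrasing with either. Comparing every pair works for a small batch; a large claims book would need a more scalable approach, and a high score is only a prompt for human review, not proof of fraud.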


Creating NLP-based algorithms that extract insights from data efficiently and present results in a meaningful, revealing way gives insurers an edge in streamlining and optimizing many of their business processes. This is a collaborative effort in which subject matter experts (SMEs) join hands with data analytics engines and tools to make sense of everything. In Charlee™, for example, subject expertise gives us the edge in extracting relevant data, creating customized KPI models, and presenting insurance insights that dive deep into the claims management process. It is for this reason that our predictive analytics engine predicts litigation 90-120 days in advance with approximately 85% accuracy, starting at first notice of loss (FNOL).


While data models and claims prediction engines are disrupting the way insurance industry processes function, it is deep subject matter expertise working with data algorithms that identifies meaningful trends in text mining. Insurers and claims adjusters gain a critical advantage: they can triage and track complicated claims, make sense of unstructured data to stay one step ahead of claimants, and continue refining processes to derive more purposeful value for the industry as a whole.


Written by: Charmaine Kenita and John Standish
