This is PART 3 of a four-part series.

Unstructured data, as we’ve seen, makes up the bulk of the data that sits unused across businesses and industries. Emails, text messages, legal reports, voice recordings: the list is enormous and largely untapped. Yet this data can be the oil that keeps different parts of a business running, and a source of valuable, undiscovered insights.

In the insurance industry, unstructured and structured data influence everything, from rating and pricing to claims cycles, litigation management, severity management, and underwriting. Given the sheer volume of available data, the insurance industry has devised a six-factor framework to determine the quality and value of the data it analyzes. Complemented with NLP-based technologies, this framework gives insurers an end-to-end way to convert raw data into actionable insights.


1. VOLUME – The term ‘big data’, covering both structured and unstructured data, can be vague and hard to measure, since insurers differ in their storage and analytics capabilities: what seems huge to a small local insurer may be a drop in the ocean for a multinational corporation. Insurers gauge the volume of their big data when reviewing their storage systems (on-premise or cloud) and the kinds of data they use. Volume affects many things, such as a carrier’s processing power and security.

2. VELOCITY – Large amounts of data must be generated, collected, and processed quickly; this is where speed is critical. Forbes estimates that 2.5 quintillion bytes of data are produced every day, and insurers must have systems that can aggregate such volumes so their data analytics perform at their best.

3. VARIETY – More than the quantity of big data, the type of information insurers can derive insights from is critical. Structured data like texts and numbers are easy to analyze, but NLP-based AI engines let predictive analytics unlock deeper insights and patterns from unstructured sources. As a result, insurers can use unconventional information to design policies, detect fraud, and enhance the overall customer experience. With continuous access to unstructured data, insurers get a complete picture of every customer: past behaviors, claims cycles, the behavior of individual claims, and much more.
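To make the variety point concrete, here is a deliberately simple sketch of pulling structured signals out of a free-text claim note. Real NLP engines use far richer models than keyword and pattern matching; the field names and patterns below are hypothetical, chosen only to illustrate the idea.

```python
import re

def extract_claim_signals(note: str) -> dict:
    """Toy extractor: turn an unstructured claim note into structured signals.
    Illustrative only; field names and patterns are invented for this sketch."""
    signals = {}
    # Dollar amounts mentioned anywhere in the note
    amounts = re.findall(r"\$([\d,]+(?:\.\d{2})?)", note)
    signals["amounts"] = [float(a.replace(",", "")) for a in amounts]
    # Simple keyword flags that might feed downstream analytics
    signals["mentions_attorney"] = bool(re.search(r"\battorney\b|\blawyer\b", note, re.I))
    signals["mentions_injury"] = bool(re.search(r"\binjur", note, re.I))
    return signals

note = "Claimant retained an attorney; repair estimate $4,250.00, injury reported."
print(extract_claim_signals(note))
# {'amounts': [4250.0], 'mentions_attorney': True, 'mentions_injury': True}
```

Signals like these, once extracted, can sit alongside structured policy and claims data in the same analytics pipeline, which is what makes variety valuable rather than merely voluminous.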

4. VERACITY – “Data doesn’t lie,” goes a famous saying, and big data is a trove of information that instills confidence in decision-making. However, even though vast amounts of data are available, some of it may be unreliable and must be vetted for authenticity. Making accurate decisions from large data sets depends on the strength of the predictive analytics platform, which is critical for insurers.

5. VALIDITY – The data out there may be massive, but its relevance to the desired outcomes is just as important. When collecting data for insurance, it is critical to match data sets to the analytics results they are meant to support; establishing this discipline improves the accuracy of outcomes from both structured and unstructured data.

6. VALUE – Perhaps the most critical factor of big data in insurance is value, which comes down to two questions: is the data collected good or bad, and can insurers make well-informed, business-driven decisions from the insights it yields? Data, especially unstructured data, is generated in large amounts, and its quality can get watered down. A platform that can semantically pick out relevant, reliable, and consistent data helps insurers keep making data-driven decisions rather than guesswork estimations. Insurers must also know what to do with such data, where to use it, and how to make pricing and risk-selection decisions accordingly.
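One way to read the six factors above is as a data-quality gate that a dataset must pass before it feeds analytics. The sketch below makes that reading concrete; every threshold and field name is invented for illustration, and a real insurer would calibrate them against its own storage and analytics capacity.

```python
def six_v_score(dataset: dict) -> dict:
    """Check a dataset description against a toy version of the 6-V framework.
    Thresholds are illustrative assumptions, not industry standards."""
    checks = {
        "volume":   dataset["row_count"] >= 10_000,        # enough records to model
        "velocity": dataset["hours_since_update"] <= 24,   # refreshed at least daily
        "variety":  len(dataset["source_types"]) >= 2,     # structured + unstructured
        "veracity": dataset["verified_fraction"] >= 0.9,   # most records vetted
        "validity": dataset["matches_target_schema"],      # mapped to the desired outcome
        "value":    dataset["linked_to_decision"],         # feeds an actual business decision
    }
    checks["passes"] = all(checks.values())
    return checks

example = {
    "row_count": 250_000,
    "hours_since_update": 6,
    "source_types": ["claims_db", "adjuster_notes"],
    "verified_fraction": 0.95,
    "matches_target_schema": True,
    "linked_to_decision": True,
}
print(six_v_score(example)["passes"])  # True: all six checks pass
```

The point of a gate like this is not the specific numbers but the habit: each V becomes an explicit, testable question asked of every dataset before it is trusted.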


Insurers can maximize the data they collect with the 6-V framework. An AI-based predictive analytics engine like Charlee helps P&C insurers turn that data into actionable, in-depth insights across operational, core-business, and marketing areas. Predictive analytics can help insurers make data-driven decisions regarding:

  • Litigation Management – Starting at the First Notice of Loss (FNOL), Charlee predicts litigation up to 90–120 days in advance with approximately 75% accuracy, and also predicts attorney involvement at FNOL
  • Severity Management – Behavioral patterns based on 160+ insurance fraud schemes have been modeled within Charlee to predict severity ranges starting at FNOL, helping insurance carriers stay compliant
  • Reserve Management – Understanding the deep insights behind historic risks, not just the numbers, helps actuaries set better statistical reserves, which can be adjusted as Charlee detects emerging risks within claims
  • Potential Fraud – Early detection and intervention based on established patterns and claim-level deep insights help insurers prevent fraud and stay compliant
  • Claims Portfolio Management – Customized KPIs identify historic patterns and trends in high-cost claims caused by operational inefficiencies, regulatory inconsistencies, and non-compliance, helping claims managers deploy resources efficiently, devise action plans, and provide training
  • New Trends – Charlee’s ability to detect claim characteristics spots emerging risks, helping claims management and operations teams mitigate them and reduce cycle time and expenses
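To give a feel for what FNOL-stage litigation scoring means in practice, here is a toy illustration and nothing more: it is emphatically not Charlee’s actual model. It weights a few hypothetical risk indicators found in the first notice of loss and flags claims above an arbitrary cutoff for early intervention.

```python
# Invented keyword weights for illustration only; a production model would
# be learned from historical claims data, not hand-written.
LITIGATION_WEIGHTS = {
    "attorney": 0.40,
    "disputed": 0.25,
    "injury": 0.20,
    "uninsured": 0.15,
}

def litigation_risk(fnol_text: str, cutoff: float = 0.5):
    """Score an FNOL note by summing weights of matched risk keywords.
    Returns (score, flagged) where flagged means 'route for early review'."""
    text = fnol_text.lower()
    score = sum(w for kw, w in LITIGATION_WEIGHTS.items() if kw in text)
    return round(score, 2), score >= cutoff

score, flagged = litigation_risk("Disputed liability; claimant injury, attorney retained.")
print(score, flagged)  # 0.85 True
```

Even a crude gate like this shows why acting at FNOL matters: the earlier a high-risk claim is flagged, the more room adjusters have to intervene before attorney involvement hardens into litigation.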

Built around the 6-V framework, the predictive analytics engine streamlines data analysis so that only high-quality insights and predictions reach insurers, helping them reduce claims cycle time and expenses while managing reserves efficiently, so that insureds stay loyal and out of litigation.

Written by Charmaine Kenita and John Standish

