The project involves analyzing claims data acquired from various insurance providers across 30+ countries to aid in premium negotiations and mitigate client losses. Because providers deliver this data in inconsistent structures and languages, the project standardizes file formats and applies data transformations to produce a unified, usable dataset. Additionally, fuzzy matching techniques link claims data with incident numbers from the legal system to ensure data accuracy and integrity.
The client is a global technology giant that operates through a mobile app. With a presence in over 900 metropolitan areas worldwide, the client has revolutionized the transportation industry, providing convenient and affordable alternatives to traditional taxis. The client is also actively expanding into other logistics areas.
- Standardization and Transformation of Claims Data: Organize diverse insurance provider data into a coherent dataset by aligning inconsistent structures and languages.
- Data Linkage through Fuzzy Matching: Implement fuzzy matching to connect claims with incident records, providing a complete incident overview.
- Consolidation of Claims Data and Financial Metrics Derivation: Unify regional claims data for insights into financial performance, calculating metrics like losses and recoveries.
- Data Validation for Accuracy and Integrity: Ensure reliable data through thorough validation, thereby empowering informed decision-making for the client.
Our scalable data ingestion framework effectively onboards new insurance carriers and ingests data into the Hadoop ecosystem’s data layers. Here’s a quick overview of its key components:
Sources: Claims files from insurance providers are stored on Google Drive and Box, while the claims-to-incident map files from the actuarial team are stored on Google Drive. Legal data is extracted from Salesforce for further processing. By deploying containerized microservices and an API-driven integration architecture, Indium streamlined provider onboarding, minimizing system reconfiguration effort and achieving cost-effective scalability.
Data Lake Creation: Data is extracted and quality-checked using the file check quality Python framework. Valid files are moved to the HDFS Data Lake.
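As an illustration of the kind of check that might run before a file is moved to the data lake, here is a minimal sketch. The source does not describe the framework's actual rules, so the required column names (`claim_id`, `claimant_name`, `loss_date`, `paid_amount`) and the function `check_claims_file` are hypothetical.

```python
import csv
import io

# Hypothetical required columns; the real framework's rules are not given in the source.
REQUIRED_COLUMNS = {"claim_id", "claimant_name", "loss_date", "paid_amount"}


def check_claims_file(text: str) -> list[str]:
    """Return the quality problems found in a CSV claims file (empty list means valid)."""
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    header = set(reader.fieldnames or [])

    # A file without the expected columns is rejected before any row checks.
    missing = REQUIRED_COLUMNS - header
    if missing:
        problems.append(f"missing columns: {sorted(missing)}")
        return problems

    rows = list(reader)
    if not rows:
        problems.append("file has no data rows")

    # Line 1 is the header, so data rows start at line 2
    # (assuming no field spans multiple lines).
    for line_no, row in enumerate(rows, start=2):
        if not (row["claim_id"] or "").strip():
            problems.append(f"line {line_no}: empty claim_id")
        try:
            float(row["paid_amount"])
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: paid_amount is not numeric")
    return problems
```

A file would be moved to the data lake only when the returned list is empty; otherwise the listed problems could be sent back to the provider.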
Raw Layer: Data format is standardized for a consistent single source of truth and loaded into raw tables.
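One common way to standardize provider files into a single format is a per-provider column mapping onto a canonical schema. The sketch below assumes that approach; the provider keys, source column names, and canonical field names are all hypothetical and do not come from the project.

```python
# Hypothetical mappings from each provider's column names to a canonical schema.
COLUMN_MAPS = {
    "provider_a": {
        "ClaimNo": "claim_id",
        "Insured": "claimant_name",
        "Paid": "paid_amount",
    },
    "provider_b": {
        "numero_sinistro": "claim_id",
        "nome_segurado": "claimant_name",
        "valor_pago": "paid_amount",
    },
}


def to_canonical(provider: str, record: dict) -> dict:
    """Rename one provider record's fields to the canonical schema."""
    mapping = COLUMN_MAPS[provider]
    # Source columns absent from the record become None rather than raising.
    out = {canonical: record.get(source) for source, canonical in mapping.items()}
    # Keep the provider so rows remain traceable after consolidation.
    out["provider"] = provider
    return out
```

Onboarding a new carrier then amounts to adding one mapping entry rather than changing the loading code.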
Processed Layer: Data is extracted, transformed, and standardized as per agreed rules, then loaded into processed tables.
Consolidation Layer: Using Python, data from different providers’ tables is consolidated and linked to legal/incident information using fuzzy name matching.
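The fuzzy name matching used for linkage can be sketched with Python's standard library. This is an illustration only: the source does not say which matching algorithm or threshold the project uses, so `difflib.SequenceMatcher`, the 0.85 threshold, and the names `normalize_name` and `best_incident_match` are assumptions.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize_name(name: str) -> str:
    """Lowercase, strip accents, and collapse whitespace before comparing."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_accents.lower().split())


def best_incident_match(claimant_name: str, incidents: dict, threshold: float = 0.85):
    """Return (incident_id, score) for the closest name, or (None, score) below threshold.

    `incidents` maps a hypothetical incident number to the name on the legal record.
    """
    target = normalize_name(claimant_name)
    best_id, best_score = None, 0.0
    for incident_id, incident_name in incidents.items():
        score = SequenceMatcher(None, target, normalize_name(incident_name)).ratio()
        if score > best_score:
            best_id, best_score = incident_id, score
    if best_score >= threshold:
        return best_id, best_score
    return None, best_score
```

Normalizing accents and case first matters for multi-country data, where the same person may appear as "José Álvarez" in one system and "Jose Alvares" in another. Matches below the threshold are better routed to manual review than linked automatically.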
Indium implemented automated ETL pipelines leveraging data standardization algorithms, reducing manual intervention in data preparation, resulting in optimized resource allocation and improved operational focus.
Data Validation & Reconciliation: This robust framework, developed using Python, verifies financials and checks data accuracy, improving the reliability of the overall process.
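A financial reconciliation check of this kind might compare per-provider totals before and after consolidation. The sketch below assumes that approach; the field names `provider` and `paid_amount`, the function `reconcile_totals`, and the 0.01 tolerance are hypothetical rather than taken from the project's framework.

```python
from decimal import Decimal


def reconcile_totals(source_rows, consolidated_rows, tolerance=Decimal("0.01")):
    """Return {provider: difference} where consolidated and source totals disagree."""

    def totals(rows):
        # Sum paid amounts per provider using Decimal to avoid float rounding drift.
        result = {}
        for row in rows:
            amount = Decimal(str(row["paid_amount"]))
            result[row["provider"]] = result.get(row["provider"], Decimal("0")) + amount
        return result

    src, dst = totals(source_rows), totals(consolidated_rows)
    mismatches = {}
    for provider in sorted(set(src) | set(dst)):
        diff = dst.get(provider, Decimal("0")) - src.get(provider, Decimal("0"))
        if abs(diff) > tolerance:
            mismatches[provider] = diff
    return mismatches
```

An empty result means every provider's consolidated total agrees with its source files within the tolerance. A non-empty result pinpoints which provider's load dropped or duplicated amounts.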
By integrating anomaly detection models and real-time data validation scripts, Indium expedited data quality assessments, accelerating the insurance claim onboarding process and facilitating swift data-driven insights.
Exploratory Analysis and Visualization: The client gains insights into incidents and claims patterns using Hive/Presto for queries and Tableau for visualizations. Utilizing advanced predictive modeling and robust data visualization tools, Indium empowered the client to extract actionable insights from claims data, enabling dynamic pricing strategies and optimizing risk assessment protocols for premium negotiation.
This framework streamlines data handling, ensuring accurate insights and effective decision-making.
Enhanced Operational Efficiency: By saving approximately 190 man-hours per month through automated data standardization and consolidation, the client’s team can focus on more strategic tasks, improving operational efficiency.
Faster Time-to-Market: The 2-3 week reduction in manual effort for checking data quality during new insurance claim onboarding allows for quicker data processing and analysis. As a result, the client can onboard new insurance providers swiftly and efficiently.
Cost Savings: The highly scalable architecture and streamlined onboarding process have led to a notable reduction in effort and cost required to integrate new insurance providers into the system.
Data-Driven Decision Making: With improved analytics on claims data, the client can make more informed decisions during premium negotiations, leading to optimized insurance costs and saving 5-10% on total premiums paid.