Crafting an automated document review solution using Generative AI
At a glance
MRH Trowe provides insurance, finance, and risk management services to help businesses and individuals protect assets, mitigate risks, and achieve financial goals.
Challenge
Solution
Creating an automated solution that pulls the required data from multiple documents and lists it in a CSV file that is simple to view and send.
Services used
- Firemind’s PULSE
- Amazon SageMaker
- Amazon Bedrock
- Amazon S3
Outcomes
- 7-minute extraction time (down from 2.5 hours).
- 4-month turnaround from first meeting to project sign-off.
Business challenges
Freeing up valuable time for account managers
MRH Trowe submits real estate data to insurance companies on behalf of human reviewers, brokering communications with insurers and providing real estate insurance quotes. To achieve this, MRH Trowe receives unstructured, loosely formatted PDF files (known as “reviews”) containing required information (so-called “green fields”) about the property for insurance companies to review. Optional property information is referred to as “yellow fields.”
MRH Trowe account managers review these long, complicated documents manually, extracting green and yellow fields into an Excel sheet for insurance review. This time-consuming and error-prone process reduces the bandwidth of MRH Trowe account managers, sometimes resulting in a loss of business. MRH Trowe needed an automated document-processing solution that expedites green and yellow field extraction. Once the fields are extracted into an Excel document, MRH Trowe account managers can review the final artifact before sending it to the insurance company for final review.
What our customers say
Dr. Malte Polley
"The solution is significantly outperforming OpenAI and providing much better results."
Solution
Automating data collation using generative AI
This project involved designing and deploying an AWS-based proof-of-concept (PoC) that demonstrated how efficiently large language model (LLM) inference can transform unstructured data into structured documents. The objective was to process PDF documents through a coordinated AWS workflow, using services such as Amazon Textract, AWS Lambda, Amazon SageMaker, and AWS Step Functions.
The process began with PDF documents being uploaded to an Amazon S3 bucket, which triggered Textract to extract the raw text. A Lambda function then preprocessed the text into manageable chunks. An LLM hosted on SageMaker provided real-time inference; it was activated only while documents were being processed and deactivated afterwards.
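The chunking step can be illustrated with a minimal sketch. It assumes character-based chunks with a small overlap so that a field value split across a boundary still appears whole in one chunk. The function name `chunk_text`, the default sizes, and the overlap strategy are illustrative assumptions, not details taken from the MRH Trowe project.

```python
def chunk_text(text: str, max_chars: int = 4000, overlap: int = 200) -> list[str]:
    """Split extracted text into overlapping chunks of at most max_chars characters.

    Hypothetical helper: the chunk size and overlap used in the real
    preprocessing Lambda are not stated in the case study.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be at least 0 and smaller than max_chars")

    chunks: list[str] = []
    step = max_chars - overlap  # how far the window advances each time
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # the last window already reached the end of the text
        start += step
    return chunks
```

In a Lambda handler, a function like this would sit between the text-extraction output and the LLM calls, for example `chunks = chunk_text(raw_text)`.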
AWS Step Functions orchestrated the workflow, which was built around two "map" steps. The first iterated over the prompts for a document, while the second iterated over the document's text chunks. A final reasoning Lambda extracted the defining information from the LLM outputs for each prompt and chunk. The resulting information was then transformed into the formats MRH Trowe required for further processing.
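The nested map and the final reasoning step can be sketched in plain Python, without any AWS dependencies. This is a simplified illustration under stated assumptions: `ask_llm` is a hypothetical callable standing in for the model endpoint, `pick_answer` replaces the reasoning Lambda with a simple majority vote, the `NOT_FOUND` marker is an invented convention, and the field names are made up. The real workflow ran these steps as Step Functions states rather than Python loops.

```python
import csv
import io
from collections import Counter
from typing import Callable


def pick_answer(answers: list[str]) -> str:
    """Stand-in for the reasoning step: keep the most common usable answer.

    Assumption: the model replies "NOT_FOUND" when a chunk lacks the field.
    """
    candidates = [a.strip() for a in answers
                  if a.strip() and a.strip().upper() != "NOT_FOUND"]
    if not candidates:
        return ""
    # most_common orders ties by first occurrence, so earlier chunks win ties
    return Counter(candidates).most_common(1)[0][0]


def extract_fields(prompts: dict[str, str], chunks: list[str],
                   ask_llm: Callable[[str, str], str]) -> dict[str, str]:
    """Run every prompt against every chunk, then reduce to one value per field."""
    results: dict[str, str] = {}
    for field, prompt in prompts.items():                       # outer map: prompts
        answers = [ask_llm(prompt, chunk) for chunk in chunks]  # inner map: chunks
        results[field] = pick_answer(answers)
    return results


def to_csv(rows: list[dict[str, str]], fieldnames: list[str]) -> str:
    """Serialise one row per document into CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Passing the model call in as a parameter keeps the sketch testable with a stub and leaves the choice of model and hosting service open.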
Acknowledging the experimental nature of LLMs, the project's primary deliverable was the AWS infrastructure, showcasing the LLM's capabilities with unstructured data. The outlined solution provided a versatile framework, setting the stage for future developments, refinements, and use cases beyond the PoC stage.
Supercharged extraction
MRH Trowe account managers previously reviewed these long, complicated documents manually, taking an average of 2.5 hours per review. Our solution reduces that time to around 7 minutes, a cut of roughly 95%. This frees account managers to work on other essential tasks and gives them more time to speak with current and potential clients, strengthening those relationships and supporting further growth for the business.
Higher accuracy
The solution will produce higher accuracy across both green and yellow field data. Because the files are prompted against using a consistent flow that can be refined over time, the data review and the overall process will become more accurate, producing consistent results that free account managers to work on more pressing tasks.
Model Spotlight
Anthropic is an artificial intelligence research company based in the San Francisco Bay Area. Founded in 2021, the company focuses on developing safe and ethical AI systems, particularly AI assistants capable of open-ended dialogue and a wide range of tasks.
Anthropic has created notable models like Claude, and explores techniques such as ‘constitutional AI’ to imbue their AI with robust ethical principles. Led by a team of prominent AI researchers, Anthropic is positioning itself as an emerging leader in the field of beneficial AI development, working to ensure AI capabilities advance in alignment with human values.
Claude 3 Haiku
We initially used Anthropic’s Claude 2 for this project due to its 100K token context length, enabling MRH Trowe to process large, complex documents quickly and efficiently. During the project, Anthropic’s Claude 3 Haiku was released, so we immediately switched to the new model.
This model reduced document extraction time from 2.5 hours to just 7 minutes, significantly increasing productivity. The model’s ability to handle large inputs while maintaining accuracy allowed the project to be completed within just four months, dramatically improving both the speed and precision of data extraction.