Automating paper-to-electronic healthcare claims processing with AWS

Health plans process billions of claims electronically each year. The Council for Affordable Quality Healthcare (CAQH) estimates that approximately 10% of claims still arrive as paper documents, accounting for hundreds of millions of paper submissions annually in the U.S. These paper claims create processing bottlenecks and consume a disproportionate share of operational costs and resources: manual processing can cost up to 10 times more than electronic processing and extends reimbursement cycles from hours to weeks.

Claims operations teams currently process multiple paper formats (CMS-1500 for professional services, UB-04 for hospital services, ADA dental forms, and custom proprietary forms) through labor-intensive workflows that involve digitizing forms and manually entering data. Unlike electronic claims, which flow through standardized Electronic Data Interchange (EDI) pipelines, paper claims need separate systems built on OCR technology and proprietary formats. This dual-pathway approach introduces higher error rates and elevates HIPAA compliance risks. For the average health plan, these inefficiencies translate to millions in annual administrative overhead that could otherwise be directed toward member services and care management initiatives.

To address these paper-processing challenges, we’ve built a targeted solution that integrates with a health plan’s existing electronic claims processing while minimizing manual intervention. In this post, we demonstrate how health plans can automate the paper-to-electronic claims transformation process using AWS Transfer Family web apps, Amazon Bedrock Data Automation (BDA), and AWS B2B Data Interchange. While our solution focuses on converting UB-04 hospital claim forms into the 5010 X12 837 format (the HIPAA-mandated electronic claims standard), the same architecture can be applied to automate paper-based workflows in other industries, including insurance, supply chain, retail, and manufacturing, where businesses need to transform physical documents into standardized electronic data interchange (EDI) formats.

Solution overview

This solution transforms paper healthcare claims into standardized electronic transactions through a streamlined, three-stage workflow built on AWS serverless services. Combining document processing, AI-powered data extraction, and EDI transformation capabilities allows health plans to eliminate manual data entry while maintaining HIPAA compliance.

The following diagram illustrates the architecture:

Figure 1. Architecture diagram showing event driven transformation of paper-claims data into 837 transactions

The end-to-end workflow is as follows:

Stage 1: Secure file ingestion with AWS Transfer Family web apps

1. Claims operations teams upload scanned paper claims through the Transfer Family web app interface, which provides secure browser-based access to Amazon S3 buckets.

Stage 2: Intelligent data extraction with BDA

2. Amazon S3 detects new uploads to the input S3 bucket prefix and emits a corresponding Amazon EventBridge event. Amazon EventBridge rules detect these events and trigger AWS Lambda functions to process the claims.

3. The Lambda function invokes BDA to extract structured data from the claim forms.

Stage 3: Transformation and EDI generation with AWS B2B Data Interchange

4. B2B Data Interchange monitors the S3 location for new JSON documents and automatically transforms the extracted data to standardized 837 EDI transactions.

5. The resulting EDI files are delivered to an S3 bucket, ready for integration with the health plan’s existing claims processing system.

This architecture delivers significant business outcomes that directly impact a health plan’s bottom line:

Up to 80% reduction in per-claim processing costs through intelligent data extraction and automated processing
Reduced processing time from days to minutes, accelerating provider reimbursement cycles
Improved data accuracy with lower error rates compared to manual processing
Enhanced HIPAA compliance through end-to-end encryption, fine-grained access controls, and audit logging
Increased operational efficiency, allowing staff to focus on higher-value member service activities instead of data entry
Ability to scale seamlessly from hundreds to millions of claims annually without capacity planning concerns

All the services used in this solution are HIPAA Eligible Services, meaning they are eligible for workloads involving electronic Protected Health Information (ePHI) when covered by a Business Associate Addendum (BAA) with AWS.

Unlike traditional OCR and EDI solutions requiring significant upfront investments, this serverless approach provides consistent performance during peak processing periods while minimizing infrastructure management overhead.

In the following sections, we explore each stage of this workflow in detail, examining how AWS services work together to create a seamless claims processing experience while addressing key operational challenges: reducing manual processing errors, accelerating reimbursement cycles, and lowering costs.

Stage 1: Secure file ingestion with AWS Transfer Family web apps

The first challenge in automating paper claims processing is secure, user-friendly document intake. Claims operations teams need a clear, secure way to upload digitized copies of paper-based claims. Transfer Family web apps is a fully managed, no-code, browser-based web app that allows authenticated users to list, upload, download, copy, and delete files in specific Amazon S3 buckets. This solution provides claims operations teams with a zero-code interface for securely submitting healthcare forms.

The service authenticates users through AWS IAM Identity Center, allowing staff to use their existing organizational credentials through industry-standard protocols (SAML 2.0 or OIDC). After authentication, users access a customized and branded web portal where they can upload claim forms (CMS-1500, UB-04, ADA dental, or custom) using intuitive drag-and-drop functionality, as shown in Figure 2.

Figure 2. Transfer Family web apps interface showing file management options

The web interface not only simplifies the upload process but also provides auditing capabilities, recording each action in comprehensive audit trails that support HIPAA compliance requirements. Users can perform both individual and batch uploads depending on their workflow needs. Figure 3 demonstrates how the uploaded UB-04 forms are stored.

Figure 3. UB-04 form upload to inbound folder

Security is maintained throughout the process: all files are encrypted in transit using SSL/TLS and at rest using AWS Key Management Service (AWS KMS) in Amazon S3. When new forms arrive in the designated input claims prefix, an EventBridge rule automatically triggers a Lambda function, forwarding them to Stage 2 of our pipeline where BDA performs intelligent data extraction.
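The hand-off from Amazon S3 to Lambda can be sketched as an EventBridge event pattern. The following is a minimal sketch, not the rule shipped in the CloudFormation template: the bucket name and prefix are placeholders, and `event_matches` is a hypothetical helper that only checks this one pattern shape locally. It assumes that EventBridge notifications are turned on for the bucket, which S3 requires before it emits `Object Created` events.

```python
import json


def build_upload_rule_pattern(bucket_name, inbound_prefix):
    """Event pattern for S3 'Object Created' events under one key prefix."""
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {
            "bucket": {"name": [bucket_name]},
            "object": {"key": [{"prefix": inbound_prefix}]},
        },
    }


def event_matches(pattern, event):
    """Local check for this specific pattern shape only.

    This is not a general EventBridge matcher; it exists so the pattern
    can be sanity-checked without deploying anything.
    """
    detail = event.get("detail", {})
    key = detail.get("object", {}).get("key", "")
    prefix = pattern["detail"]["object"]["key"][0]["prefix"]
    return (
        event.get("source") in pattern["source"]
        and event.get("detail-type") in pattern["detail-type"]
        and detail.get("bucket", {}).get("name") in pattern["detail"]["bucket"]["name"]
        and key.startswith(prefix)
    )


if __name__ == "__main__":
    # Placeholder bucket name; "inbound/" mirrors the InboundPath parameter.
    print(json.dumps(build_upload_rule_pattern("example-claims-bucket", "inbound/"), indent=2))
```

Filtering on the prefix in the rule itself, rather than inside the Lambda function, keeps objects written elsewhere in the bucket (for example, extraction output) from triggering the function again.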

Stage 2: Intelligent data extraction with BDA

After secure ingestion, the next critical challenge is accurately extracting structured data from complex healthcare forms. Amazon Bedrock Data Automation (BDA) is a fully managed feature of Amazon Bedrock that streamlines the extraction of valuable insights from unstructured multimodal content such as documents, images, audio, and video. BDA provides a unified API experience that eliminates the need to manage multiple AI models and services, with built-in visual grounding and confidence scores to validate extracted data.

For document and image types, BDA enables you to define when data should be extracted as-is (standard) and when it should be inferred (custom output), giving complete control over the process. This flexibility is provided through blueprint configurations.
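One way to put the confidence scores mentioned above to work is to route low-confidence fields to a human reviewer instead of passing them straight through. The following sketch assumes a simplified, hypothetical input shape (field name mapped to a `value` and a `confidence`); the real BDA output nests this information differently, so an adapter would be needed. The 0.8 threshold is illustrative, not a recommendation.

```python
def split_by_confidence(fields, threshold=0.8):
    """Separate extracted fields into auto-accepted and needs-review groups.

    `fields` is a hypothetical simplified shape:
        {"npi": {"value": "0011002200", "confidence": 0.97}, ...}
    """
    accepted, needs_review = {}, {}
    for name, info in fields.items():
        target = accepted if info["confidence"] >= threshold else needs_review
        target[name] = info["value"]
    return accepted, needs_review


if __name__ == "__main__":
    sample = {
        "npi": {"value": "0011002200", "confidence": 0.97},
        "admit_diagnosis": {"value": "E871", "confidence": 0.55},
    }
    accepted, needs_review = split_by_confidence(sample)
    print("accepted:", accepted)
    print("needs review:", needs_review)
```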

Extracting healthcare data with BDA blueprints

In BDA, blueprints define how to extract and process data from documents. Each blueprint contains field definitions, instructions, validation rules, and output schema specifications that can be tailored to different healthcare form types. Blueprints can be created using natural language prompts in the console or defined manually for more precise control.

Technical teams should create separate blueprints for CMS-1500, UB-04, ADA dental forms and custom forms, refining them over time as needed. Each blueprint specifies exactly which data elements to extract from each form type and how they should be structured.
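To make the idea of a blueprint more concrete, the following sketch builds a small UB-04 schema as a Python dictionary. The field names come from the sample output later in this post; the instruction text is illustrative. The schema keys used here (`class`, `inferenceType`, `instruction`) reflect our understanding of the blueprint format and should be treated as assumptions to verify against the BDA documentation or a blueprint exported from the console.

```python
import json


def _field(instruction):
    # "explicit" is assumed to mean "read the value as printed on the form".
    return {"type": "string", "inferenceType": "explicit", "instruction": instruction}


def ub04_blueprint_schema():
    """Illustrative blueprint schema covering a handful of UB-04 fields."""
    return {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "description": "UB-04 institutional claim form",
        "class": "UB-04",
        "type": "object",
        "properties": {
            "admit_diagnosis": _field("The admitting diagnosis code"),
            "statement_covers_period": _field("The from and through dates of the statement"),
            "patient_birthdate": _field("The patient's date of birth"),
            "provider_address": _field("The street address of the billing provider"),
            "npi": _field("The billing provider's National Provider Identifier"),
        },
    }


if __name__ == "__main__":
    print(json.dumps(ub04_blueprint_schema(), indent=2))
```

Keeping one schema per form type, as suggested above, lets each blueprint be refined independently when a particular form starts producing extraction errors.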

Figure 4 shows a blueprint configuration for UB-04 forms in the AWS console, where you can define fields such as patient information, diagnosis codes, and procedure details.

Figure 4. UB-04 form extraction in the AWS console showing field level instructions, results and confidence scores

When processing a claim form, the Lambda function triggered by EventBridge invokes the BDA APIs to analyze the document, classify it as a specific form type, apply the appropriate extraction blueprint, and extract the relevant healthcare data. The resulting structured JSON document contains all information from the original paper claim, ready for downstream processing.
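The Lambda step described above might look like the following sketch. It is not the function deployed by the CloudFormation template. The environment variable names are hypothetical, and the request shape for `invoke_data_automation_async` on the `bedrock-data-automation-runtime` client reflects our understanding of the API at the time of writing, so check it against the current boto3 reference. The client is injectable so the request-building logic can be exercised without AWS credentials.

```python
import os


def build_bda_request(event, output_prefix, project_arn, profile_arn):
    """Translate an EventBridge S3 'Object Created' event into a BDA request."""
    bucket = event["detail"]["bucket"]["name"]
    key = event["detail"]["object"]["key"]
    return {
        "inputConfiguration": {"s3Uri": f"s3://{bucket}/{key}"},
        "outputConfiguration": {"s3Uri": f"s3://{bucket}/{output_prefix}"},
        # A BDA project can hold several blueprints (one per form type).
        "dataAutomationConfiguration": {
            "dataAutomationProjectArn": project_arn,
            "stage": "LIVE",
        },
        "dataAutomationProfileArn": profile_arn,
    }


def handler(event, context, client=None):
    """Lambda entry point; `client` is only passed explicitly in tests."""
    if client is None:
        import boto3  # imported lazily so the module loads without boto3

        client = boto3.client("bedrock-data-automation-runtime")
    request = build_bda_request(
        event,
        output_prefix=os.environ["BDA_OUTPUT_PREFIX"],  # hypothetical name
        project_arn=os.environ["BDA_PROJECT_ARN"],      # hypothetical name
        profile_arn=os.environ["BDA_PROFILE_ARN"],      # hypothetical name
    )
    response = client.invoke_data_automation_async(**request)
    return {"invocationArn": response["invocationArn"]}
```

The call is asynchronous: it returns an invocation ARN immediately, and the extraction results are written under the output URI when the job finishes. How completion is detected and how the JSON reaches the prefix monitored by B2B Data Interchange is left out of this sketch.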

Figure 5 displays the structured JSON output generated by Amazon Bedrock Data Automation from the UB-04 form. This transformation from unstructured document data to structured, machine-readable fields is what makes downstream processing possible: each key-value pair represents a piece of claim information extracted from the paper form.

Figure 5. JSON document with fields extracted from the UB-04 form using BDA

The following shows a partial example of extracted data in JSON from a sample UB-04 document, highlighting some of the key fields:

{
    "claim": {
        "admit_diagnosis": "E871",
        "statement_covers_period": "01012024 through 03012024",
        "patient_birthdate": "08221967",
        "provider_address": "555 Main St",
        "npi": "0011002200"
    }
}
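The dates in this sample are in MMDDYYYY form, while X12 date elements use CCYYMMDD (format qualifier D8) and CCYYMMDD-CCYYMMDD for ranges (RD8). The following sketch shows one way such values could be normalized before mapping. Whether the deployed solution normalizes in the Lambda function or in the transformer mapping is not covered in this post, so treat the placement as an assumption. The NPI check is a basic shape test only.

```python
from datetime import datetime


def to_ccyymmdd(mmddyyyy):
    """Convert '08221967' to '19670822'; raises ValueError on invalid dates."""
    return datetime.strptime(mmddyyyy, "%m%d%Y").strftime("%Y%m%d")


def period_to_rd8(period):
    """Convert '01012024 through 03012024' to '20240101-20240301'."""
    start, end = (part.strip() for part in period.split("through"))
    return f"{to_ccyymmdd(start)}-{to_ccyymmdd(end)}"


def looks_like_npi(value):
    """Shape check only: ten ASCII digits. Does not verify the check digit."""
    return len(value) == 10 and value.isascii() and value.isdigit()


if __name__ == "__main__":
    claim = {
        "statement_covers_period": "01012024 through 03012024",
        "patient_birthdate": "08221967",
        "npi": "0011002200",
    }
    print(to_ccyymmdd(claim["patient_birthdate"]))
    print(period_to_rd8(claim["statement_covers_period"]))
    print(looks_like_npi(claim["npi"]))
```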

When extraction completes, the Lambda function writes this structured JSON document to a designated Amazon S3 prefix monitored by B2B Data Interchange, creating a seamless handoff to the EDI transformation stage without manual intervention.

Stage 3: Transformation and EDI generation with AWS B2B Data Interchange

With structured data extracted from the paper claims, the final step is transforming this information into standardized HIPAA 5010 X12 837 transactions that existing claims systems can process. Electronic Data Interchange (EDI) is the standardized format used throughout healthcare for electronic transactions between providers, payers, and other stakeholders.

B2B Data Interchange automates the transformation and generation of Electronic Data Interchange (EDI) documents to and from JSON and XML data formats. This service provides health plans with a low-code interface and pay-as-you-go pricing that reduces the time, complexity, and cost associated with managing and exchanging transactional data across organizational boundaries.

Understanding the EDI transformation pipeline

B2B Data Interchange streamlines the transformation from extracted UB-04 JSON data to standardized 837 Institutional transactions through four components:

Profiles: Store your organization’s business details and contact information. This information is shared with your trading partners.
Transformers: Define how extracted JSON data from BDA maps to standardized EDI segments in the 837 format.
Trading capabilities: Establish automated processing pipelines that monitor S3 locations and trigger transformations.
Partnerships: Connect the components together to enable automated document exchange.

A significant innovation in B2B Data Interchange is its generative AI-assisted EDI mapping capability powered by Amazon Bedrock. The “Generate Mapping” feature, shown in Figure 6, analyzes your extracted claim data and sample EDI transactions to automatically suggest appropriate mappings, reducing implementation time from days to minutes while making EDI transformation accessible to teams without specialized expertise.

Figure 6. AWS B2B Data Interchange with Generate Mapping capability

Once configured, the entire transformation process becomes automated:

B2B Data Interchange monitors an Amazon S3 prefix for new JSON documents from BDA.
When it detects a new document, it transforms the structured JSON data into an X12 837 document.
It optionally validates the transformed document against X12 standards.
It delivers the resulting EDI file to a designated output directory for integration.

The following is a sample segment from the generated X12 file, showing key components of the standardized healthcare claim:

ST*837*0001*005010~ (Transaction start segment)
BHT*0019*00*0123*20250422*2020*CH~ (Header information)
NM1*41*2*Acme Hospital*****46*12345~ (Submitter name, here the provider)
PER*IC*JOHN DOE*TE*1234567890~ (Contact information)
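To show how these segments are put together, the following sketch assembles the same four lines by hand. It only illustrates the X12 segment syntax (elements joined by `*`, segments terminated by `~`) and is not how B2B Data Interchange performs the transformation. The function names are hypothetical, and the delimiters are assumed to match the sample above.

```python
def segment(*elements, element_sep="*", terminator="~"):
    """Join elements into one X12 segment, rejecting embedded delimiters."""
    for element in elements:
        if element_sep in element or terminator in element:
            raise ValueError(f"delimiter found inside element: {element!r}")
    return element_sep.join(elements) + terminator


def build_sample_header(submitter_name, submitter_id, contact_name, phone):
    """Rebuild the four sample segments; fixed values are copied from the sample."""
    return [
        segment("ST", "837", "0001", "005010"),
        segment("BHT", "0019", "00", "0123", "20250422", "2020", "CH"),
        # Empty strings keep the unused NM1 elements in position.
        segment("NM1", "41", "2", submitter_name, "", "", "", "", "46", submitter_id),
        segment("PER", "IC", contact_name, "TE", phone),
    ]


if __name__ == "__main__":
    for line in build_sample_header("Acme Hospital", "12345", "JOHN DOE", "1234567890"):
        print(line)
```

Rejecting values that contain a delimiter matters because a stray `*` or `~` in extracted text (for example, in an address) would shift every element after it.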

The final output of this process is a set of standardized EDI files ready for integration with your existing claims processing system. Each file contains all structured healthcare information extracted from the original paper form. These files follow standard naming conventions and are organized by trading partner ID for easy integration.

Figure 7. HIPAA 5010 837 X12 written to the output prefix Amazon S3 bucket

All transformation activities are logged to Amazon CloudWatch and emit events to Amazon EventBridge. Operations teams can use this data to track document volumes, transformation status, and processing errors through custom-built CloudWatch dashboards.

Take the next step

Now that you understand how the solution works, you can implement it in your AWS environment. Before launching the CloudFormation stack, complete these prerequisites:

Prerequisites

Configure AWS IAM Identity Center
1. Enable AWS IAM Identity Center following the documentation
2. Configure your identity source (AWS Managed or external IdP)
3. Create a group named ‘ClaimsOperationsTeam’ following the quick start guide
4. Note the Group ID from the General Information section
5. Configure MFA settings as needed
Enable S3 Access Grants in your Region
1. Follow the documentation to create an S3 Access Grants instance and associate it with your IAM Identity Center instance ARN

Deploy the solution

Use the “Launch Stack” button below to deploy the AWS CloudFormation template to your AWS account.

When prompted, provide these parameters:

BusinessName: Your company name
GranteeIdentifier: The Group ID you noted while configuring IAM Identity Center
IdentityCenterInstanceArn: Your Identity Center ARN (format: arn:aws:sso:::instance/ssoins-xxxxxxxxxxxxxxxx)
InboundPath: inbound/
NamePrefix: paper-claims
OutboundPath: outbound/
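If you prefer to script the deployment instead of using the console, the parameters above can be assembled and sanity-checked first. The following sketch produces the `ParameterKey`/`ParameterValue` list that CloudFormation accepts. The ARN pattern is inferred from the format shown above, and the example values are placeholders. The template location is not given here, so the sketch stops short of calling CloudFormation.

```python
import re

# Pattern inferred from the ARN format listed in the parameters above.
IDENTITY_CENTER_ARN = re.compile(r"arn:aws:sso:::instance/ssoins-[0-9a-zA-Z]{16}")


def build_stack_parameters(
    business_name,
    grantee_identifier,
    identity_center_instance_arn,
    inbound_path="inbound/",
    name_prefix="paper-claims",
    outbound_path="outbound/",
):
    """Validate the inputs and return CloudFormation-style parameters."""
    if not IDENTITY_CENTER_ARN.fullmatch(identity_center_instance_arn):
        raise ValueError("IdentityCenterInstanceArn does not match the expected format")
    for path in (inbound_path, outbound_path):
        if not path.endswith("/"):
            raise ValueError(f"path should end with '/': {path!r}")
    values = {
        "BusinessName": business_name,
        "GranteeIdentifier": grantee_identifier,
        "IdentityCenterInstanceArn": identity_center_instance_arn,
        "InboundPath": inbound_path,
        "NamePrefix": name_prefix,
        "OutboundPath": outbound_path,
    }
    return [{"ParameterKey": k, "ParameterValue": v} for k, v in values.items()]


if __name__ == "__main__":
    for param in build_stack_parameters(
        "Example Health Plan",            # placeholder
        "example-group-id",               # placeholder Group ID
        "arn:aws:sso:::instance/ssoins-1234567890abcdef",
    ):
        print(param)
```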

Verify your implementation

After the deployment completes:

Assign the ‘ClaimsOperationsTeam’ group to the web app in the AWS Transfer Family console
Access the web app URL and sign in
Upload a sample UB-04 file to the inbound folder (download sample file)
Verify the EDI output appears in the outbound folder
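For the last check, a quick script can confirm that a downloaded output file actually contains an 837 transaction set. This is a minimal sketch that assumes the `*` and `~` delimiters used in the sample earlier in this post; a real interchange declares its delimiters in the ISA header, which this sketch does not read. The function names are hypothetical.

```python
def parse_segments(x12_text, element_sep="*", terminator="~"):
    """Split X12 text into segments, each a list of elements."""
    return [
        raw.strip().split(element_sep)
        for raw in x12_text.split(terminator)
        if raw.strip()
    ]


def transaction_set_ids(x12_text):
    """Return the ST01 value of every transaction set found in the text."""
    return [
        seg[1]
        for seg in parse_segments(x12_text)
        if seg[0] == "ST" and len(seg) > 1
    ]


if __name__ == "__main__":
    sample = "ST*837*0001*005010~\nBHT*0019*00*0123*20250422*2020*CH~\n"
    print(transaction_set_ids(sample))
```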

Cleaning up

To avoid incurring future charges, delete the CloudFormation stack. For instructions, refer to deleting a stack on the CloudFormation console.

Conclusion

In this post, we demonstrated how to transform paper healthcare claims into electronic formats using AWS services. This solution helps health plans cut processing costs by up to 80%, reduce processing time from days to minutes, and improve data accuracy—all while meeting HIPAA requirements. By automating these labor-intensive processes, organizations can redirect valuable resources from manual data entry to member-focused initiatives. Begin your claims automation journey with our CloudFormation template and see how it can address your specific processing challenges.

AWS Storage Blog