Modernizing Microservices using Amazon EKS for Hybrid Cloud
This post is co-written with Balaji Kandasamy and Suresh Kannan from athenahealth.
This post explores how athenahealth reduced operational burden, improved developer productivity, and strengthened its hybrid cloud capabilities as it moved from a monolithic architecture to a microservices architecture using Amazon Elastic Kubernetes Service (Amazon EKS).
Navigating the Complexities of EKS Consolidation
As athenahealth modernized monolithic applications into microservices to unlock agility and efficiency, they faced challenges in managing their sprawling Kubernetes infrastructure. The array of Kubernetes clusters across multiple AWS accounts led to operational overhead and duplicated effort. Despite having a central DevOps team, developers were burdened with managing individual Kubernetes clusters for their projects, which diverted their focus from core application development.
Empowering Developers Through Centralized EKS Management
To address these challenges, athenahealth consolidated their Kubernetes clusters using centralized Amazon EKS management. The new approach simplified work for development teams, as they did not have to deal with inconsistent configurations and security policies across multiple clusters. It freed up time for development teams to focus on building and deploying new features without having to manage the underlying infrastructure. This resulted in faster development cycles, fostering greater innovation and rapid iteration.
Embracing Hybrid Cloud for Enhanced Flexibility
To solve the challenge of managing resources across diverse environments, athenahealth leveraged the flexibility of hybrid cloud deployments. Their architecture connects resources across multiple environments: data centers, AWS Outposts, AWS Local Zones, and Availability Zones within AWS Regions (illustrated in Figure 1).
athenahealth uses AWS Outposts to bridge their on-premises data centers with AWS cloud services. This hybrid setup is crucial for running latency-sensitive healthcare applications like Electronic Health Records (EHR) applications and patient portals. While these applications could run in athenahealth’s own data centers, AWS Outposts offers unique advantages. It allows development teams to use the same AWS tools and services both on-premises and in the cloud, making it easier to gradually break down larger applications into microservices.
The setup works by running application containers locally on Outposts while keeping the management layer in AWS Regions. This approach offers better performance than using AWS Direct Connect alone, as Outposts brings AWS services directly into the local environment rather than just providing network connectivity.
To further improve application performance, athenahealth also uses Local Zones in cities like Boston and Dallas. These Local Zones allow applications to run closer to end users while still being part of the same EKS clusters used in the main AWS Regions. This comprehensive approach ensures consistent operations and optimal performance across all environments.
For centralized AWS accounts, products (for example, Product 1 and Product 2 in Figure 1) operate across multiple Regions (us-east-1 and us-west-2), utilizing the full potential of EKS for container orchestration. Resources like Amazon Simple Queue Service (Amazon SQS) and Amazon Simple Storage Service (Amazon S3) are distributed across Regions to ensure availability and fault tolerance. Each EKS cluster is designed with a consistent network configuration, including public and private subnets, and uses Kong Ingress controllers to manage traffic across the clusters, ensuring scalability and security.
This multi-environment setup supports the flexibility to scale applications dynamically through multiple dimensions: horizontal pod autoscaling for handling increased request loads, vertical scaling for resource-intensive workloads, and cluster-level scaling via Karpenter for infrastructure expansion. The architecture enhances resilience by allowing failover between Regions. The modular structure (with EKS clusters, databases, and other AWS services depicted in Figure 1 as building blocks) allows teams to independently develop, deploy, and manage their workloads, ensuring both operational efficiency and security in a hybrid cloud infrastructure.
Security is achieved through a combination of AWS-native features and Kubernetes best practices, including network segmentation with public and private subnets, fine-grained access control using AWS Attribute-Based Access Control (ABAC) and AWS Identity and Access Management (IAM), and encryption of data both at rest and in transit. Additionally, container-specific security measures such as image scanning and pod security policies are implemented within EKS to further enhance the overall security posture.
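To make the horizontal scaling dimension mentioned above more concrete, the following minimal Python sketch shows the proportional rule that the Kubernetes Horizontal Pod Autoscaler documents: desired replicas = ceil(current replicas × current metric / target metric). The function name, the replica bounds, and the sample numbers are illustrative assumptions and not athenahealth's configuration. The real autoscaler also applies a tolerance band and stabilization windows, which this sketch omits.

```python
import math


def hpa_desired_replicas(current_replicas: int,
                         current_metric: float,
                         target_metric: float,
                         min_replicas: int = 1,
                         max_replicas: int = 10) -> int:
    """Scale replicas in proportion to how far the metric is from its target.

    Simplified: ignores the HPA tolerance band and stabilization window.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    desired = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured bounds (hypothetical defaults).
    return max(min_replicas, min(max_replicas, desired))


# Hypothetical example: 4 pods averaging 90% CPU against a 60% target.
print(hpa_desired_replicas(4, 90, 60))  # 6
```

Vertical scaling and node-level scaling with Karpenter act on different objects (pod resource requests and cluster nodes respectively), so they would not be modeled by this formula.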
Figure 1: AWS footprint in athenahealth
Building a Control Plane Using Crossplane
While AWS infrastructure manages the Kubernetes control plane for EKS clusters (handling core Kubernetes components like the API server, scheduler, and controller manager), athenahealth uses Crossplane as a separate infrastructure control plane to manage cloud resources across multiple AWS accounts. Crossplane operates as a Kubernetes-native infrastructure provisioning tool, ensuring the actual state of cloud resources (such as EKS clusters, databases, and networking components) matches the declared state in Git repositories. It offers seamless GitOps integration, automated drift correction, stateless operation, and continuous synchronization, thereby simplifying infrastructure management across the organization’s AWS footprint. This infrastructure control plane (illustrated in Figure 2) excels in managing multiple EKS clusters across AWS accounts and integrates effortlessly with FluxCD for a cohesive deployment workflow. By focusing on declared resources and the actual state, it eliminates the need for manual state management and prevents configuration drift in the broader cloud infrastructure, complementing the core Kubernetes functionality that AWS manages within each EKS cluster.
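The declared-state versus actual-state comparison described above can be sketched as a small reconciliation loop. This is a conceptual Python illustration only, not Crossplane's implementation or API. The resource names, fields, and version strings are hypothetical, and the "delete" step is assumed to apply only to resources the control plane owns.

```python
from typing import Any, Dict, List, Tuple

State = Dict[str, Dict[str, Any]]


def plan(desired: State, actual: State) -> List[Tuple[str, str]]:
    """Return the (action, resource_name) steps that make actual match desired."""
    steps: List[Tuple[str, str]] = []
    for name, spec in desired.items():
        if name not in actual:
            steps.append(("create", name))
        elif actual[name] != spec:
            steps.append(("update", name))  # drift detected, correct it
    for name in actual:
        if name not in desired:
            steps.append(("delete", name))  # no longer declared
    return steps


def reconcile(desired: State, actual: State) -> State:
    """Apply the plan to an in-memory copy of the actual state."""
    result = {name: dict(spec) for name, spec in actual.items()}
    for action, name in plan(desired, actual):
        if action == "delete":
            del result[name]
        else:
            result[name] = dict(desired[name])
    return result


# Hypothetical declared state (as it might be rendered from Git) and observed state.
desired = {"eks-prod": {"version": "1.29"}, "orders-db": {"size": "large"}}
actual = {"eks-prod": {"version": "1.28"}, "legacy-queue": {}}

print(plan(desired, actual))
converged = reconcile(desired, actual)
print(plan(desired, converged))  # an empty plan: nothing left to correct
```

Because the plan is recomputed from the two states on every pass, the loop needs no stored state file of its own, which is the property the stateless, continuously synchronizing model relies on.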
Figure 2: Management account acts as the control plane to manage multiple EKS clusters
Leveraging Open-Source Tools for Enhanced Efficiency
athenahealth’s approach incorporates several open-source tools to operationalize Kubernetes at scale and support microservice deployments on AWS:
Karpenter
At athenahealth, Karpenter actively manages node provisioning across production EKS clusters, handling diverse workload requirements from multiple development teams. For example, a medical records processing application experiences variable loads throughout the day, and Karpenter automatically scales node pools from 50 to 200 instances during peak hours, then scales down during quiet periods, achieving 40% cost savings compared to static provisioning.
KEDA
At athenahealth, KEDA manages real-time scaling of message processing microservices, automatically adjusting pod counts based on SQS queue depth and message latency. For instance, the patient registration service scales from 5 to 50 pods during morning registration peaks, then scales to zero during off-hours, resulting in 35% resource savings while maintaining sub-second processing times.
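As a rough illustration of queue-depth-based scaling with scale-to-zero, the sketch below computes a pod count from the number of visible messages. It simplifies how an event-driven autoscaler behaves and is not KEDA's code. The target of 100 messages per pod is a hypothetical value, while the 50-pod ceiling mirrors the peak figure quoted above.

```python
import math


def pods_for_queue(queue_depth: int,
                   target_per_pod: int,
                   min_pods: int = 0,
                   max_pods: int = 50) -> int:
    """Return a pod count so each pod handles roughly target_per_pod messages."""
    if target_per_pod <= 0:
        raise ValueError("target_per_pod must be positive")
    if queue_depth <= 0:
        return min_pods  # empty queue: fall back to the floor (zero by default)
    desired = math.ceil(queue_depth / target_per_pod)
    # With work waiting, run at least one pod even when the floor is zero.
    floor = max(min_pods, 1)
    return max(floor, min(max_pods, desired))


print(pods_for_queue(0, 100))       # 0
print(pods_for_queue(450, 100))     # 5
print(pods_for_queue(10_000, 100))  # 50
```

Scaling on message latency as well as depth would need a second signal, for example taking the larger of two such estimates, which is left out here.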
Kubecost
Cost management is a critical aspect of running Kubernetes clusters, especially when operating central EKS clusters where different teams deploy their application containers. Kubecost provides insights into resource utilization and associated costs, helping athenahealth optimize spending and allocate resources effectively. By identifying underutilized resources and optimizing workloads, it contributes to overall operational efficiency. Kubecost also underpins athenahealth’s chargeback process with granular cost visibility. Automated monthly cost reports give business units clear transparency, enabling them to track their resource consumption and maintain accountability for associated costs. Real-time cost monitoring allows athenahealth’s platform team to detect cost anomalies within hours rather than weeks, preventing budget overruns. Teams receive daily utilization reports and cost allocation metrics, fostering a cost-conscious development culture.
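The chargeback idea can be illustrated with a simple proportional split of a shared cluster bill. This Python sketch is an assumption-laden simplification, not Kubecost's allocation model, which prices CPU, memory, storage, and other resources separately. The team names, the usage unit (for example CPU-hours), and the figures are hypothetical.

```python
from typing import Dict


def allocate_costs(total_cost: float, usage_by_team: Dict[str, float]) -> Dict[str, float]:
    """Split a shared bill across teams in proportion to their measured usage."""
    total_usage = sum(usage_by_team.values())
    if total_usage <= 0:
        raise ValueError("no usage to allocate against")
    return {
        team: round(total_cost * usage / total_usage, 2)
        for team, usage in usage_by_team.items()
    }


# Hypothetical monthly bill of 1000.0 split by CPU-hours per team.
report = allocate_costs(1000.0, {"billing": 300, "records": 500, "portal": 200})
print(report)
```

A production chargeback report would also need a policy for idle and shared capacity, which this split silently spreads across all teams.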
Ensuring Isolation and Security
athenahealth follows the AWS shared responsibility model for EKS. In this model, AWS handles “Security of the Cloud,” which includes protecting the infrastructure, Kubernetes control plane, and nodes. athenahealth, in turn, is responsible for “Security in the Cloud,” implementing comprehensive security controls for their applications and workloads running on EKS. To maintain isolation and security within the centralized EKS environment, athenahealth employs Kubernetes namespaces and resource boundaries. These mechanisms ensure that different teams can work independently without interfering with each other’s resources. For enhanced access control, athenahealth implements ABAC using tags and IAM conditions. This ABAC implementation allows for fine-grained access management based on multiple attributes such as:
- Team ownership tags: Restricting access to resources based on team identifiers
- Environment attributes: Controlling access based on development, staging, or production environments
- Application classification: Managing permissions based on application security levels
- Cost center assignments: Limiting resource access based on business unit allocations
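A common way to express this kind of tag-based rule in IAM is a `StringEquals` condition that compares a resource tag with the caller's principal tag of the same name. The Python sketch below builds such a policy document and includes a tiny local check of the matching logic. The tag keys `team` and `environment`, the chosen action, and the helper names are illustrative assumptions rather than athenahealth's actual policies.

```python
import json
from typing import Dict, Iterable


def abac_policy(actions: Iterable[str], tag_keys: Iterable[str]) -> dict:
    """Build an IAM policy allowing actions only when resource and principal tags match."""
    condition = {
        f"aws:ResourceTag/{key}": f"${{aws:PrincipalTag/{key}}}"
        for key in tag_keys
    }
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": list(actions),
            "Resource": "*",
            "Condition": {"StringEquals": condition},
        }],
    }


def tags_match(principal_tags: Dict[str, str],
               resource_tags: Dict[str, str],
               tag_keys: Iterable[str]) -> bool:
    """Local illustration of the rule: every listed tag must be present and equal."""
    return all(
        key in principal_tags and principal_tags[key] == resource_tags.get(key)
        for key in tag_keys
    )


# Hypothetical action and tag keys, for illustration only.
policy = abac_policy(["eks:DescribeCluster"], ["team", "environment"])
print(json.dumps(policy, indent=2))
```

One advantage of this pattern is that onboarding a new team means tagging its principals and resources consistently; the policy itself does not need a new statement per team.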
athenahealth’s EKS Transformation Outcomes
athenahealth’s journey to modernize its application architecture using EKS and open-source tools provides valuable insights for organizations aiming to achieve similar goals. Using EKS at scale in a hybrid fashion has enabled athenahealth to accelerate its modernization journey by providing the following:
Reduced Operational Burden
Centralized Kubernetes management has significantly reduced the operational burden on athenahealth’s DevOps teams. By managing fewer clusters with standardized configurations and security policies, the team eliminated repetitive tasks such as cluster version upgrades, security patch management, node scaling configurations, and network policy maintenance across multiple clusters. This consolidation reduced the time spent on routine cluster operations by 60%, freeing DevOps teams to focus on strategic initiatives like platform automation, service mesh implementation, and performance optimization. Instead of managing individual cluster configurations for each team, the centralized approach allows for consistent policy enforcement and unified monitoring across all workloads.
Enhanced Developer Productivity
Developers at athenahealth now enjoy greater autonomy, allowing them to innovate and deliver new features rapidly. The reduction in infrastructure management overhead has streamlined the development process, leading to faster time-to-market for new applications and features.
Optimized Resource Utilization
The strategic selection of Karpenter, KEDA, and Kubecost optimizes athenahealth’s Kubernetes clusters through superior performance characteristics. Karpenter offers 60% faster node provisioning than Cluster Autoscaler, KEDA enables more flexible scaling with 50+ scalers compared to the standard Horizontal Pod Autoscaler (HPA), and Kubecost provides Kubernetes-native cost insights unavailable in traditional monitoring tools. Together, these contribute to precise resource matching and cost optimization. Kubecost’s recommendations for Spot Instances and resource request optimization delivered an additional 20% reduction in compute costs.
Improved Hybrid Cloud Capabilities
athenahealth’s hybrid cloud architecture enables seamless connectivity between Kubernetes containers and resources across various environments. This flexibility ensures that applications can be deployed where they are most effective, enhancing performance and resilience.
Conclusion
For organizations looking to modernize their applications with microservices, athenahealth’s journey offers a detailed, actionable blueprint for enhancing operational efficiency and developer productivity. The journey begins with centralizing EKS cluster management through Crossplane, which enables GitOps integration and prevents configuration drift. Organizations should then focus on resource optimization using open-source tools: Karpenter for efficient cluster scaling, KEDA for event-driven workloads, and Kubecost for cost visibility and chargeback. A clear separation of responsibilities is essential, with DevOps teams owning cluster management while developers maintain autonomy through namespace isolation and ABAC. The architecture should embrace hybrid deployment using AWS Outposts and Local Zones to optimize performance and latency. Key technical components include Kong Ingress controllers for traffic management and standardized network configurations with public and private subnets. Finally, standardizing configurations across AWS Regions ensures consistent operations. This comprehensive approach, coupled with the strategic use of open-source tools, can unlock significant efficiency gains and drive innovation in cloud-native application development.