AWS Big Data Blog
Best practices for upgrading Amazon MWAA environments
Amazon Managed Workflows for Apache Airflow (Amazon MWAA) has become a cornerstone for organizations embracing data-driven decision-making. As a scalable solution for managing complex data pipelines, Amazon MWAA enables seamless orchestration across AWS services and on-premises systems. Although AWS manages the underlying infrastructure, you must carefully plan and execute your Amazon MWAA environment updates according to the shared responsibility model. Upgrading to the latest Amazon MWAA version can provide significant advantages, including enhanced security through critical security patches and potential improvements in performance with faster DAG parsing and reduced database load. You can use advanced features while maintaining ecosystem compatibility and receiving prioritized AWS support. The key to successful upgrades lies in choosing the right solution and following a methodical implementation approach.
In this post, we explore best practices for upgrading your Amazon MWAA environment and provide a step-by-step guide to seamlessly transition to the latest version.
Solution overview
Amazon MWAA provides two primary upgrade solutions:
- In-place upgrade – This method works best when you can accommodate planned downtime. You deploy the new version directly on your existing infrastructure. In-place version upgrades on Amazon MWAA are supported for environments running Apache Airflow version 2.x and later. However, if you’re running version 1.10.z or older versions, you must create a new environment and migrate your resources, because these versions don’t support in-place upgrades.
- Cutover upgrade – This method helps minimize disruption to production environments. You create a new Amazon MWAA environment with the target version and then transition from your old environment to the new one.
Each solution offers a different approach to help you upgrade while working to maintain data integrity and system reliability.
In-place upgrade
In-place upgrades work well for environments where you can schedule a maintenance window for the upgrade process. During this window, Amazon MWAA preserves your workflow history. This approach maintains historical data, provides a straightforward upgrade process, and includes rollback capabilities if issues occur during provisioning. You also use fewer resources because you don't need to create a new environment.
You can perform in-place upgrades through the AWS Management Console with a single operation. This process helps reduce operational overhead by managing many upgrade steps for you.
During the upgrade process, your environment can’t schedule or run new tasks. Amazon MWAA helps manage the upgrade process and implements safety measures—if issues occur during the provisioning phase, the service attempts to revert to the previous stable version.
Before you begin an in-place upgrade, we recommend testing your DAGs for compatibility with the target version, because DAG compatibility issues can affect the upgrade process. You can use the Amazon MWAA local runner to test DAG compatibility before you start the upgrade. You can start the upgrade either on the console, by specifying the new version, or with the AWS Command Line Interface (AWS CLI). The following is an example Amazon MWAA upgrade command using the AWS CLI:
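The environment name and target Airflow version in this sketch are placeholders; substitute the values for your own environment:

```shell
# Start an in-place version upgrade by updating the environment's
# Apache Airflow version. The environment name and target version
# are placeholders; substitute your own values.
aws mwaa update-environment \
    --name MyAirflowEnvironment \
    --airflow-version 2.10.3
```

The environment moves into the UPDATING state while Amazon MWAA applies the new version.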
The following diagram shows the Amazon MWAA in-place upgrade workflow and states.
Refer to Introducing in-place version upgrades with Amazon MWAA for more details.
Cutover upgrade
A cutover upgrade provides an alternative solution when you need to minimize downtime, though it requires more manual steps and operational planning. With this approach, you create a new Amazon MWAA environment, migrate your metadata, and manage the transition between environments. Although this method offers more control over the upgrade process, it requires additional planning and execution effort compared to an in-place upgrade.
This method can work well for environments with complex workflows, particularly when you plan to make significant changes alongside the version upgrade. The approach offers several benefits: you can minimize production downtime, perform comprehensive testing before switching environments, and maintain the ability to return to your original environment if needed. You can also review and update your configurations during the transition.
Consider the following aspects of the cutover approach. When you run two environments simultaneously, you pay for both environments. The pricing for each Amazon MWAA environment depends on:
- Duration of environment uptime (billed hourly with per-second resolution)
- Environment size configuration
- Automatic scaling capacity for workers
- Scheduler capacity
AWS calculates the cost of additional automatically scaled workers separately. You can estimate costs for your specific configuration using the AWS Pricing Calculator.
To help prevent data duplication or corruption during parallel operation, we recommend implementing idempotent DAGs. The Airflow scheduler automatically populates some metadata tables (dag, dag_tag, and dag_code) in your new environment. However, you need to plan the migration of the following additional metadata components:
- DAG history
- Variables
- Slot pool configurations
- SLA miss records
- XCom data
- Job records
- Log tables
You can choose this approach when your requirements prioritize minimal downtime and you can manage the additional operational complexity.
The cutover upgrade process involves three main steps: creating a new environment, restoring the existing metadata into it, and performing the upgrade. The following diagram illustrates the full workflow.
In the following sections, we walk through the key steps to perform a cutover upgrade.
Prerequisites
Before you begin the upgrade process, complete the following steps:
- Review the Airflow release notes and Apache Airflow versions on Amazon Managed Workflows for Apache Airflow.
- Back up your current environment configuration and metadata. You can use the mwaa-dr PyPI package to create and run a backup DAG to store your metadata in an Amazon Simple Storage Service (Amazon S3) bucket. For more details, see Working with DAGs on Amazon MWAA.
- Test your DAG compatibility. You can use the Amazon MWAA local runner to verify your DAGs, requirements, and plugins.
- Create a test environment for compatibility testing. For guidance, see Amazon MWAA best practices for managing Python dependencies.
Create a new environment
Complete the following steps to create a new environment:
- Generate a template for your new environment configuration using the AWS CLI:
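A sketch of one way to do this, assuming your current environment is named MyAirflowEnvironment (a placeholder):

```shell
# Export the existing environment's configuration to use as a template.
# The environment name is a placeholder; substitute your own.
aws mwaa get-environment \
    --name MyAirflowEnvironment \
    --query 'Environment' > MyAirflowEnvironment.json
```

The exported JSON includes read-only fields (such as Arn, Status, and CreatedAt) that create-environment doesn't accept, so remove them when you edit the file in the next step.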
- Modify the generated JSON file:
- Copy configurations from your backup file <env-name>.json to <new-env-name>.json.
- Update the environment name.
- Keep the AirflowVersion parameter value from your existing environment.
- Review and update other configuration parameters as needed.
- Create your new environment:
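For example, assuming your edited template is saved as MyNewAirflowEnvironment.json (a placeholder file name):

```shell
# Create the new environment from the edited configuration template.
# The file must contain only parameters that create-environment accepts.
aws mwaa create-environment --cli-input-json file://MyNewAirflowEnvironment.json
```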
Restore the new environment
Complete the following steps to restore the new environment:
- Use the mwaa-dr PyPI package to create and run the restore DAG. This process copies metadata from your S3 backup bucket to the new environment.
- Verify that your new environment contains the expected metadata from your original environment.
Perform the version upgrade
Complete the following steps to perform the version upgrade:
- Upgrade your environment:
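For example, using the same update-environment command as an in-place upgrade (the environment name and target version are placeholders):

```shell
# Upgrade the new environment to the target Airflow version.
# The environment name and version are placeholders.
aws mwaa update-environment \
    --name MyNewAirflowEnvironment \
    --airflow-version 2.10.3
```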
- Monitor the upgrade:
- Track the environment status on the console.
- Watch for error messages or warnings.
- Verify that the environment reaches the AVAILABLE state.
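In addition to the console, you can poll the environment status from the AWS CLI. A minimal sketch, assuming the new environment is named MyNewAirflowEnvironment:

```shell
# Poll the environment status every 60 seconds until the upgrade
# completes. The environment name is a placeholder.
while true; do
    STATUS=$(aws mwaa get-environment \
        --name MyNewAirflowEnvironment \
        --query 'Environment.Status' --output text)
    echo "$(date +%T) environment status: $STATUS"
    if [ "$STATUS" = "AVAILABLE" ]; then
        break
    fi
    sleep 60
done
```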
Plan your transition timing carefully. If your original environment continues to process workflows during the upgrade, the metadata in the two environments can diverge.
Clean up
After you verify the stability of your upgraded environment through monitoring, you can begin the cleanup process:
- Remove your original Amazon MWAA environment using the AWS CLI command:
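For example, assuming your original environment is named MyAirflowEnvironment (a placeholder):

```shell
# Delete the original environment after you confirm the upgraded
# environment is stable. The environment name is a placeholder.
aws mwaa delete-environment --name MyAirflowEnvironment
```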
- Clean up your associated resources by removing unused backup data from S3 buckets, deleting temporary AWS Identity and Access Management (IAM) roles and policies created for the upgrade, and updating your DNS or routing configurations.
Before removing any resources, make sure you follow your organization’s backup retention policies, maintain necessary backup data for your compliance requirements, and document configuration changes made during the upgrade.
This approach helps you perform a controlled upgrade with opportunities for testing and the ability to return to your original environment if needed.
Monitoring and validation
You can track your upgrade progress using Amazon CloudWatch metrics, with a focus on DAG processing metrics and scheduler heartbeat. Your environment transitions through several states during the upgrade process, including UPDATING and CREATING. When your environment shows the AVAILABLE state, you can begin validation testing. We recommend checking system accessibility, testing critical workflow operations, and verifying external connections. For detailed monitoring guidance, see Monitoring and metrics for Amazon Managed Workflows for Apache Airflow.
Key considerations
Consider using infrastructure as code (IaC) practices to help maintain consistent environment management and support repeatable deployments. Schedule metadata backups using mwaa-dr during periods of low activity to help protect your data. When designing your workflows, implement idempotent pipelines to help manage potential interruptions, and maintain documentation of your configurations and dependencies.
Conclusion
A successful Amazon MWAA upgrade starts with selecting an approach that aligns with your operational requirements. Whether you choose an in-place or cutover upgrade, thorough preparation and testing help support a controlled transition. Using available tools, monitoring capabilities, and recommended practices can help you upgrade to the latest Amazon MWAA features while working to maintain your workflow operations.
For additional details and code examples on Amazon MWAA, refer to the Amazon MWAA User Guide and Amazon MWAA examples GitHub repo.
Apache, Apache Airflow, and Airflow are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.
About the Authors
Anurag Srivastava works as a Senior Big Data Cloud Engineer at Amazon Web Services (AWS), specializing in Amazon MWAA. He’s passionate about helping customers build scalable data pipelines and workflow automation solutions on AWS.
Sriharsh Adari is a Senior Solutions Architect at Amazon Web Services (AWS), where he helps customers work backwards from business outcomes to develop innovative solutions on AWS. Over the years, he has helped multiple customers with data platform transformations across industry verticals. His core areas of expertise include Technology Strategy, Data Analytics, and Data Science. In his spare time, he enjoys playing sports, binge-watching TV shows, and playing Tabla.
Venu Thangalapally is a Senior Solutions Architect at AWS, based in Chicago, with deep expertise in cloud architecture, data and analytics, containers, and application modernization. He partners with Financial Services industry customers to translate business goals into secure, scalable, and compliant cloud solutions that deliver measurable value. Venu is passionate about leveraging technology to drive innovation and operational excellence. Outside of work, he enjoys spending time with his family, reading, and taking long walks.
Chandan Rupakheti is a Senior Solutions Architect at AWS. His main focus at AWS lies in the intersection of analytics, serverless, and AdTech services. He is a passionate technical leader, researcher, and mentor with a knack for building innovative solutions in the cloud. Outside of his professional life, he loves spending time with his family and friends, and listening to and playing music.