Simplifying Cloud Operator Workflows with Red Hat Enterprise Linux Mixed-Mode Upgrade
Upgrading a large-scale OpenStack environment can be a complex and time-consuming task for operators. To improve their experience and keep the upgrade process smooth, we have introduced a mixed-mode RHEL upgrade in Red Hat OpenStack Platform 17.1.
This solution addresses a common challenge faced by operators: the lack of time to complete upgrades within the planned maintenance window. By separating the OpenStack environment upgrade from the OS upgrade, operators can now efficiently upgrade OpenStack content during the scheduled maintenance window, while upgrading the OS at a later time.
In this blog, we dive into the details of Red Hat’s approach and highlight the benefits it brings to OpenStack operators.
Benefits for OpenStack Cloud Operators
Optimize time: Splitting the upgrade process into two distinct phases allows operators to maximize the use of their maintenance window. They can then prioritize the upgrade of the OpenStack environment and ensure a timely completion of critical updates, all while reserving additional time for the operating system to upgrade at a later stage.
Minimize downtime: Separating the OpenStack upgrade from the OS upgrade minimizes the impact on end users and services. By focusing solely on OpenStack content during the maintenance window, operators can reduce potential downtime associated with major system changes and provide a more seamless upgrade experience.
Reduce operational risk: A separate upgrade approach reduces the risk of encountering unforeseen compatibility issues between the OpenStack environment and the operating system. Operators can thoroughly test and validate each upgrade independently, minimizing the likelihood of compatibility conflicts.
Improve stability: Because operators can upgrade OpenStack content separately from the operating system, the environment stays stable and reliable during the upgrade process. The modular approach helps prevent disruptions to critical services, maintains the integrity of the infrastructure, and minimizes the risk of performance degradation.
Let’s look at the new upgrade process, with an example:
The example below includes the following components:
- Red Hat OpenStack Platform director node running version 16.2 and RHEL 8.4
- 3 Red Hat OpenStack Platform controller nodes running version 16.2 and RHEL 8.4
- 3 Red Hat OpenStack Platform compute nodes running version 16.2 and RHEL 8.4
- 3 Ceph storage nodes running Red Hat Ceph Storage 4 and RHEL 8
The first step is to upgrade all Red Hat OpenStack Platform content to 17.1. The Red Hat OpenStack Platform director is always the first node to be upgraded:
In the next step, Ceph storage nodes will be upgraded from Red Hat Ceph Storage 4 to Red Hat Ceph Storage 5:
Finally, Red Hat OpenStack Platform content will be upgraded to 17.1 across all Overcloud nodes: Red Hat OpenStack Platform controllers and Red Hat OpenStack Platform computes.
Because this step moves across three release versions of OpenStack, access to public-facing APIs should be blocked, as requests may fail during the transition:
After this step, the entire Red Hat OpenStack Platform cluster will be running version 17.1.
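The ordering of this first phase can be sketched in a few lines of Python. This is only an illustrative model, not the director tooling itself: the Node class, the node names, and the upgrade_content function are hypothetical, and the inventory simply mirrors the example environment listed earlier.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Node:
    name: str
    role: str     # "director", "controller", "compute" or "ceph"
    content: str  # RHOSP version, or Ceph Storage version for "ceph" nodes
    rhel: str


# Inventory mirroring the example environment (node names are made up).
INVENTORY = (
    [Node("director", "director", "16.2", "8.4")]
    + [Node(f"controller-{i}", "controller", "16.2", "8.4") for i in range(1, 4)]
    + [Node(f"compute-{i}", "compute", "16.2", "8.4") for i in range(1, 4)]
    + [Node(f"ceph-{i}", "ceph", "4", "8") for i in range(1, 4)]
)

# Phase-one order: director first, then Ceph storage, then the overcloud nodes.
PHASE_ONE = [("director",), ("ceph",), ("controller", "compute")]
TARGET = {"director": "17.1", "controller": "17.1", "compute": "17.1", "ceph": "5"}


def upgrade_content(nodes):
    """Return (ordered step log, upgraded nodes); RHEL versions are untouched."""
    log, done = [], {}
    for roles in PHASE_ONE:
        for node in nodes:
            if node.role in roles:
                log.append(f"{node.name}: {node.content} -> {TARGET[node.role]}")
                done[node.name] = replace(node, content=TARGET[node.role])
    return log, [done[n.name] for n in nodes]


log, upgraded = upgrade_content(INVENTORY)
print("\n".join(log))
```

The point of the model is that only the content field changes in this phase. Every node keeps its original RHEL version until the second phase begins.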
Now it's time to focus on upgrading all of our systems to RHEL 9.
As in the previous process, the Red Hat OpenStack Platform director is the first node to be upgraded, this time to RHEL 9.2. The undercloud upgrade process uses Leapp to upgrade the RHEL system from RHEL 8.4 to 9.2. The node must then be rebooted:
Red Hat OpenStack Platform controllers are the next to be upgraded, with the OpenStack overcloud process orchestrating the upgrade and reboot of each controller in serial:
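As a rough illustration of what "in serial" means here, the sketch below walks a list of controllers one at a time and tracks how many are offline at any moment. The rolling_upgrade function and the fake_upgrade hook are hypothetical stand-ins. The real orchestration is done by the overcloud upgrade process, not by code like this.

```python
def rolling_upgrade(nodes, upgrade_one):
    """Upgrade nodes strictly one at a time; return the peak number offline."""
    online = set(nodes)
    max_offline = 0
    for node in nodes:
        online.discard(node)  # node goes down for its upgrade and reboot
        max_offline = max(max_offline, len(nodes) - len(online))
        upgrade_one(node)     # hypothetical hook doing the per-node work
        online.add(node)      # node rejoins before the next one starts
    return max_offline


rhel = {"controller-1": "8.4", "controller-2": "8.4", "controller-3": "8.4"}
order = []


def fake_upgrade(node):
    # Stand-in for the real upgrade; it only records the new RHEL version.
    rhel[node] = "9.2"
    order.append(node)


peak = rolling_upgrade(list(rhel), fake_upgrade)
print(order, peak)
```

Because each controller rejoins before the next one is taken down, two of the three controllers remain available throughout the run.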
Upgrading RHEL in the Ceph storage nodes follows a similar process to Red Hat OpenStack Platform controller nodes:
Lastly, when upgrading Red Hat OpenStack Platform compute nodes, an operator may want to live migrate the virtual machines running the workloads to another Red Hat OpenStack Platform compute node. In the example below, VM1 running in Red Hat OpenStack Platform Compute 1 is live migrated to the Red Hat OpenStack Platform Compute 2 node to prevent a workload outage when the RHEL system is rebooted.
With no workload running on Red Hat OpenStack Platform Compute 1, it is now safe to run the OpenStack overcloud upgrade process as shown in the next diagram:
As the Red Hat OpenStack Platform Compute 1 upgrade is now complete, the workload can be live migrated back from Red Hat OpenStack Platform Compute 2 to the original Red Hat OpenStack Platform Compute 1:
In the example above, the Red Hat OpenStack Platform Compute 1 node was upgraded independently, but the OpenStack upgrade process also allows Red Hat OpenStack Platform compute nodes to be upgraded in batches.
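The drain, upgrade, and migrate-back sequence, together with the idea of batching, can be modeled as follows. This is a simplified sketch under stated assumptions: the placement dictionary, the upgrade_compute and batches functions, and the batch size of two are all hypothetical, and a real live migration involves far more than reassigning a dictionary entry.

```python
def batches(nodes, size):
    """Split the compute nodes into consecutive batches of at most `size`."""
    return [nodes[i:i + size] for i in range(0, len(nodes), size)]


def upgrade_compute(host, placement, spare, rhel):
    """Drain `host` onto `spare`, upgrade it, then bring its VMs back."""
    moved = [vm for vm, where in placement.items() if where == host]
    for vm in moved:
        placement[vm] = spare  # live migrate away before the reboot
    assert host not in placement.values(), "host must be empty before upgrading"
    rhel[host] = "9.2"         # stand-in for the RHEL upgrade and reboot
    for vm in moved:
        placement[vm] = host   # live migrate back once the host has returned
    return moved


placement = {"VM1": "compute-1"}
rhel = {"compute-1": "8.4", "compute-2": "8.4", "compute-3": "8.4"}

moved = upgrade_compute("compute-1", placement, "compute-2", rhel)
print(moved, placement, rhel)
print(batches(list(rhel), 2))
```

In this model VM1 ends up back on compute-1, which now reports RHEL 9.2, while the other two compute nodes are still on RHEL 8.4 and could be handled in later batches.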
The multi-version support for RHEL introduced in Red Hat OpenStack Platform 17.1 allows us to run Red Hat OpenStack Platform 17.1 compute nodes on RHEL 8.4 and RHEL 9.2 simultaneously, as shown in the diagram below, where the Red Hat OpenStack Platform Compute node 2 RHEL system upgrade has been deferred to a later stage:
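To make the mixed-mode state concrete, here is a small sketch that groups compute nodes by RHEL version and flags which ones still have an OS upgrade pending. The mixed_mode_report function and the node data are hypothetical. In particular, the state of compute-3 is an assumption, since the post only describes Compute 1 and Compute 2.

```python
from collections import defaultdict

SUPPORTED_RHEL = {"8.4", "9.2"}  # the two RHEL versions named in this post


def mixed_mode_report(computes):
    """Group RHOSP 17.1 compute nodes by RHEL version; reject anything else."""
    groups = defaultdict(list)
    for name, (osp, rhel) in computes.items():
        if osp != "17.1" or rhel not in SUPPORTED_RHEL:
            raise ValueError(f"{name}: unsupported combination {osp}/{rhel}")
        groups[rhel].append(name)
    return dict(groups)


computes = {
    "compute-1": ("17.1", "9.2"),
    "compute-2": ("17.1", "8.4"),  # OS upgrade deferred to a later stage
    "compute-3": ("17.1", "9.2"),  # assumed state, not taken from the post
}

report = mixed_mode_report(computes)
pending = report.get("8.4", [])
print(report, pending)
```

A report like this gives an operator a quick list of the nodes that still need a later maintenance window for the RHEL upgrade.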
Decoupling the Red Hat OpenStack Platform environment upgrade from the operating system upgrade addresses the time constraints operators often face during maintenance windows. This novel approach allows operators to complete OpenStack content upgrades within the scheduled window, while deferring the OS upgrade to a more convenient time.
With optimized time management, reduced downtime, and improved operational efficiency, our approach enables operators to increase the stability and reliability of their OpenStack environments. Upgrade your Red Hat OpenStack Platform infrastructure with confidence and unlock the full potential of your cloud with this breakthrough solution.