Red Hat Advanced Cluster Management for Kubernetes (RHACM) defines two main types of clusters: hub clusters and managed clusters.
The hub cluster is the main cluster with RHACM installed on it. You can create, manage, and monitor other Kubernetes clusters with the hub cluster. The managed clusters are Kubernetes clusters that are managed by the hub cluster. You can create some clusters by using the RHACM hub cluster, and you can also import existing clusters to be managed by the hub cluster. Since the hub cluster manages the cluster fleet, it is vital that there is a business continuity scenario built in so that when an unexpected event causes a hub cluster to fail, the cluster fleet can be managed by a new hub cluster.
The RHACM backup and restore feature, available starting with version 2.5, supports building a disaster recovery solution to recover the hub cluster when it fails. The feature has a shortcoming, though: only managed clusters created using the Hive API are automatically connected to the restored hub cluster. Imported managed clusters must be manually reconnected on the new hub cluster.
RHACM 2.7 provides a solution to automatically import managed clusters when restoring on a new hub cluster.
This blog walks through how to enable and use the solution available with RHACM 2.7 to automatically import managed clusters during a hub cluster restore operation. Before showing how to use the auto import feature, let's see why this approach is needed in the first place.
Why imported clusters must be manually reimported after restore
When the backup data is moved to another hub cluster, only Hive managed clusters are automatically connected with the new hub cluster. Hive clusters are managed clusters created on the hub cluster using the Create cluster action available from the Clusters tab in the console.
Managed clusters connected with the initial hub cluster by using the Import cluster action appear as Pending Import when the hub cluster data is restored on a new hub cluster, and the clusters must be manually imported back on the new hub cluster.
Hive managed clusters are automatically connected with the new hub cluster because Hive stores the managed cluster kubeconfig in the managed cluster namespace on the hub cluster, and this kubeconfig is backed up and restored on the new hub cluster. The import controller updates the bootstrap kubeconfig on the managed cluster using this restored configuration. This information is only available for managed clusters created by using the Hive API and is not available for imported clusters.
The workaround provided with RHACM 2.5 and RHACM 2.6 for reconnecting imported clusters with the new hub cluster is to manually create the auto-import-secret after the restore operation is started. The auto-import-secret must be created on the restore hub cluster in the managed cluster namespace, for each cluster in Pending Import state. This auto-import-secret must use a kubeconfig or token with enough permissions for the import component to start the auto import on the new hub cluster.
For a large number of imported managed clusters, this is a very tedious operation because it is run manually for each managed cluster. It increases the Recovery Time Objective (RTO) and requires the user who runs the restore operation to have access to each managed cluster and to obtain a token that can be used to connect with it. This token must have a klusterlet role binding or a role with equivalent permissions.
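For reference, the manual workaround amounts to creating one secret like the following for each cluster in Pending Import state. This is a minimal sketch that assumes the token form of the auto-import-secret; the namespace, token, and server values are placeholders, and the retry count is illustrative rather than taken from this blog.

```yaml
# Sketch of a manually created auto-import-secret (RHACM 2.5 / 2.6 workaround).
# Every value in angle brackets is a placeholder to replace.
apiVersion: v1
kind: Secret
metadata:
  name: auto-import-secret
  # The managed cluster namespace on the restore hub cluster.
  namespace: <managed-cluster-name>
stringData:
  # Number of import attempts; illustrative value.
  autoImportRetry: "5"
  # Token with a klusterlet role binding or equivalent permissions.
  token: <managed-cluster-token>
  # API server URL of the managed cluster.
  server: <managed-cluster-api-url>
type: Opaque
```

One such secret is needed per imported cluster, which is why the manual approach scales poorly for large fleets.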
Automatically reconnecting managed clusters with RHACM 2.7
The following sections describe the new solution for automatically connecting imported clusters to the new hub cluster by using the ManagedServiceAccount feature, available with the backup and restore component in RHACM 2.7. They show how to enable this feature and explain possible limitations.
How the automatic connection works
The backup controller available with RHACM 2.7 uses the ManagedServiceAccount component on the primary hub cluster to create a token for each of the imported managed clusters.
This token is backed up in each managed cluster namespace and is set to use a ClusterRole binding, which allows the token to be used when importing the managed cluster with the auto import secret. The ClusterRole can only get or update the bootstrap-hub-kubeconfig secret, so there is limited access to the managed cluster.
When the activation data is restored on the new hub cluster, the restore controller runs a post restore operation and looks for all managed clusters in the Pending Import state. For these managed clusters, it checks if there is a valid token generated by the ManagedServiceAccount and, if found, creates an auto-import-secret by using this token. As a result, the cluster import component tries to reconnect the managed cluster, and if the cluster is accessible, the operation is successful.
Automatic import value
When the hub cluster backup data is restored on a new hub cluster, all managed clusters are automatically connected with the new hub cluster.
See the following prerequisites to follow along in this blog.
For both active and passive hub clusters:
- RHACM version 2.7 or later must be installed on your hub cluster.
- The ManagedServiceAccount component must be enabled on MultiClusterEngine by editing the MultiClusterEngine resource and setting enabled: true for the managedserviceaccount-preview component. See the following example:
    - enabled: true
      name: managedserviceaccount-preview
- The cluster-backup Operator must be enabled on the hub cluster. Edit the MultiClusterHub resource and set enabled: true for the cluster-backup component. This also installs the OADP operator in the open-cluster-management-backup namespace. See the following example:
    - enabled: true
      name: cluster-backup
- You must create the DataProtectionApplication resource in the open-cluster-management-backup namespace and point to a valid storage location for backups.
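The prerequisites above can be sketched as the following manifests. This is only an outline under stated assumptions: the resource names multiclusterengine, multiclusterhub, and dpa-hub, the open-cluster-management install namespace, and the S3-compatible storage settings are assumptions, not values from this blog.

```yaml
# Enable the ManagedServiceAccount component on the MultiClusterEngine.
apiVersion: multicluster.openshift.io/v1
kind: MultiClusterEngine
metadata:
  name: multiclusterengine            # assumed resource name
spec:
  overrides:
    components:
    - enabled: true
      name: managedserviceaccount-preview
---
# Enable the cluster-backup operator on the MultiClusterHub.
apiVersion: operator.open-cluster-management.io/v1
kind: MultiClusterHub
metadata:
  name: multiclusterhub               # assumed resource name
  namespace: open-cluster-management  # assumed install namespace
spec:
  overrides:
    components:
    - enabled: true
      name: cluster-backup
---
# Point OADP at a backup storage location (S3-compatible storage assumed).
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: dpa-hub                       # hypothetical name
  namespace: open-cluster-management-backup
spec:
  configuration:
    velero:
      defaultPlugins:
      - openshift
      - aws
  backupLocations:
  - velero:
      provider: aws
      default: true
      objectStorage:
        bucket: <bucket-name>         # placeholder
        prefix: <prefix>              # placeholder
      config:
        region: <region>              # placeholder
      credential:
        name: cloud-credentials       # secret holding the storage credentials
        key: cloud
```

Because the MultiClusterEngine and MultiClusterHub resources already exist on a running hub cluster and may list other component overrides, it is safer to edit them in place and add only the relevant component entry than to apply these manifests wholesale.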
Enabling the automatic import feature on active hub cluster
To enable the automatic import feature, set the useManagedServiceAccount property to true when creating the BackupSchedule.cluster.open-cluster-management.io resource on the active hub cluster. See the following example:
    veleroSchedule: 0 */1 * * *
    useManagedServiceAccount: true
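A complete BackupSchedule resource with the feature enabled might look like the following sketch. The resource name and the veleroTtl value are hypothetical; only the veleroSchedule value and the useManagedServiceAccount property come from this blog.

```yaml
apiVersion: cluster.open-cluster-management.io/v1beta1
kind: BackupSchedule
metadata:
  name: schedule-acm-msa              # hypothetical name
  namespace: open-cluster-management-backup
spec:
  # Cron expression: run a backup at minute 0 of every hour.
  veleroSchedule: 0 */1 * * *
  # How long each backup is kept; illustrative value.
  veleroTtl: 120h
  # Turns on token generation for imported managed clusters.
  useManagedServiceAccount: true
```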
When useManagedServiceAccount is set to true, the backup controller starts processing imported managed clusters and, for each of them:
- Creates a ManagedServiceAccount named auto-import-account and sets the token validity as defined by the BackupSchedule resource.
- The ManagedServiceAccount resource is processed by the ManagedServiceAddon, which triggers the creation of a token with the same name on the managed cluster. This token is pushed back to the hub cluster under the managed cluster namespace.
  [Screen capture: Managed Service Account token on managed cluster]
  Note that the token is created only if the managed cluster is accessible. If the managed cluster is not accessible at the time the ManagedServiceAccount is created, the token is created later, when the managed cluster becomes available. This hub cluster secret gets backed up.
  [Screen capture: Managed Service Account token on hub cluster]
- For each of the ManagedServiceAccount resources, the backup controller creates a ManifestWork used to set up, on the managed cluster, a ClusterRole binding for the token. The ClusterRole can only get or update the bootstrap-hub-kubeconfig secret. This role is used in a post restore operation to auto import the managed cluster on the restored hub cluster.
  [Screen capture: Managed Service Account role binding on managed cluster]
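To make the steps above more concrete, the resources that the backup controller generates could look roughly like the following. This is an illustration only, not output captured from a cluster: the ManagedServiceAccount apiVersion, the validity value, and the ClusterRole name are assumptions, and you do not create these resources yourself.

```yaml
# Created by the backup controller in each imported managed cluster namespace
# on the hub cluster.
apiVersion: authentication.open-cluster-management.io/v1alpha1  # assumed API version
kind: ManagedServiceAccount
metadata:
  name: auto-import-account
  namespace: <managed-cluster-name>   # placeholder
spec:
  rotation:
    enabled: true
    # Token validity; 240h assumes a veleroTtl of 120h (twice the TTL).
    validity: 240h
---
# Delivered to the managed cluster through a ManifestWork.
# The rules mirror the restriction described above: get and update on one secret.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: auto-import-bootstrap-kubeconfig   # hypothetical name
rules:
- apiGroups: [""]
  resources: ["secrets"]
  resourceNames: ["bootstrap-hub-kubeconfig"]
  verbs: ["get", "update"]
```

Restricting the role to a single named secret is what keeps the backed-up token from granting broader access to the managed cluster.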
You can disable the automatic import cluster feature at any time by setting the useManagedServiceAccount property to false in the BackupSchedule resource. Removing the property has the same result since the default value is set to false. When you disable the automatic import cluster feature, the backup controller removes the resources it created, including the ManagedServiceAccount and the ManifestWork, which in turn deletes the auto import token on the hub cluster and the managed cluster. For example:
    veleroSchedule: 0 */1 * * *
    useManagedServiceAccount: false
The auto-import-account token validity duration is automatically set to twice the value of veleroTtl, to maximize the chance that the token stays valid for the entire lifecycle of every backup that stores it. You can change this value if you want to control how long a token is valid, but keep in mind that this could produce backups with tokens that expire during the lifecycle of the backup. Use the managedServiceAccountTTL property to change the token TTL:
    veleroSchedule: 0 */2 * * *
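As a sketch of the override, the following BackupSchedule sets an explicit token TTL. The resource name and both duration values are hypothetical, chosen only to show the relationship: with a veleroTtl of 120h the default token validity would be 240h, and managedServiceAccountTTL replaces that default.

```yaml
apiVersion: cluster.open-cluster-management.io/v1beta1
kind: BackupSchedule
metadata:
  name: schedule-acm-msa              # hypothetical name
  namespace: open-cluster-management-backup
spec:
  # Cron expression: run a backup at minute 0 of every second hour.
  veleroSchedule: 0 */2 * * *
  # Backups are kept for 120h; illustrative value.
  veleroTtl: 120h
  useManagedServiceAccount: true
  # Overrides the default token validity of twice veleroTtl; illustrative value.
  managedServiceAccountTTL: 300h
```

A value shorter than veleroTtl means that older backups can outlive the token they contain, which is the situation the warning above describes.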
Automatically reconnect imported clusters on restore hub cluster
The backup data is restored on the new hub cluster by using a Restore resource.
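A Restore resource for this step might look like the following sketch. The resource name is hypothetical, and restoring the latest backup of every type is an assumption about the scenario, not a requirement stated in this blog.

```yaml
apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Restore
metadata:
  name: restore-acm                   # hypothetical name
  namespace: open-cluster-management-backup
spec:
  # Do not clean up existing hub resources before restoring.
  cleanupBeforeRestore: None
  # Restoring the managed clusters backup brings in the activation data,
  # which is what triggers the post restore auto import step.
  veleroManagedClustersBackupName: latest
  veleroCredentialsBackupName: latest
  veleroResourcesBackupName: latest
```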
When the managed cluster backup data is restored on the new hub cluster, the restore controller runs a post restore operation and looks for all managed clusters in Pending import state.
For these managed clusters, it checks whether there is a valid auto-import-account token under the managed cluster namespace on the new hub cluster. If such a token is found, the post restore routine creates an auto-import-secret using this token.
As a result, the cluster import component tries to reconnect the managed cluster and if the cluster is accessible the operation is successful.
You should see the following status message for the Restore resource if the post restore operation has created an auto-import-secret, triggering the auto import operation for a managed cluster in Pending Import state:
lastMessage: Velero restores have run to completion
- Created auto-import-secret for managed cluster (vb-managed-cls-1)
Limitations with the automatic import feature
The approach described above has a set of limitations that can prevent a managed cluster from being automatically imported when moving to a new hub cluster:
Since the automatic import operation uses the cluster import feature with the auto import secret, the hub cluster must be able to access the managed cluster and run the cluster import operation.
Since the auto-import-secret created on restore uses the ManagedServiceAccount token to connect to the managed cluster, the managed cluster must also provide the kube apiserver information. The apiserver must be set on the ManagedCluster resource. Only OCP clusters have this apiserver set up automatically when the cluster is imported on the hub cluster. For any other type of managed cluster, such as EKS clusters, this information must be set manually by the user; otherwise the automatic import feature ignores these clusters and they stay in Pending Import when moved to the restore hub cluster.
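For a cluster where the apiserver is not set automatically, the ManagedCluster resource could be updated along the lines of this sketch. The cluster name and URL are placeholders, and the hubAcceptsClient and leaseDurationSeconds values are typical settings shown only for context.

```yaml
apiVersion: cluster.open-cluster-management.io/v1
kind: ManagedCluster
metadata:
  name: <managed-cluster-name>        # placeholder
spec:
  hubAcceptsClient: true
  leaseDurationSeconds: 60
  # API server URL of the managed cluster; set manually for clusters
  # such as EKS where it is not populated on import.
  managedClusterClientConfigs:
  - url: <managed-cluster-api-url>    # placeholder
```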
The backup controller regularly looks for imported managed clusters and creates the ManagedServiceAccount resource under the managed cluster namespace as soon as such a managed cluster is found. This should trigger a token creation on the managed cluster. However, if the managed cluster is not accessible at the time this operation is executed, for example because the managed cluster is hibernating or is down, the ManagedServiceAccount is unable to create the token. As a result, if a hub backup is run at this time, the backup will not contain a token to auto import the managed cluster.
It is possible for a ManagedServiceAccount secret to not be included in a backup if the backup schedule runs before the backup label is set on the ManagedServiceAccount secret. ManagedServiceAccount secrets don't have the cluster.open-cluster-management.io/backup label set on creation. For this reason, the backup controller regularly looks for ManagedServiceAccount secrets under the managed cluster namespaces and adds the backup label if it is not found.
If the auto-import-account secret token is valid and is backed up, but the restore operation is run at a time when the token available with the backup has already expired, the auto import operation fails. In this case, the restore.cluster.open-cluster-management.io resource status should report the invalid token issue for each managed cluster in this situation.
This blog describes how to use the cluster backup and restore operator available with RHACM 2.7 to automatically reconnect imported managed clusters to the new hub after a restore operation. It shows how to enable the automatic connect feature and how it works.