Purpose

The purpose of this procedure is to show how to replace a failed control plane node in a bare metal OpenShift cluster (3+0 or 3+N) in a simple way. This methodology is based on IPI and allows you to replace a failed control plane node quickly.

Prerequisites

  • Existing bare metal cluster installed with IPI, ABI, or the Assisted Installer, running OCP >= 4.12.1.
  • Be sure that all the required DNS records exist (a quick check is shown after this list).
  • You have access to the cluster as a user with the cluster-admin role.
  • You have taken an etcd backup (a sample backup command is shown after this list).
  • Bare metal Operator is available ($ oc get clusteroperator baremetal).
  • Server boot mode set to UEFI and Redfish virtual media is supported.
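
As a convenience, the DNS records can be checked and the etcd backup taken as follows (a sketch; replace example.cluster.domain with your cluster domain, and master-0 with any healthy control plane node):

$ dig +short api.example.cluster.domain
$ dig +short api-int.example.cluster.domain
$ dig +short test.apps.example.cluster.domain
$ oc debug node/master-0 -- chroot /host /usr/local/bin/cluster-backup.sh /home/core/assets/backup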

Replacing a Master node

Here we will replace master-2 with master-x. To simulate the node failure, we will shut down master-2.

Pre-check validation

$ oc get nodes
NAME STATUS ROLES AGE VERSION
master-0 Ready control-plane,master,worker 31m v1.25.8+27e744f
master-1 Ready control-plane,master,worker 58m v1.25.8+27e744f
master-2 NotReady control-plane,master,worker 58m v1.25.8+27e744f


Control Node replacement

Details on how to remove an unhealthy etcd member can be found in the official documentation: Remove-Unhealth-ETCD

Remove Unhealthy ETCD Master-2 Member

  • Checking ETCD Member Status
$ oc -n openshift-etcd rsh etcd-master-0 etcdctl member list -w table
+------------------+---------+----------+----------------------------+----------------------------+------------+
|        ID        | STATUS  |   NAME   |         PEER ADDRS         |        CLIENT ADDRS        | IS LEARNER |
+------------------+---------+----------+----------------------------+----------------------------+------------+
|  c300d358075445b | started | master-0 | https://192.168.24.87:2380 | https://192.168.24.87:2379 |      false |
| 1a7b6f4c3aac9be1 | started | master-1 | https://192.168.24.88:2380 | https://192.168.24.88:2379 |      false |
| 6fd2f8909c811461 | started | master-2 | https://192.168.24.86:2380 | https://192.168.24.86:2379 |      false |
+------------------+---------+----------+----------------------------+----------------------------+------------+
$ oc -n openshift-etcd rsh etcd-master-0 etcdctl endpoint health
{"level":"warn","ts":"2023-04-24T17:11:59.984Z","logger":"client","caller":"v3@v3.5.6/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc00028c000/192.168.24.86:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
https://192.168.24.88:2379 is healthy: successfully committed proposal: took = 7.12757ms
https://192.168.24.87:2379 is healthy: successfully committed proposal: took = 7.216856ms
https://192.168.24.86:2379 is unhealthy: failed to commit proposal: context deadline exceeded
Error: unhealthy cluster
command terminated with exit code 1

Note: Take note of the master-2 member ID for the next steps.
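
If you prefer to capture the member ID programmatically instead of copying it from the table, something like this should work (a sketch that assumes the default comma-separated output of etcdctl member list):

$ oc -n openshift-etcd rsh etcd-master-0 etcdctl member list | awk -F', ' '$3 == "master-2" {print $1}'
6fd2f8909c811461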

  • Remove the master-2 ETCD member using its member ID
$ oc -n openshift-etcd rsh etcd-master-0 
sh-4.4# etcdctl member list
c300d358075445b, started, master-0, https://192.168.24.87:2380, https://192.168.24.87:2379, false
1a7b6f4c3aac9be1, started, master-1, https://192.168.24.88:2380, https://192.168.24.88:2379, false
6fd2f8909c811461, started, master-2, https://192.168.24.86:2380, https://192.168.24.86:2379, false
sh-4.4# etcdctl member remove 6fd2f8909c811461
Member 6fd2f8909c811461 removed from cluster c413f45f7dfe9590
  • Check ETCD Member Status Again
    Note: Make sure the master-2 ETCD member is no longer shown in the member list
sh-4.4# etcdctl member list -w table
+------------------+---------+----------+----------------------------+----------------------------+------------+
|        ID        | STATUS  |   NAME   |         PEER ADDRS         |        CLIENT ADDRS        | IS LEARNER |
+------------------+---------+----------+----------------------------+----------------------------+------------+
|  c300d358075445b | started | master-0 | https://192.168.24.87:2380 | https://192.168.24.87:2379 |      false |
| 1a7b6f4c3aac9be1 | started | master-1 | https://192.168.24.88:2380 | https://192.168.24.88:2379 |      false |
+------------------+---------+----------+----------------------------+----------------------------+------------+

List The Old Secrets for Unhealthy Master-2

$ oc get secret -n openshift-etcd | grep master-2
etcd-peer-master-2 kubernetes.io/tls 2 56m
etcd-serving-master-2 kubernetes.io/tls 2 56m
etcd-serving-metrics-master-2 kubernetes.io/tls 2 56m
  • Remove the old secrets for the unhealthy etcd member that was removed
$ oc get secrets -n openshift-etcd | grep master-2 | awk '{print $1}' | xargs oc -n openshift-etcd delete secrets
secret "etcd-peer-master-2" deleted
secret "etcd-serving-master-2" deleted
secret "etcd-serving-metrics-master-2" deleted

Check ETCD Status

$ oc get pods -n openshift-etcd | grep -v etcd-quorum-guard | grep etcd
etcd-master-0 5/5 Running 0 54m
etcd-master-2 2/5 NotReady 0 50m
etcd-master-1 5/5 Running 0 52m

Force the etcd redeployment

$ oc patch etcd cluster -p='{"spec": {"forceRedeploymentReason": "single-master-recovery-'"$( date --rfc-3339=ns )"'"}}' --type=merge
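
To verify that the forced redeployment rolled out across all members, this jsonpath query (following the pattern used in the official etcd recovery docs) should eventually report AllNodesAtLatestRevision:

$ oc get etcd -o=jsonpath='{range .items[0].status.conditions[?(@.type=="NodeInstallerProgressing")]}{.reason}{"\n"}{.message}{"\n"}{end}'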

Delete Machine and BMH of the failed Master

$ oc delete machine abi-4c7mt-master-2 -n openshift-machine-api
$ oc delete bmh master-2 -n openshift-machine-api

Prepare to Delete master-2 Node

  • Check pod status on the master-2 node
$ oc get po -A -o wide | grep master-2

There may still be pods allocated to master-2; follow the next steps to clean them up.

  • Delete Master-2 Node
$ oc delete node master-2

Note: Check the pod status again to make sure no more pods are allocated to or running on master-2.

$ oc get po -o wide -A|grep master-2|wc -l
0
$ oc get nodes
NAME STATUS ROLES AGE VERSION
master-0 Ready control-plane,master,worker 47m v1.25.8+27e744f
master-1 Ready control-plane,master,worker 75m v1.25.8+27e744f


Now we are ready to add the new control plane node.

Preparing the bare metal node

To replace the failed master node, you can use either a static or a DHCP-based IP configuration. When replacing a master node using a DHCP server, the node must have a DHCP reservation, as sketched below.
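
For the DHCP case, the reservation can look like this ISC dhcpd host entry (illustrative only; the MAC address and IP shown are the example values used elsewhere in this procedure):

host master-x {
  hardware ethernet b8:ce:f6:56:a9:ea;  # boot MAC of the replacement node
  fixed-address 192.168.24.91;          # IP the new master is expected to use
}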

1- Make sure the new master node is powered off.

2- Validate that your oc client version matches the cluster version.

3- Retrieve the user name and password of the bare metal node’s baseboard management controller. Then, create base64 strings from the user name and password:

  • echo -ne "root" | base64
  • echo -ne "password" | base64
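
For example, encoding the user name root yields the following (the username and password values in the Secret below are redacted placeholders, not the encodings of these examples):

$ echo -ne "root" | base64
cm9vdA==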

Create a configuration file for the bare metal node

$ cat <<EOF | oc apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: control-plane-3-bmc-secret
  namespace: openshift-machine-api
data:
  username: cm9fdd=
  password: Y2Fsd==
type: Opaque
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: master-x
  namespace: openshift-machine-api
spec:
  automatedCleaningMode: disabled
  bmc:
    address: idrac-virtualmedia://192.168.24.159/redfish/v1/Systems/System.Embedded.1 # this is for Dell servers; for HPE or other vendors, check the virtual media path
    credentialsName: control-plane-3-bmc-secret
    disableCertificateVerification: True
  bootMACAddress: b8:ce:f6:56:a9:ea
  bootMode: UEFI
  externallyProvisioned: false
  hardwareProfile: unknown
  online: true
  rootDeviceHints:
    deviceName: /dev/sdb
EOF
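
The bmc.address format is vendor-specific. For reference, the Redfish virtual media address formats documented for common vendors include the following (confirm the exact path against your hardware):

  • Dell iDRAC: idrac-virtualmedia://<bmc-ip>/redfish/v1/Systems/System.Embedded.1
  • HPE iLO 5: redfish-virtualmedia://<bmc-ip>/redfish/v1/Systems/1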

Once the BareMetalHost is created, you need to create the Machine object for the new master node.

$ cat <<EOF | oc apply -f -
apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
  annotations:
    metal3.io/BareMetalHost: openshift-machine-api/master-x
  labels:
    machine.openshift.io/cluster-api-cluster: abi-4c7mt
    machine.openshift.io/cluster-api-machine-role: master
    machine.openshift.io/cluster-api-machine-type: master
  name: abi-4c7mt-master-x
  namespace: openshift-machine-api
spec:
  metadata: {}
  providerSpec:
    value:
      apiVersion: baremetal.cluster.k8s.io/v1alpha1
      customDeploy:
        method: install_coreos
      hostSelector: {}
      image:
        checksum: ""
        url: ""
      kind: BareMetalMachineProviderSpec
      metadata:
        creationTimestamp: null
      userData:
        name: master-user-data-managed
EOF

The BMH object gets created and will transition through the following states:

  • Inspecting
  • Available
  • Provisioning
  • Provisioned
$ oc get bmh 
NAME STATE CONSUMER ONLINE ERROR AGE
master-0 unmanaged abi-4c7mt-master-0 true 93m
master-1 unmanaged abi-4c7mt-master-1 true 93m
master-x inspecting abi-4c7mt-master-x true 2m27s


Keep monitoring the BMH until the status changes to “available”. In the meantime, the server will be booted using virtual media to install RHCOS.
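
To follow the state transitions without re-running the command, you can also watch the object:

$ oc get bmh master-x -w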

$ oc get bmh master-x
NAME STATE CONSUMER ONLINE ERROR AGE
master-x inspecting abi-4c7mt-master-x true 3m13s

$ oc get machine
NAME PHASE TYPE REGION ZONE AGE
abi-4c7mt-master-0 Running 99m
abi-4c7mt-master-1 Running 99m
abi-4c7mt-master-x Provisioning 8m22s

$ oc get bmh master-x
NAME STATE CONSUMER ONLINE ERROR AGE
master-x provisioning abi-4c7mt-master-x true 11m

The node will be rebooted. Keep monitoring the BMH until the status changes to “provisioned”.

$ oc get bmh master-x
NAME STATE CONSUMER ONLINE ERROR AGE
master-x provisioned abi-4c7mt-master-x true 18m

$ oc get machine
NAME PHASE TYPE REGION ZONE AGE
abi-4c7mt-master-0 Running 110m
abi-4c7mt-master-1 Running 110m
abi-4c7mt-master-x Provisioned 18m

After two reboots, the new master node (master-x here) should join the cluster automatically (no CSRs need to be approved).
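
Should the node not join on its own, it is worth checking for pending CSRs (normally none are expected with this flow; <csr_name> is a placeholder):

$ oc get csr | grep -i pending
$ oc adm certificate approve <csr_name>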

$ oc get bmh master-x
NAME STATE CONSUMER ONLINE ERROR AGE
master-x provisioned abi-4c7mt-master-x true 32m
$ oc get machine
NAME PHASE TYPE REGION ZONE AGE
abi-4c7mt-master-0 Running 123m
abi-4c7mt-master-1 Running 123m
abi-4c7mt-master-x Running 32m

Validation

Validate that all nodes are ready and that the cluster is stable.

$ oc get nodes
NAME STATUS ROLES AGE VERSION
master-0 Ready control-plane,master,worker 87m v1.25.8+27e744f
master-1 Ready control-plane,master,worker 114m v1.25.8+27e744f
master-x Ready control-plane,master,worker 2m19s v1.25.8+27e744f


$ oc get machine
NAME PHASE TYPE REGION ZONE AGE
abi-4c7mt-master-0 Running 124m
abi-4c7mt-master-1 Running 124m
abi-4c7mt-master-x Running 33m

$ oc get bmh
NAME STATE CONSUMER ONLINE ERROR AGE
master-0 unmanaged abi-4c7mt-master-0 true 125m
master-1 unmanaged abi-4c7mt-master-1 true 125m
master-x provisioned abi-4c7mt-master-x true 33m



$ oc get co
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
authentication 4.12.14 True False False 94m
baremetal 4.12.14 True False False 117m
cloud-controller-manager 4.12.14 True False False 126m
cloud-credential 4.12.14 True False False 137m
cluster-autoscaler 4.12.14 True False False 117m
config-operator 4.12.14 True False False 118m
console 4.12.14 True False False 97m
control-plane-machine-set 4.12.14 True False False 116m
csi-snapshot-controller 4.12.14 True False False 117m
dns 4.12.14 True False False 115m
etcd 4.12.14 True False False 116m
image-registry 4.12.14 True False False 106m
ingress 4.12.14 True False False 114m
insights 4.12.14 True False False 103m
kube-apiserver 4.12.14 True False False 97m
kube-controller-manager 4.12.14 True False False 115m
kube-scheduler 4.12.14 True False False 114m
kube-storage-version-migrator 4.12.14 True False False 117m
machine-api 4.12.14 True False False 114m
machine-approver 4.12.14 True False False 117m
machine-config 4.12.14 True False False 53m
marketplace 4.12.14 True False False 117m
monitoring 4.12.14 True False False 106m
network 4.12.14 True False False 117m
node-tuning 4.12.14 True False False 116m
openshift-apiserver 4.12.14 True False False 112m
openshift-controller-manager 4.12.14 True False False 113m
openshift-samples 4.12.14 True False False 109m
operator-lifecycle-manager 4.12.14 True False False 116m
operator-lifecycle-manager-catalog 4.12.14 True False False 116m
operator-lifecycle-manager-packageserver 4.12.14 True False False 112m
service-ca 4.12.14 True False False 118m
storage 4.12.14 True False False 118m
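
A quick way to surface any operator that is not in the expected Available=True / Progressing=False / Degraded=False state (a convenience sketch; it assumes the default column order of oc get co):

$ oc get co --no-headers | awk '$3 != "True" || $4 != "False" || $5 != "False"'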

$ oc get clusterversions.config.openshift.io
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.12.14 True False 92m Cluster version is 4.12.14
  • Checking ETCD Member Status
$ oc get pods -n openshift-etcd
NAME READY STATUS RESTARTS AGE
etcd-guard-master-0 1/1 Running 0 95m
etcd-guard-master-1 1/1 Running 0 114m
etcd-guard-master-x 1/1 Running 0 10m
etcd-master-0 4/4 Running 0 7m15s
etcd-master-1 4/4 Running 0 9m7s
etcd-master-x 4/4 Running 0 5m11s
installer-10-master-x 0/1 Completed 0 13m
installer-11-master-0 0/1 Completed 0 8m22s
installer-11-master-1 0/1 Completed 0 10m
installer-11-master-x 0/1 Completed 0 6m23s
installer-7-master-0 0/1 Completed 0 98m
installer-9-master-0 0/1 Completed 0 91m
installer-9-master-1 0/1 Completed 0 92m
revision-pruner-10-master-0 0/1 Completed 0 13m
revision-pruner-10-master-1 0/1 Completed 0 13m
revision-pruner-10-master-x 0/1 Completed 0 13m
revision-pruner-11-master-0 0/1 Completed 0 10m
revision-pruner-11-master-1 0/1 Completed 0 10m
revision-pruner-11-master-x 0/1 Completed 0 10m
revision-pruner-7-master-0 0/1 Completed 0 98m
revision-pruner-7-master-1 0/1 Completed 0 98m
revision-pruner-8-master-0 0/1 Completed 0 95m
revision-pruner-8-master-1 0/1 Completed 0 95m
revision-pruner-9-master-0 0/1 Completed 0 94m
revision-pruner-9-master-1 0/1 Completed 0 94m
revision-pruner-9-master-x 0/1 Completed 0 14m


$ oc -n openshift-etcd rsh etcd-master-0 etcdctl member list -w table
+------------------+---------+----------+----------------------------+----------------------------+------------+
|        ID        | STATUS  |   NAME   |         PEER ADDRS         |        CLIENT ADDRS        | IS LEARNER |
+------------------+---------+----------+----------------------------+----------------------------+------------+
|  c300d358075445b | started | master-0 | https://192.168.24.87:2380 | https://192.168.24.87:2379 |      false |
| 1a7b6f4c3aac9be1 | started | master-1 | https://192.168.24.88:2380 | https://192.168.24.88:2379 |      false |
| b24ac36103e976e3 | started | master-x | https://192.168.24.91:2380 | https://192.168.24.91:2379 |      false |
+------------------+---------+----------+----------------------------+----------------------------+------------+
$ oc -n openshift-etcd rsh etcd-master-0 etcdctl endpoint health
https://192.168.24.87:2379 is healthy: successfully committed proposal: took = 8.642798ms
https://192.168.24.91:2379 is healthy: successfully committed proposal: took = 8.640762ms
https://192.168.24.88:2379 is healthy: successfully committed proposal: took = 8.938068ms
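
As a final check, the endpoint status subcommand shows the current leader and the per-member DB size:

$ oc -n openshift-etcd rsh etcd-master-0 etcdctl endpoint status -w table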