In a distributed bare metal environment, there have traditionally been two options for control plane architecture when dealing with large-scale OpenShift deployments:

  1. Run one control plane at a centralized location, with many remote workers.
  2. Run the control plane on each remote node (Single Node OpenShift).

Hosted Control Planes, based on the upstream HyperShift project, is currently in tech preview for bare metal, offering us a third option:

  3. Run many containerized control planes at a centralized location, each connected to a small number of remote workers.

[Architecture diagram: many hosted control planes running centrally on a management cluster, each serving a small number of remote workers]

This approach can offer a few benefits:

  • Reduced resource consumption on the remote nodes compared to Single Node OpenShift.
  • Faster deployment times at the remote sites compared to Single Node OpenShift.
  • A much more scalable architecture when compared to a single control plane with many remote workers. Rather than having one control plane (limited to three nodes) serving hundreds of workers, you can deploy hosted control planes as needed, which run as pods on worker nodes as part of a centralized cluster. The number of worker nodes in this centralized cluster can be scaled as demand dictates.

A previous blog post has shown how to create Hosted Control Plane clusters using the CLI. In a production environment, you need something that allows you to scale your deployments consistently and reliably. This is where the OpenShift GitOps Operator, based on ArgoCD, comes in.

Setting up the management cluster

My management cluster consists of three bare metal servers and has ODF installed to provide persistent storage. An available StorageClass is required for ACM/MCE, as well as for HyperShift, which uses it to store each hosted cluster's etcd database.
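As a quick sanity check, you can confirm that a suitable StorageClass exists and, if needed, mark one as the default. The class name below is just an example from an ODF installation and may differ in your environment:

oc get storageclass

# Optionally mark a class as the cluster default, here assuming the
# ODF-provided ocs-storagecluster-ceph-rbd class:
oc patch storageclass ocs-storagecluster-ceph-rbd --type=merge \
  -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'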

ACM/MCE

The management cluster requires either RHACM or the Multicluster Engine Operator. Once either of these operators is installed, we will enable the hypershift-preview feature, as well as CIM (Central Infrastructure Management), to allow zero-touch provisioning of bare metal worker nodes. Using CIM to provision bare metal workers is known as the Agent platform when dealing with Hosted Control Planes.

See this document for instructions on how to enable the hypershift-preview feature. Since ACM 2.7/MCE 2.2, the HyperShift Operator is automatically installed on the local-cluster after it is enabled.
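For reference, enabling the feature is roughly a single patch against the MultiClusterEngine resource. This sketch assumes the default resource name (multiclusterengine); check the linked document for the exact procedure for your version:

oc patch mce multiclusterengine --type=merge \
  -p '{"spec":{"overrides":{"components":[{"name":"hypershift-preview","enabled":true}]}}}'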

See this document for instructions on how to enable Central Infrastructure Management (the Assisted Service). This feature will allow the management cluster to provision bare metal worker nodes by automatically mounting the installer ISO and installing OpenShift.
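At its core, enabling CIM means creating an AgentServiceConfig resource that tells the Assisted Service where to store its database, filesystem data, and ISO images. A minimal sketch (the storage sizes are only examples, and the PVCs will be bound using your default StorageClass; follow the linked document for your environment):

apiVersion: agent-install.openshift.io/v1beta1
kind: AgentServiceConfig
metadata:
  name: agent
spec:
  databaseStorage:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 10Gi
  filesystemStorage:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 20Gi
  imageStorage:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 50Gi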

MetalLB

Since the control plane for our new cluster will be running as pods on the management cluster, we will use MetalLB to expose the API endpoint for the new cluster.

See this document for instructions on how to install the MetalLB Operator.
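Note that installing the operator alone is not enough; a MetalLB instance also has to be created (the address pool for the API endpoint will be handled by the Helm chart later on). Something along these lines should work, assuming the operator was installed into the metallb-system namespace:

apiVersion: metallb.io/v1beta1
kind: MetalLB
metadata:
  name: metallb
  namespace: metallb-system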

GitOps

See this document for instructions on how to install the GitOps Operator.

We will provide an install-config.yaml file as the values for a Helm chart that will deploy a hosted cluster on top of our management cluster. The GitOps Operator will apply the Helm chart for us and ensure that the specified configuration stays in sync with what is running on the management cluster.
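If you want to preview what the chart generates before wiring it into ArgoCD, you can render it locally with the Helm CLI. This is just a convenience sketch; the repository URL, chart name, and version are the same ones used in the ArgoCD Application shown later, and the values file path is whatever you named your install config:

helm repo add hypershift-helm https://loganmc10.github.io/hypershift-helm
helm template example hypershift-helm/deploy-cluster \
  --version 0.1.12 -f install-config.yaml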

DNS

The hosted cluster still needs DNS entries for api.<cluster-name>.<domain> and *.apps.<cluster-name>.<domain>.

Since we will be serving the API using MetalLB (layer 2) on the management cluster, the address we choose for api.<cluster-name>.<domain> needs to be in the same subnet as the management cluster.

*.apps.<cluster-name>.<domain> will be served by MetalLB running on the hosted cluster workers; therefore, that IP address needs to be in the same subnet as the hosted cluster workers.
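As a purely hypothetical illustration, for a hosted cluster named hyper under example.com, the zone file entries might look like this, with the first address taken from the management cluster subnet and the second from the hosted cluster worker subnet:

api.hyper.example.com.     IN A  192.168.10.200   ; MetalLB on the management cluster
*.apps.hyper.example.com.  IN A  192.168.20.200   ; MetalLB on the hosted cluster workers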

Creating an Install Config

We can use a regular install-config.yaml file as the input to our Helm chart, with just a few additional parameters. Since the control plane will be hosted on the management cluster, we only need to specify worker nodes in our install config.

The install-config.yaml file needs to be stored in a Git repository. Since this file contains sensitive information, we'll keep it in a private repository that requires authentication to access.

apiVersion: v1

# Additional parameters for hypershift-helm
hypershift:
  clusterImageSet: quay.io/openshift-release-dev/ocp-release:4.12.2-x86_64 # Required

baseDomain: <cluster_domain>
compute:
- name: worker
  replicas: 1
metadata:
  name: example-cluster-name
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16
platform:
  baremetal:
    apiVIP: <api_address> # Should be in the same subnet as the management cluster
    ingressVIP: <ingress_address> # Should be in the same subnet as the hosted cluster worker nodes
    hosts:
    - name: openshift-worker-0
      role: worker
      bmc:
        address: "redfish-virtualmedia://<bmc_ip_address>/redfish/v1/Systems/1"
        username: <username>
        password: <password>
      bootMACAddress: <nic1_mac_address>
      rootDeviceHints:
        hctl: "1:0:0:0"
      networkConfig:
        interfaces:
        - name: eno1
          type: ethernet
          macAddress: <nic1_mac_address>
          state: up
          ipv4:
            enabled: true
            dhcp: true
          ipv6:
            enabled: false
pullSecret: "<pull secret>"
sshKey: |
  ssh-rsa ...


Creating the ArgoCD app

Since the release of OpenShift GitOps 1.8, the operator has supported defining multiple sources for an application. This matters for our Helm chart, since the values file (install-config.yaml) lives in a different repository than the Helm chart itself.

Since the Git repository that hosts our install-config.yaml requires authentication, we need to configure it with credentials in ArgoCD first. See the ArgoCD docs for instructions on adding a private repository.
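One declarative way to do that (an alternative to adding the repository through the ArgoCD UI) is a Secret labeled as a repository in the openshift-gitops namespace. The repository URL below matches the one used in the Application later on, while the username and token are placeholders:

apiVersion: v1
kind: Secret
metadata:
  name: edge-installer-configs
  namespace: openshift-gitops
  labels:
    argocd.argoproj.io/secret-type: repository
stringData:
  type: git
  url: https://github.com/loganmc10/edge-installer-configs.git
  username: <git_username>
  password: <git_token>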

Once we have created our install-config.yaml and committed it to our private repository, we can create an ArgoCD app:

cat << 'EOF' | oc apply -f -
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: hypershift-cluster
  namespace: openshift-gitops
spec:
  destination:
    server: https://kubernetes.default.svc
  project: default
  sources:
  - repoURL: 'https://loganmc10.github.io/hypershift-helm'
    chart: deploy-cluster
    targetRevision: 0.1.12
    helm:
      valueFiles:
      - $values/helm/install-config-hypershift.yaml
  - repoURL: 'https://github.com/loganmc10/edge-installer-configs.git'
    targetRevision: main
    ref: values
EOF

In the example above, we are pulling the Helm chart from a publicly available repository, https://loganmc10.github.io/hypershift-helm, and the install-config.yaml file from a private repository, https://github.com/loganmc10/edge-installer-configs.git. This Helm chart is not supported by Red Hat; it is just a personal project of mine.

The Helm chart is essentially a collection of YAML templates that use the install-config.yaml file as an input to create the required resources on the management cluster (BareMetalHosts, HostedCluster, NodePool, etc).

Syncing

Once we have created the ArgoCD app, we can log in to the Web UI to look at the status:

[Screenshot: ArgoCD application status showing OutOfSync]

The status will show "OutOfSync". Once we click "Sync", ArgoCD will go to work applying the Helm chart on our cluster.

The Helm chart will:

  • Create a new namespace for our cluster.
  • Create BareMetalHost, NMStateConfig, and InfraEnv objects, which will be used by ACM/MCE to provision our bare metal hosts and install OpenShift.
  • Create HostedCluster and NodePool objects. These objects define the configuration for our new hosted cluster.
  • Configure MetalLB on the management cluster to provide access to the hosted cluster API endpoint.
  • Scale the NodePool, which will allow the BareMetalHosts to be attached to the new cluster (Agent platform).
  • Install and configure MetalLB on the hosted cluster workers, to serve the Ingress endpoint.
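Rather than clicking "Sync" by hand each time, ArgoCD can also be told to keep the app in sync automatically. A small, optional addition to the Application spec above would do it (prune and selfHeal keep the cluster matching Git even if resources are changed or deleted out of band):

spec:
  syncPolicy:
    automated:
      prune: true
      selfHeal: true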

After about 25-30 minutes, if we check the status of our app again, we will see:

[Screenshot: ArgoCD application status showing Synced]

We can check the status of our new hosted cluster via the CLI as well:

[loganmc10@fedoralogan ~]$ oc get hostedcluster,nodepool -n hyper
NAME                                          VERSION   KUBECONFIG                PROGRESS    AVAILABLE   PROGRESSING   MESSAGE
hostedcluster.hypershift.openshift.io/hyper   4.12.2    hyper-admin-kubeconfig    Completed   True        False         The hosted control plane is available

NAME                                     CLUSTER   DESIRED NODES   CURRENT NODES   AUTOSCALING   AUTOREPAIR   VERSION   UPDATINGVERSION   UPDATINGCONFIG   MESSAGE
nodepool.hypershift.openshift.io/hyper   hyper     1               1               False         False        4.12.2

Accessing the cluster

To access the new hosted cluster, we need to get the kubeconfig:

oc get secret -n <cluster-name> <cluster-name>-admin-kubeconfig -o jsonpath='{.data.kubeconfig}' | base64 -d > ~/hosted-kubeconfig

We can then check the status of our new cluster:

[loganmc10@fedoralogan ~]$ export KUBECONFIG=~/hosted-kubeconfig 
[loganmc10@fedoralogan ~]$ oc get co
NAME                                        VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
console                                     4.12.2    True        False         False      97s
csi-snapshot-controller                     4.12.2    True        False         False      24m
dns                                         4.12.2    True        False         False      2m24s
image-registry                              4.12.2    True        False         False      2m19s
ingress                                     4.12.2    True        False         False      23m
insights                                    4.12.2    True        False         False      3m3s
kube-apiserver                              4.12.2    True        False         False      24m
kube-controller-manager                     4.12.2    True        False         False      24m
kube-scheduler                              4.12.2    True        False         False      24m
kube-storage-version-migrator               4.12.2    True        False         False      2m33s
monitoring                                  4.12.2    True        False         False      36s
network                                     4.12.2    True        False         False      3m14s
node-tuning                                 4.12.2    True        False         False      5m42s
openshift-apiserver                         4.12.2    True        False         False      24m
openshift-controller-manager                4.12.2    True        False         False      24m
openshift-samples                           4.12.2    True        False         False      2m
operator-lifecycle-manager                  4.12.2    True        False         False      23m
operator-lifecycle-manager-catalog          4.12.2    True        False         False      23m
operator-lifecycle-manager-packageserver    4.12.2    True        False         False      24m
service-ca                                  4.12.2    True        False         False      3m1s
storage                                     4.12.2    True        False         False      24m
[loganmc10@fedoralogan ~]$ oc get node
NAME                 STATUS   ROLES    AGE     VERSION
openshift-worker-0   Ready    worker   9m10s   v1.25.4+a34b9e9


Managing the cluster

In the future, if we want to update the cluster to a newer version, all we need to do is update the clusterImageSet in our original install-config.yaml:

hypershift:
  clusterImageSet: quay.io/openshift-release-dev/ocp-release:4.12.3-x86_64


Once we update the Git repository that stores the install-config.yaml file, ArgoCD will notice the change and mark the app as "OutOfSync". Re-syncing the app will cause ArgoCD to update the HostedCluster and NodePool resources, triggering an update to our hosted cluster. This approach will allow us to manage the lifecycle of the cluster the GitOps way.
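The same sync status is also visible from the CLI if you would rather not open the web UI. For the app created earlier, something like this prints the sync and health state:

oc get applications.argoproj.io hypershift-cluster -n openshift-gitops \
  -o jsonpath='{.status.sync.status}{"  "}{.status.health.status}{"\n"}'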