Introduction

In OCP 4.5, Red Hat introduced the compact cluster, an OpenShift cluster consisting of three physical machines that have both the supervisor and worker roles applied. With limited hardware, it is now possible to have a highly available platform that can be used for both traditional virtual machines and containerized workloads.

This type of deployment is perfect for remote locations where:

  • space is limited
  • network connectivity to the edge is slow, unreliable, or not always present
  • high availability and autonomy are required
  • hardware cost has a high impact (for example, with hundreds of edge deployments)

The compact cluster can be either a connected cluster (with access to the Red Hat container registries) or disconnected (air-gapped). In the case of a disconnected install, OpenShift’s container images need to be copied to a mirror registry on premises. While we recommend using Red Hat Quay, any OCI-compatible container registry can be used for this.

This mirror registry needs to be available and reachable at all times for multiple reasons:

  • During upgrades of the platform, OpenShift needs to pull new image versions of its containerized components.
  • A pod can be (re)started and scheduled to a different host for various reasons (server failure, a reboot during an upgrade, an extra node is added, health check probe failing, etc.). If the container image is not present on this node, it needs to be pulled from the mirror registry.
  • Image garbage collection can remove a cached container image on a node if it is not in use. If the container image is needed in the future, it will need to be pulled again.

In this blog, we describe a way of deploying edge-type clusters that occupy a small footprint (compact configuration) and are disconnected, that is, they have no connectivity to the internet. In addition, the network connection to a central location is not guaranteed, which means we need to have a mirror registry in the edge location.

Considering the space, network and cost constraints, this mirror registry is a challenge. If we add one extra physical machine to host the mirror registry, our hardware cost increases by 33%. This mirror registry also becomes a single point of failure if not deployed in a highly available way.

Deploying a highly available registry (for example, Red Hat Quay) on top of the OpenShift cluster and using that as the mirror registry for the underlying cluster itself could be another approach. This works very well during normal operations, but there are chicken-and-egg scenarios to consider when cold-starting the OpenShift cluster. There might not be three different availability zones in the edge location (something you would typically have in public clouds), so a power outage at the edge would result in the whole cluster being shut off. For some use cases, there might be a need to install the cluster in a central location, turn it off, and move the hardware to the remote location. Both scenarios require a cold start of the cluster. When the cluster starts, if a container of the registry or one of its dependencies is scheduled on a node where the image was not pulled before, it will not be able to start because it cannot pull the needed container image. As a consequence, other components of the OpenShift platform might not be able to start because the registry is not running.

We have worked around this challenge by deploying registries on the RHEL CoreOS nodes themselves using systemd and podman. Since podman is not under OpenShift’s control, these registries can be treated as “outside of the cluster”, avoiding any chicken-and-egg scenarios, while requiring no extra hardware and having no single point of failure. systemd makes sure that the container registry starts at boot time and is running before the kubelet is started. Redundancy is created by installing this registry on multiple RHEL CoreOS nodes and defining multiple registry mirrors in the ImageContentSourcePolicy. The registries will host the container images of OpenShift and its operators. Other container images can be stored in OpenShift’s built-in registry.

As backend storage for this registry, we defined a directory at /var/local-registry. The contents of /var persist across reboots and updates of RHEL CoreOS.

In this blog post, we will walk through the steps to set up such a cluster.

Prerequisites

The mirror registries will be hosted on the RHEL CoreOS nodes themselves, so they can only be deployed post-installation. To start, you need a three-node compact cluster. This could be a connected cluster (container images can be downloaded directly from the Red Hat registries) or a disconnected cluster (installed using a temporary mirror registry). Next, the container images are copied to these mirror registries and an ImageContentSourcePolicy is defined. The network connection to the Red Hat registries (in the case of a connected cluster) or the temporary mirror registry (in the case of an already disconnected cluster) can be removed at the end.

While you definitely can add more worker nodes to a compact cluster or have different machines for the supervisor and worker roles, this is out of scope for this blog post.

To set things up, you need to have a machine with network connectivity to the OpenShift API and RHEL CoreOS nodes. This could be a bare-metal machine, a virtual machine, or simply a laptop connected to the cluster. We will refer to this as the workstation. The following tools need to be installed on the workstation:

  • the oc CLI
  • the opm CLI
  • podman
  • skopeo

The oc CLI, opm CLI, and pull secret can be downloaded from https://console.redhat.com/openshift/downloads.

In the procedure below, we have used variables so it is easier for you to copy and paste. “node1”, “node2” and “node3” represent the IPv4 addresses of the three RHEL CoreOS nodes, while “nodes” is an array. “quay_user” is an account on quay.io (which you can get for free). “path_to_pull_secret” is the path to the pull secret to access the Red Hat registries on your workstation:

node1=<ip of node1>
node2=<ip of node2>
node3=<ip of node3>
nodes="$node1 $node2 $node3"
quay_user=<username on quay.io>
path_to_pull_secret=</path/to/pull/secret/on/workstation>

Deployment

Get the container registry

The container registry that we will be using is the CNCF Distribution Registry. As a proof of concept, we used an already existing container image, which can be found at docker://docker.io/library/registry:2.

Note that the CNCF Distribution registry is not shipped nor supported by Red Hat. 

Optionally: To avoid hitting the rate limits of Docker Hub, we copied the registry container image to quay.io and used that one instead. Create a free account on Quay.io with a public repository named “registry”:

$ podman login -u $quay_user quay.io
$ skopeo copy docker://docker.io/library/registry:2 docker://quay.io/${quay_user}/registry:2

Use skopeo to get the digest of the container image:

$ sha=$(skopeo inspect docker://quay.io/${quay_user}/registry:2 --format "{{ .Digest }}")
$ echo $sha
sha256:b0b8dd398630cbb819d9a9c2fbd50561370856874b5d5d935be2e0af07c0ff4c

Using a systemd unit file, we can guarantee that this container starts upon boot of RHEL CoreOS and that it is started before the kubelet or CRI-O. We will use a MachineConfig custom resource (and the Machine Config Operator) to define it on the nodes:

cat <<EOF >> machine_config.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: master
  name: 99-master-local-registry
spec:
  config:
    ignition:
      config: {}
      security:
        tls: {}
      timeouts: {}
      version: 3.1.0
    networkd: {}
    passwd: {}
    systemd:
      units:
        - name: container-registry.service
          enabled: true
          contents: |
            [Unit]
            Description=Local OpenShift Container Registry
            Wants=network.target
            After=network-online.target
            [Service]
            Environment=PODMAN_SYSTEMD_UNIT=%n
            Restart=on-failure
            TimeoutStopSec=70
            ExecStartPre=/usr/bin/mkdir -p /var/local-registry
            ExecStartPre=/bin/rm -f %t/container-registry.pid %t/container-registry.ctr-id
            ExecStart=/usr/bin/podman run --conmon-pidfile %t/container-registry.pid --cidfile %t/container-registry.ctr-id --cgroups=no-conmon --replace -d --net=host --name registry -v /var/local-registry:/var/lib/registry:z quay.io/$quay_user/registry@$sha
            ExecStop=/usr/bin/podman stop --ignore --cidfile %t/container-registry.ctr-id -t 10
            ExecStopPost=/usr/bin/podman rm --ignore -f --cidfile %t/container-registry.ctr-id
            PIDFile=%t/container-registry.pid
            Type=forking
            [Install]
            WantedBy=multi-user.target default.target
EOF

oc apply -f machine_config.yaml

The container image is referenced by its digest, since this is a requirement when using repository mirrors. Also note that the directory /var/local-registry (arbitrarily chosen) is automatically created if it does not exist. The contents of /var are preserved during RHEL CoreOS updates.

The Machine Config Operator will apply this configuration to each node and gracefully reboot them one at a time. You can monitor this process:

$ watch oc get nodes

NAME      STATUS                        ROLES           AGE   VERSION
<node1>   Ready                         master,worker   44h   v1.22.0-rc.0+75ee307
<node2>   Ready                         master,worker   44h   v1.22.0-rc.0+75ee307
<node3>   NotReady,SchedulingDisabled   master,worker   44h   v1.22.0-rc.0+75ee307

After a while, you can check that the service is running on each node:

$ oc debug node/<node-name>
sh-4.4# chroot /host
sh-4.4# systemctl is-active container-registry
active
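
As an additional sanity check, you can query the registry’s HTTP API from the workstation. The Distribution registry listens on port 5000 by default, and since nothing has been pushed yet, the catalog endpoint should return an empty repository list, similar to:

$ curl -s http://$node1:5000/v2/_catalog
{"repositories":[]}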

Define the registries within OpenShift

In an ideal scenario, the registries would use TLS. In our setup, this is not the case, so we will define these registries as insecure. This needs to be done both on your workstation as well as in OpenShift.

To define these registries as insecure on your workstation, edit /etc/containers/registries.conf to set the following:

[registries.insecure]
registries = ['<node1>','<node2>', '<node3>']

The above lines are version 1 syntax of the containers-registries configuration file and might be different for newer versions of the container tools. On our RHEL8 workstation, we have been using the container-tools:3.0 yum module.
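
For reference, with the version 2 syntax, the same registries would be marked insecure with one [[registry]] block per node, along these lines:

[[registry]]
location = "<node1>:5000"
insecure = true

[[registry]]
location = "<node2>:5000"
insecure = true

[[registry]]
location = "<node3>:5000"
insecure = true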

To set it in OpenShift, create the following custom resource:

cat <<EOF >> insecure.yaml
apiVersion: config.openshift.io/v1
kind: Image
metadata:
  name: cluster
spec:
  registrySources:
    insecureRegistries:
    - localhost:5000
    - $node1:5000
    - $node2:5000
    - $node3:5000
EOF

oc apply -f insecure.yaml

Note that we defined “localhost:5000” as well. The reasoning behind this is that we will define multiple mirrors for redundancy, with “localhost:5000” being the first one to try. Since a container registry is running locally on each node, this avoids all container image pulls landing on a single node. In the case of clusters with extra worker nodes (without a local registry), you might want to leave this entry out.

Again, you can use “oc get nodes” to follow the progress and check the applied configuration:

$ oc debug node/<node>
sh-4.4# chroot /host
sh-4.4# cat /etc/containers/registries.conf
...
[[registry]]
prefix = ""
location = "localhost:5000"
insecure = true
...

Copy the container images of OpenShift

There are three repositories defined in the registries: “ocp” contains the images of the OpenShift components, “registry” contains the image of the CNCF Distribution registry itself, and “olm-mirror” contains all the content of the OpenShift operators. After transferring the needed container images, you will need to create ImageContentSourcePolicy custom resources. An ImageContentSourcePolicy holds cluster-wide information about which mirrors to try for a particular container image.

We will first copy the OpenShift container images to the newly created registries. Execute the following commands for all three registries:

OCP_RELEASE=<OCP version, example 4.9.0>
LOCAL_REGISTRY='<IP OF REGISTRY>:5000'
LOCAL_REPOSITORY=ocp4/openshift
PRODUCT_REPO='openshift-release-dev'
LOCAL_SECRET_JSON=$path_to_pull_secret
RELEASE_NAME="ocp-release"
ARCHITECTURE=x86_64

oc adm release mirror -a ${LOCAL_SECRET_JSON} --from=quay.io/${PRODUCT_REPO}/${RELEASE_NAME}:${OCP_RELEASE}-${ARCHITECTURE} --to=${LOCAL_REGISTRY}/${LOCAL_REPOSITORY} --to-release-image=${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}:${OCP_RELEASE}-${ARCHITECTURE} --insecure=true

The above command will only work if your workstation is connected to the cluster as well as the Red Hat registries. In case this is not possible, copy the content to removable media and upload it to the cluster’s registries as described here.
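
As a rough sketch of that removable-media flow (the exact steps are in the official documentation; /mnt/removable is a placeholder path), you would first mirror the release to a directory on the media and later upload it to a cluster registry:

# On a machine with internet access: mirror the release to the removable media
oc adm release mirror -a ${LOCAL_SECRET_JSON} \
  --from=quay.io/${PRODUCT_REPO}/${RELEASE_NAME}:${OCP_RELEASE}-${ARCHITECTURE} \
  --to-dir=/mnt/removable/mirror

# On the workstation at the edge: upload the release from the media to a node registry
oc image mirror -a ${LOCAL_SECRET_JSON} --from-dir=/mnt/removable/mirror \
  "file://openshift/release:${OCP_RELEASE}*" ${LOCAL_REGISTRY}/${LOCAL_REPOSITORY} --insecure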

The mirror command prints out an example of an ImageContentSourcePolicy that can be used. Let’s check the ImageContentSourcePolicy shown after copying the content to the first node:

cat <<EOF >> icsp.yaml
apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  name: example
spec:
  repositoryDigestMirrors:
  - mirrors:
    - $node1:5000/ocp4/openshift
    source: quay.io/openshift-release-dev/ocp-release
  - mirrors:
    - $node1:5000/ocp4/openshift
    source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
EOF

You can define multiple mirrors in an ImageContentSourcePolicy. When a node makes a request for an image from the source repository, it tries each mirrored repository in turn until it finds the requested content. If all mirrors fail, the cluster tries the original source repository. Upon success, the image is pulled to the node.

To get redundancy, we will define multiple mirrors, with the first one being “localhost:5000”. Let’s use awk to generate a new file based on the previous output:

awk '{if (!/'$node1'/) {print ; next} ;  print gensub(/'$node1'/, "localhost", "1") ; print;  print gensub(/'$node1'/, "'$node2'", "1") ; print gensub(/'$node1'/, "'$node3'", "1") } ' icsp.yaml >> icsp_new.yaml

icsp_new.yaml looks as follows:

apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  name: example
spec:
  repositoryDigestMirrors:
  - mirrors:
    - localhost:5000/ocp4/openshift
    - 10.10.21.174:5000/ocp4/openshift
    - 10.10.21.175:5000/ocp4/openshift
    - 10.10.21.176:5000/ocp4/openshift
    source: quay.io/openshift-release-dev/ocp-release
  - mirrors:
    - localhost:5000/ocp4/openshift
    - 10.10.21.174:5000/ocp4/openshift
    - 10.10.21.175:5000/ocp4/openshift
    - 10.10.21.176:5000/ocp4/openshift
    source: quay.io/openshift-release-dev/ocp-v4.0-art-dev

Let’s create the new ImageContentSourcePolicy:

oc create -f icsp_new.yaml
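
The Machine Config Operator rolls this configuration out to the nodes as well. Once the nodes have settled, you can spot-check that the mirror entries were rendered into the container runtime configuration, for example:

$ oc debug node/<node> -- chroot /host \
    grep -A 4 'openshift-release-dev' /etc/containers/registries.conf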

Copy the container image of the CNCF Distribution registry

We also need to copy the container image of the registry itself. This could be needed when a failed node is replaced. The same repository can also be used to upload new versions of the CNCF Distribution registry.

for node in $nodes; do
  skopeo copy docker://quay.io/$quay_user/registry:2 docker://$node:5000/registry:2
done
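
You can verify that each node registry now serves the image by inspecting its digest, reusing the same skopeo command as before; all three nodes should report the same digest:

for node in $nodes; do
  skopeo inspect docker://$node:5000/registry:2 --format "{{ .Digest }}"
done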

For this, we need to create an ImageContentSourcePolicy as well:

cat <<EOF >> icsp_registry.yaml
apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  name: registry
spec:
  repositoryDigestMirrors:
  - mirrors:
    - localhost:5000/registry
    - $node1:5000/registry
    - $node2:5000/registry
    - $node3:5000/registry
    source: quay.io/$quay_user/registry
EOF

oc create -f icsp_registry.yaml

Set up the Operator Lifecycle Manager

To copy the containers needed for the OpenShift operators, we will follow the official documentation.

If not already done, disable the sources for the default catalog:

oc patch OperatorHub cluster --type json \
    -p '[{"op": "add", "path": "/spec/disableAllDefaultSources", "value": true}]'
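
Once this is applied, the default catalog sources should disappear from the openshift-marketplace namespace, which you can confirm with:

$ oc get catalogsources -n openshift-marketplace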

On your workstation, prune an index image. You need to define a list of operators you want to copy. For testing, we only used the local-storage-operator, but you will likely need other operators too. As the registry name, we used an arbitrary placeholder called “dummyregistry”:

podman login registry.redhat.io
opm index prune \
  -f registry.redhat.io/redhat/redhat-operator-index:v4.9 \
  -p local-storage-operator \
  -t dummyregistry.io:5000/olm-mirror/redhat-operator-index:v4.9

Copy this image to all the registries:

for node in $nodes; do
  podman push dummyregistry.io:5000/olm-mirror/redhat-operator-index:v4.9 \
    $node:5000/olm-mirror/redhat-operator-index:v4.9
done

Next, we will need to get all the content and push it to the registries:

for node in $nodes; do
  oc adm catalog mirror $node:5000/olm-mirror/redhat-operator-index:v4.9 \
    $node:5000/olm-mirror --insecure --index-filter-by-os='linux/amd64' \
    -a $path_to_pull_secret
done

After executing the mirror command, a subfolder is created with an ImageContentSourcePolicy and a CatalogSource. To know more about the CatalogSource custom resource, check this page.
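
On our workstation, the generated subfolder was named after the index image with a timestamp suffix, and its layout looked roughly like this:

$ ls manifests-redhat-operator-index-<timestamp>/
catalogSource.yaml  imageContentSourcePolicy.yaml  mapping.txt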

For example, the ImageContentSourcePolicy.yaml for the first node looks like this:

apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  labels:
    operators.openshift.org/catalog: "true"
  name: redhat-operator-index-0
spec:
  repositoryDigestMirrors:
  - mirrors:
    - $node1:5000/olm-mirror/openshift4-ose-local-storage-operator-bundle
    source: registry.redhat.io/openshift4/ose-local-storage-operator-bundle
  - mirrors:
    - $node1:5000/olm-mirror/openshift4-ose-local-storage-operator
    source: registry.redhat.io/openshift4/ose-local-storage-operator
  - mirrors:
    - $node1:5000/olm-mirror/openshift4-ose-local-storage-diskmaker
    source: registry.redhat.io/openshift4/ose-local-storage-diskmaker
  - mirrors:
    - $node1:5000/olm-mirror/openshift4-ose-local-storage-static-provisioner
    source: registry.redhat.io/openshift4/ose-local-storage-static-provisioner

This file obviously depends on the operators you defined in the prune command. Again, we need to duplicate lines for each mirror registry, with “localhost:5000” being the first one. We can use the same awk command for this:

awk '{if (!/'$node1'/) {print ; next} ;  print gensub(/'$node1'/, "localhost", "1") ; print; print gensub(/'$node1'/, "'$node2'", "1") ; print gensub(/'$node1'/, "'$node3'", "1") } ' imageContentSourcePolicy.yaml >> imageContentSourcePolicy_new.yaml
oc apply -f imageContentSourcePolicy_new.yaml

A catalogSource.yaml is created as well:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: redhat-operator-index
  namespace: openshift-marketplace
spec:
  image: $node1:5000/olm-mirror/olm-mirror-redhat-operator-index:v4.9
  sourceType: grpc

We will need to edit this file to point to our dummyregistry and use a sha256 hash instead:

sha=$(skopeo inspect docker://$node1:5000/olm-mirror/redhat-operator-index:v4.9 --format="{{ .Digest }}")

cat <<EOF >> catalogsource.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: redhat-operator-index
  namespace: openshift-marketplace
spec:
  image: dummyregistry.io:5000/olm-mirror/redhat-operator-index@$sha
  sourceType: grpc
  publisher: my_org
  updateStrategy:
    registryPoll:
      interval: 30m
EOF

cat <<EOF >> icsp_dummyregistry.yaml
apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  labels:
    operators.openshift.org/catalog: "true"
  name: dummyregistry
spec:
  repositoryDigestMirrors:
  - mirrors:
    - localhost:5000/olm-mirror/redhat-operator-index
    - $node1:5000/olm-mirror/redhat-operator-index
    - $node2:5000/olm-mirror/redhat-operator-index
    - $node3:5000/olm-mirror/redhat-operator-index
    source: dummyregistry.io:5000/olm-mirror/redhat-operator-index
EOF

oc apply -f icsp_dummyregistry.yaml

oc apply -f catalogsource.yaml
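
To verify that the Operator Lifecycle Manager picks up the mirrored catalog, check that the catalog source pod is running in the openshift-marketplace namespace and that the expected operators show up in the package manifests:

$ oc get pods -n openshift-marketplace
$ oc get packagemanifests | grep local-storage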

That is it. You can now disconnect the cluster or remove the temporary registry you used during the OpenShift installation.

In case of updates, you can copy the new container content and create ImageContentSourcePolicies in a similar way. For more information about updates in disconnected environments, please refer to these docs.

To consider

The containers that are created using systemd are not managed by OpenShift, and as a consequence, the allocation of resources to these containers is unknown to the OpenShift platform. You have to account for the resources the container registry needs and ensure it does not conflict with or starve other system components. For more information on how to configure this, see this page.
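
One way to do this is to increase the resources reserved for system daemons on the nodes. As an illustrative sketch (the name and values below are placeholders that depend on your registry’s actual footprint), a KubeletConfig targeting the master pool could reserve extra CPU and memory:

apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: reserve-resources-for-registry
spec:
  machineConfigPoolSelector:
    matchLabels:
      pools.operator.machineconfiguration.openshift.io/master: ""
  kubeletConfig:
    systemReserved:
      cpu: 1000m
      memory: 2Gi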

It is also possible (depending on the configuration) that the deployment of the registry confuses OpenShift into thinking the node is using more resources than it actually is. For example, if the container registry places content in / or /var, then OCP may see this space fill up and try to evict pods to free up space.

Other things to consider could be the usage of HTTPS/certificates and/or pull secrets to connect to the registries.