Overview

It is possible to dynamically secure the Ingress controller of a newly provisioned managed OpenShift cluster, without ever logging in to it as an administrator, by leveraging Red Hat Advanced Cluster Management (RHACM) policies in conjunction with Cert Manager running on the hub. In this post we will walk you through the setup and policies required to achieve this.

Assumptions

This article is targeted at readers who have some familiarity with RHACM cluster provisioning and policy templating, as both capabilities are used extensively. It is assumed that RHACM has already been installed on a cluster that will function as the primary hub for managing OpenShift clusters. The instructions below were tested with OpenShift 4.10 and RHACM 2.5. Whilst AWS is used as the infrastructure provider, the instructions are expected to be equally applicable to any cloud hyper-scaler with only minor modifications. All operations are intended to be run from the hub as the cluster administrator.

Setting up Cluster Pools

RHACM abstracts the nature of the underlying compute, storage and network, along with the desired version of OpenShift, in a custom resource known as a ClusterPool. When configured in the manner described below, a ClusterPool serves as a templating engine that spawns new clusters in response to a ClusterClaim. For example, if the administrator of the hub has pre-configured a ClusterPool according to the specifications and credentials provided by a DevOps team and is then asked to provision three clusters for different operating environments, all that is required in the request ticket are the labels by which the provisioned clusters will subsequently be identified, in this case prod, stage and dev. This is an important point: clusters provisioned in this manner do not have statically defined names, and managed clusters are selected for policy and application deployment via labels, just as pods, services and other Kubernetes objects are selected.

The following configuration creates a ClusterPool for team blue on AWS in the region ap-southeast-1, using credentials provided by team blue so that infrastructure spend can be attributed correctly. The specific machine instance types are encoded in the base64 representation of the install-config.yaml artifact (an illustrative example is sketched below). If team blue requires clusters with different machine instance types, this can be catered for by defining additional ClusterPools with those characteristics encoded in their install-config.yaml. All clusters spawned from this pool will be installed with OpenShift 4.10.20. Unless clusters are shared between teams, expect each team to have at least one ClusterPool defined in this manner.
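
As an illustration only, a minimal install-config.yaml for AWS might look like the following sketch; the cluster name, instance types, replica counts and network ranges are placeholders to be adjusted, and the resulting file must be base64 encoded into the install-config Secret referenced by the ClusterPool below.

apiVersion: v1
metadata:
  name: blue-cluster # placeholder cluster name
baseDomain: example.com
controlPlane:
  name: master
  replicas: 3
  platform:
    aws:
      type: m5.xlarge # control plane instance type (illustrative)
compute:
- name: worker
  replicas: 3
  platform:
    aws:
      type: m5.large # worker instance type (illustrative)
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.0.0.0/16
  serviceNetwork:
  - 172.30.0.0/16
platform:
  aws:
    region: ap-southeast-1
pullSecret: "" # supplied via the pull secret referenced by the ClusterPool
sshKey: "" # optional SSH public key for node access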

apiVersion: v1
kind: Namespace
metadata:
  name: blue-cluster-pool
spec: {}
---
apiVersion: hive.openshift.io/v1
kind: ClusterPool
metadata:
  name: 'blue-cluster-pool-1'
  namespace: 'blue-cluster-pool'
  labels:
    cloud: 'AWS'
    region: 'ap-southeast-1'
    vendor: OpenShift
    cluster.open-cluster-management.io/clusterset: 'blue-cluster-set'
spec:
  size: 0
  runningCount: 0
  skipMachinePools: true
  baseDomain: example.com # Base domain name pointing to an AWS Route 53 public hosted zone
  installConfigSecretTemplateRef:
    name: blue-cluster-pool-1-install-config
  imageSetRef:
    name: img4.10.20-x86-64-appsub
  pullSecretRef:
    name: blue-cluster-pool-1-pull-secret
  platform:
    aws:
      credentialsSecretRef:
        name: blue-cluster-pool-1-aws-creds
      region: ap-southeast-1
---
apiVersion: v1
kind: Secret
metadata:
  name: blue-cluster-pool-1-pull-secret
  namespace: 'blue-cluster-pool'
data:
  .dockerconfigjson: '{{- if eq (lookup "v1" "Secret" "open-cluster-management" "multiclusterhub-operator-pull-secret").kind "Secret" -}} {{- fromSecret "open-cluster-management" "multiclusterhub-operator-pull-secret" ".dockerconfigjson" -}} {{- else -}} {{- fromSecret "openshift-config" "pull-secret" ".dockerconfigjson" -}} {{- end -}}'
type: kubernetes.io/dockerconfigjson
---
apiVersion: v1
kind: Secret
metadata:
  name: blue-cluster-pool-1-install-config
  namespace: 'blue-cluster-pool'
type: Opaque
data:
  install-config.yaml: # Base64 encoding of install-config.yaml
---
apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: blue-cluster-pool-1-aws-creds
  namespace: 'blue-cluster-pool'
stringData:
  aws_access_key_id: '{{ fromSecret "blue-cluster-pool" "aws-creds" "aws_access_key_id" | base64dec }}'
  aws_secret_access_key: '{{ fromSecret "blue-cluster-pool" "aws-creds" "aws_secret_access_key" | base64dec }}'
---
apiVersion: cluster.open-cluster-management.io/v1beta1
kind: ManagedClusterSet
metadata:
  name: blue-cluster-set

In the above configuration, setting the size of the ClusterPool to zero effectively turns it into a templating engine, as no clusters will be created until a ClusterClaim is submitted. The pull secret and AWS credentials are copied in using RHACM policy template functions, which are described in more detail here: https://cloud.redhat.com/blog/applying-policy-based-governance-at-scale-using-templates. Finally, the pool (and every cluster spawned from it) is associated with a ManagedClusterSet, which can be used to restrict permissions to specific teams using a combination of RBAC and namespace bindings, thus enabling hub multi-tenancy.
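
The template functions above assume that team blue's AWS credentials already exist on the hub in a Secret named aws-creds in the blue-cluster-pool namespace (the same Secret is referenced again later by the certificate issuer). A minimal sketch of that Secret, with the values redacted, looks like this:

apiVersion: v1
kind: Secret
metadata:
  name: aws-creds
  namespace: blue-cluster-pool
type: Opaque
stringData:
  aws_access_key_id: <team blue access key ID>
  aws_secret_access_key: <team blue secret access key>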

Once the hub administrator has created the ClusterPool, which abstracts all of the complexities of the underlying infrastructure and configuration, the actual act of provisioning a cluster can be triggered via a much simpler piece of configuration: a ClusterClaim custom resource. The following example shows how this would look for team blue if they required prod, stage and dev clusters.

apiVersion: hive.openshift.io/v1
kind: ClusterClaim
metadata:
  name: blue-cluster-1
  namespace: blue-cluster-pool
  labels:
    env: prod
spec:
  clusterPoolName: blue-cluster-pool-1
---
apiVersion: hive.openshift.io/v1
kind: ClusterClaim
metadata:
  name: blue-cluster-2
  namespace: blue-cluster-pool
  labels:
    env: stage
spec:
  clusterPoolName: blue-cluster-pool-1
---
apiVersion: hive.openshift.io/v1
kind: ClusterClaim
metadata:
  name: blue-cluster-3
  namespace: blue-cluster-pool
  labels:
    env: dev
spec:
  clusterPoolName: blue-cluster-pool-1

All three ClusterClaims point to the same ClusterPool as their source of infrastructure, and a total of three clusters will be spawned, labeled prod, stage and dev respectively. Each cluster name is dynamically generated and will differ from the name of the ClusterClaim. This is an important point, as the generated name will need to be mapped in order to issue an X.509 certificate whose common name is always bound to the fully qualified name of the cluster and base domain, as illustrated below.
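
When Hive assigns a cluster to a claim, it records the namespace of the claimed cluster, which matches the generated cluster name, in the claim's spec.namespace field; this is the field that the policy templates later in this article look up. For illustration, a fulfilled claim looks roughly like the following, where the generated name is hypothetical:

apiVersion: hive.openshift.io/v1
kind: ClusterClaim
metadata:
  name: blue-cluster-1
  namespace: blue-cluster-pool
  labels:
    env: prod
spec:
  clusterPoolName: blue-cluster-pool-1
  namespace: blue-cluster-pool-1-abc12 # generated cluster name (illustrative)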

Setting up Cert Manager

In order to ensure that a newly provisioned cluster is configured with a certificate issued by a private or public CA, Cert Manager needs to be set up on the hub. In this walk-through we will show the integration between RHACM and Cert Manager, using the LetsEncrypt issuer to mint X.509 certificates and patch these into the Ingress controller of the managed cluster.

Begin by installing the cert-manager operator as documented here. Post-installation, Cert Manager needs to be configured to use an external name resolver so that LetsEncrypt can validate the base domain name via the public Internet. In addition, the certificate issuer will need to be able to manipulate Route 53 (add TXT resource records) to complete the DNS validation handshake. The following configuration options achieve this:

apiVersion: operator.openshift.io/v1alpha1
kind: CertManager
metadata:
  name: cluster
spec:
  managementState: "Managed"
  unsupportedConfigOverrides:
    controller:
      args:
      - "--v=2"
      - "--dns01-recursive-nameservers=1.1.1.1:53"
      - "--dns01-recursive-nameservers-only"
      - "--cluster-resource-namespace=blue-cluster-pool"
      - "--leader-election-namespace=kube-system"

Setting up Certificate Issuers

At this point we have a ClusterPool that can spawn any number of managed OpenShift clusters in response to new ClusterClaim resources. These clusters need their default self-signed certificates replaced with a validated certificate containing the dynamically generated cluster name and base domain. These certificates are generated first on the hub and then propagated to the managed clusters using RHACM policies. To generate the certificates, a ClusterIssuer custom resource is configured that identifies the LetsEncrypt and AWS endpoints to communicate with in order to complete certificate validation.

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: blue-ca-issuer
  namespace: blue-cluster-pool
spec:
  acme:
    solvers:
    - dns01:
        route53:
          region: ap-southeast-1
          hostedZoneID: # This must point to a public hosted zone that LetsEncrypt can validate
          accessKeyID: '{{ fromSecret "blue-cluster-pool" "aws-creds" "aws_access_key_id" | base64dec }}'
          secretAccessKeySecretRef:
            key: aws_secret_access_key
            name: aws-creds
    email: # Administrative email address
    server: "https://acme-staging-v02.api.letsencrypt.org/directory"
    privateKeySecretRef:
      name: blue-ca-issuer

Similar to how we create a ClusterClaim for a ClusterPool, we create a Certificate for the ClusterIssuer to operate on. Inside the specification of the Certificate we make extensive use of RHACM policy templates to map a ClusterClaim name (a known entity) to a managed cluster name (an unknown entity), then append our base domain name and prefix a wildcard. The example below covers the cluster claimed by blue-cluster-1; equivalent Certificate resources would be defined for the other ClusterClaims.

apiVersion: v1
kind: Namespace
metadata:
  name: provisioning-policies
spec: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: 'certificate-{{ (lookup "hive.openshift.io/v1" "ClusterClaim" "blue-cluster-pool" "blue-cluster-1").spec.namespace }}'
  namespace: provisioning-policies
spec:
  commonName: '*.{{ (lookup "hive.openshift.io/v1" "ClusterClaim" "blue-cluster-pool" "blue-cluster-1").spec.namespace }}.example.com'
  issuerRef:
    name: blue-ca-issuer
    kind: ClusterIssuer
  secretName: 'tls-{{ (lookup "hive.openshift.io/v1" "ClusterClaim" "blue-cluster-pool" "blue-cluster-1").spec.namespace }}'
  dnsNames:
  - '*.{{ (lookup "hive.openshift.io/v1" "ClusterClaim" "blue-cluster-pool" "blue-cluster-1").spec.namespace }}.example.com'
  - '*.apps.{{ (lookup "hive.openshift.io/v1" "ClusterClaim" "blue-cluster-pool" "blue-cluster-1").spec.namespace }}.example.com'
  privateKey:
    rotationPolicy: Always

The namespace in which the issued certificate and private key are stored is important, as later we will use a more restrictive RHACM template function to securely copy them to the managed cluster.
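
Once the Certificate has been issued, cert-manager stores the signed certificate and private key in a kubernetes.io/tls Secret in that namespace, named according to the secretName field above. As a sketch of what to expect (the generated cluster name is again hypothetical):

apiVersion: v1
kind: Secret
metadata:
  name: tls-blue-cluster-pool-1-abc12 # tls-<generated cluster name>
  namespace: provisioning-policies
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded certificate chain>
  tls.key: <base64-encoded private key>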

Provisioning via the Policy Generator

We have mentioned policy templates several times, and it is here that we wrap all of the above YAML into policies using the Policy Generator tool, explained in more detail here: https://cloud.redhat.com/blog/generating-governance-policies-using-kustomize-and-gitops. Assuming all of the above assets are stored in the same directory on your local workstation (e.g., input-openshift-clusters), the following policyGenerator configuration can be used to generate the required policies:

apiVersion: policy.open-cluster-management.io/v1
kind: PolicyGenerator
metadata:
  name: policy-generator-openshift-provisioning
placementBindingDefaults:
  name: binding-policy-generator-openshift-provisioning
policyDefaults:
  namespace: provisioning-policies
  remediationAction: enforce
  complianceType: musthave
  severity: medium
policies:
  - name: policy-openshift-clusters-provisioning
    manifests:
      - path: input-openshift-clusters/
    policySets:
      - openshift-provisioning
policySets:
  - name: openshift-provisioning
    description: Policies to provision openshift clusters.
    placement:
      placementPath: input/placement-openshift-clusters-local.yaml

Configuration for the referenced Placement custom resource:

apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Placement
metadata:
  name: placement-openshift-clusters-local
  namespace: provisioning-policies
spec:
  predicates:
    - requiredClusterSelector:
        labelSelector:
          matchExpressions:
            - {key: name, operator: In, values: ["local-cluster"]}

Configuration required to allow policies in the provisioning-policies namespace to operate on the hub:

apiVersion: cluster.open-cluster-management.io/v1beta1
kind: ManagedClusterSetBinding
metadata:
  name: default
  namespace: provisioning-policies
spec:
  clusterSet: default

Configuration for the kustomization.yaml to call the policyGenerator:

generators:
- ./policyGenerator.yaml

The policyGenerator can be invoked via the kustomize plugin using the following command, issued from the current working directory:

kustomize build --enable-alpha-plugins . | oc apply -f -

The Policy Generator will populate the provisioning-policies namespace with all of the required policies, which are then propagated to and evaluated on the target cluster identified by the Placement resource. Given that the Placement resource points to the hub (which has the fixed name local-cluster), all resource configurations that the policies enforce will be created there too. Other controllers on the hub will pick up these new resource configurations and create the corresponding objects, i.e., managed OpenShift clusters and certificates.

The next step will be to securely copy the cryptographic material to the managed OpenShift cluster and patch the Ingress controller. This too can be accomplished using RHACM policies and the original policyGenerator.yaml file can be extended to produce two PolicySets for this purpose - one targeting the hub as already described and the other targeting any non-hub managed OpenShift cluster.

apiVersion: policy.open-cluster-management.io/v1
kind: PolicyGenerator
metadata:
  name: policy-generator-openshift-provisioning
placementBindingDefaults:
  name: binding-policy-generator-openshift-provisioning
policyDefaults:
  namespace: provisioning-policies
  remediationAction: enforce
  complianceType: musthave
  severity: medium
policies:
  - name: policy-openshift-clusters-provisioning
    manifests:
      - path: input-openshift-clusters/
    policySets:
      - openshift-provisioning
  - name: policy-ingress-config
    manifests:
      - path: input-ingress/
    policySets:
      - openshift-clusters
    remediationAction: inform
policySets:
  - name: openshift-provisioning
    description: Policies to provision openshift clusters.
    placement:
      placementPath: input/placement-openshift-clusters-local.yaml
  - name: openshift-clusters
    description: Policies to patch ingress on provisioned openshift clusters.
    placement:
      placementPath: input/placement-openshift-clusters-not-local.yaml

The additional Placement resource contains the following logic to identify all non-hub managed OpenShift clusters:

apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Placement
metadata:
  name: placement-openshift-clusters
  namespace: provisioning-policies
spec:
  predicates:
    - requiredClusterSelector:
        labelSelector:
          matchExpressions:
            - {key: name, operator: NotIn, values: ["local-cluster"]}

Also required is an additional namespace binding that will allow policies in the provisioning-policies namespace to propagate through to the clusters in the blue-cluster-set:

apiVersion: cluster.open-cluster-management.io/v1beta1
kind: ManagedClusterSetBinding
metadata:
  name: blue-cluster-set
  namespace: provisioning-policies
spec:
  clusterSet: blue-cluster-set

The configuration to securely copy the certificate and patch the Ingress controller is stored in the input-ingress directory:

apiVersion: v1
kind: Secret
metadata:
  name: custom-certificate
  namespace: openshift-ingress
type: kubernetes.io/tls
data:
  tls.crt: '{{hub fromSecret "provisioning-policies" (printf "tls-%s" .ManagedClusterName) "tls.crt" hub}}'
  tls.key: '{{hub fromSecret "provisioning-policies" (printf "tls-%s" .ManagedClusterName) "tls.key" hub}}'
---
apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: default
  namespace: openshift-ingress-operator
spec:
  defaultCertificate:
    name: custom-certificate

Note the use of the {{hub ... hub}} template syntax, which only sources configuration from the namespace on the hub that contains the policy itself. This prevents other users on the hub from writing policies that exfiltrate secrets by referencing a namespace to which they do not have read access. Also note how the template uses the .ManagedClusterName context variable, which is automatically set by the policy controller at runtime to the name of the managed cluster for which the policy is being evaluated.

Again, the policyGenerator can be invoked as before using kustomize.
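
For reference, one possible layout of the working directory that ties all of the above assets together (file names under the input directories are arbitrary) is:

.
├── kustomization.yaml
├── policyGenerator.yaml
├── input
│   ├── placement-openshift-clusters-local.yaml
│   └── placement-openshift-clusters-not-local.yaml
├── input-openshift-clusters
│   └── ClusterPool, ClusterClaim, ClusterIssuer and Certificate manifests
└── input-ingress
    └── Secret and IngressController manifests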

After a few minutes the Ingress controller will present a certificate issued by the LetsEncrypt staging endpoint.

Conclusion

This article has shown how a policy-driven approach using RHACM can provision a fleet of managed OpenShift clusters across any cloud hyper-scaler and subsequently configure them with proper certificates, without requiring any administrative login or additional third-party software on the managed clusters themselves. Every operation is performed on the hub, and capabilities such as ClusterPools enable the hub administrator to manage team-specific infrastructure templates from a single pane of glass, thus reducing toil and improving visibility of the estate.