Operational resilience is increasingly a boardroom concern, especially for organizations operating in industries deemed essential by governments for the functioning of society. Organizations operating IT systems that underpin Critical National Infrastructure (CNI) are required to implement capabilities to protect data and processes against hostile actions, human error and natural disasters.

Often when organizations embark on a cloud journey, they think of resilience within the context of a single cloud platform. But as events have borne out over time, a single cloud platform can also experience failures whose cause is only understood during post mortem analysis, and which typically involve some form of cascading failure in which existing coping mechanisms and resources are overwhelmed by multiple classes of events occurring in short order. To protect against cloud platform failure, organizations must consider these scenarios more carefully and implement appropriate measures as part of an enhanced business continuity plan that addresses CNI mandates.

In this blog we will look at how the toolset available with OpenShift Platform Plus can help businesses orchestrate operational resilience and protect themselves against cloud platform failure. The specific focus is on stateful applications that manage data, which is "sticky" by nature and more difficult to move between cloud platforms given the proprietary nature of cloud platform APIs. Stateless applications, in comparison, are simpler to deal with and can be made resilient by deploying the application across multiple cloud platforms fronted by a global load balancer that is itself decoupled from any of the cloud platforms.

Note that the techniques described below could also be applied to facilitate a migration of stateful applications across cloud platforms.

Architecture

In order to decouple a stateful application from the storage infrastructure exposed by the cloud platform, we will leverage Red Hat OpenShift Data Foundation (RHODF), which presents a cloud-agnostic Container Storage Interface (CSI) across all cloud platforms, as well as on premises, for block, file and object storage types. In our solution architecture we will leverage block storage (based on Ceph RBD) and object storage (based on NooBaa) to demonstrate operational resilience capabilities across cloud platforms from AWS and GCP.

The other key components of the solution architecture are the Policy orchestration engine included with Red Hat Advanced Cluster Management (RHACM), which manages all data movement operations, and the OpenShift API for Data Protection (OADP) Operator, which performs them. The following diagram captures the overall workflow.

Cluster Landing Zone - Migration

Red lines on the diagram indicate the flow of control: Kubernetes resource manifests are downloaded from Git repositories and processed by the Policy orchestration engine. A PolicyGenerator kustomize plugin, loaded when OpenShift GitOps (ArgoCD) starts, transforms these resource manifests into policy documents (steps 1 and 2). For more details on the PolicyGenerator kustomize plugin please refer to the documentation. The policy documents are responsible not only for configuring the RHODF and OADP Operators but also for scheduling the backup and restore workflows that underpin all data movement operations between the managed clusters (steps 3 and 4).

Blue lines on the diagram indicate the flow of data in response to Policies being enforced. Step 3a occurs whenever a Backup resource is scheduled and involves snapshotting the data in the Ceph RBD volume and uploading it to a hybrid cloud bucket accessible via the Multicloud Object Gateway (MCG), indicated by step 3b. Similarly, when a failover is required (or a backup validation test needs to be performed), a Restore resource is submitted, which results in data being downloaded from MCG and written to a Ceph RBD volume (steps 4a and 4b).

Note that in this blog we will be protecting a stateful application that has no built-in data replication capabilities and relies on the underlying platform for this. For an example of protecting an application that has built-in data replication capabilities across cloud platforms please refer to this blog which leverages Submariner that is included with RHACM.

Prerequisites

  • One hub cluster located on premises with OpenShift 4.13, RHODF 4.13, RHACM 2.8, and OpenShift GitOps 1.9 installed.
  • One managed cluster located on AWS with OpenShift 4.13, RHODF 4.13, OADP 1.2, and VolSync 0.7 installed.
  • One managed cluster located on GCP with OpenShift 4.13, RHODF 4.13, OADP 1.2, and VolSync 0.7 installed.

Configuration on the Hub Cluster

As a first step we need to create two uniquely named object storage buckets, one in AWS and one in GCP. These will then be "fused" into a hybrid object bucket to enable seamless cross-cloud data transfers and thereby protect the stored data from a cloud platform failure.

Create a bucket in AWS via the CLI:

REGION=<AWS REGION>
aws s3api create-bucket --bucket `uuid` \
--create-bucket-configuration LocationConstraint=$REGION \
--region $REGION

Create a bucket in GCP via the CLI:

REGION=<GCP REGION>
gsutil mb -l $REGION gs://`uuid`

The object buckets must be registered with the hub cluster and configured into a hybrid object bucket class which mirrors the data between the underlying object buckets. The YAML manifests for this are presented below, followed by the PolicyGenerator configuration file. The first two manifests create the standalone Multicloud Object Gateway on the hub cluster.

apiVersion: odf.openshift.io/v1alpha1
kind: StorageSystem
metadata:
  name: ocs-storagecluster-storagesystem
  namespace: openshift-storage
spec:
  kind: storagecluster.ocs.openshift.io/v1
  name: ocs-storagecluster
  namespace: openshift-storage
---
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  annotations:
    uninstall.ocs.openshift.io/cleanup-policy: delete
    uninstall.ocs.openshift.io/mode: graceful
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  multiCloudGateway:
    dbStorageClassName: thin-csi
    reconcileStrategy: standalone
---
apiVersion: v1
kind: Secret
metadata:
  name: aws-creds
  namespace: openshift-storage
type: Opaque
data:
  AWS_ACCESS_KEY_ID: <AWS ACCESS KEY ID ENCODED IN BASE64>
  AWS_SECRET_ACCESS_KEY: <AWS SECRET ACCESS KEY ENCODED IN BASE64>
---
apiVersion: v1
kind: Secret
metadata:
  name: gcp-creds
  namespace: openshift-storage
type: Opaque
data:
  GoogleServiceAccountPrivateKeyJson: <GCP PRIVATE KEY ENCODED IN BASE64>
---
apiVersion: noobaa.io/v1alpha1
kind: BackingStore
metadata:
  labels:
    app: noobaa
  name: noobaa-aws-backing-store
  namespace: openshift-storage
spec:
  awsS3:
    region: <AWS REGION>
    secret:
      name: aws-creds
      namespace: openshift-storage
    targetBucket: <AWS BUCKET NAME>
  type: aws-s3
---
apiVersion: noobaa.io/v1alpha1
kind: BackingStore
metadata:
  labels:
    app: noobaa
  name: noobaa-gcp-backing-store
  namespace: openshift-storage
spec:
  googleCloudStorage:
    secret:
      name: gcp-creds
      namespace: openshift-storage
    targetBucket: <GCP BUCKET NAME>
  type: google-cloud-storage
---
apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  labels:
    app: noobaa
  name: noobaa-mirror-bucket-class
  namespace: openshift-storage
spec:
  placementPolicy:
    tiers:
    - backingStores:
      - noobaa-aws-backing-store
      - noobaa-gcp-backing-store
      placement: Mirror
---
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: migration-datastore
  namespace: openshift-storage
spec:
  generateBucketName: migration-datastore-bucket
  storageClassName: openshift-storage.noobaa.io
  additionalConfig:
    bucketclass: noobaa-mirror-bucket-class

The hybrid object bucket may take a few minutes to become available, and it is important not to proceed further until it is ready. A blocking wait can be achieved by using Policy dependencies in the PolicyGenerator configuration file.

apiVersion: noobaa.io/v1alpha1
kind: BackingStore
metadata:
  name: noobaa-aws-backing-store
  namespace: openshift-storage
status:
  phase: Ready
---
apiVersion: noobaa.io/v1alpha1
kind: BackingStore
metadata:
  name: noobaa-gcp-backing-store
  namespace: openshift-storage
status:
  phase: Ready
---
apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  name: noobaa-mirror-bucket-class
  namespace: openshift-storage
status:
  phase: Ready
---
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: migration-datastore
  namespace: openshift-storage
status:
  phase: Bound

Once the hybrid object bucket is available, information about it must be transferred to each of the managed clusters, which will access this bucket remotely. This information includes the name of the object bucket, an object service endpoint, and credentials for accessing the bucket securely. It must first be staged in the policies namespace, from where it can be securely downloaded by the managed clusters.

apiVersion: v1
kind: ConfigMap
metadata:
  name: migration-datastore
  namespace: policies
data:
  s3Url: '{{ (lookup "route.openshift.io/v1" "Route" "openshift-storage" "s3").spec.host }}'
  bucketName: '{{ (lookup "objectbucket.io/v1alpha1" "ObjectBucket" "" "obc-openshift-storage-migration-datastore").spec.endpoint.bucketName }}'
---
apiVersion: v1
kind: Secret
metadata:
  name: migration-datastore
  namespace: policies
stringData:
  cloud: |
    [default]
    aws_access_key_id={{ fromSecret "openshift-storage" "migration-datastore" "AWS_ACCESS_KEY_ID" | base64dec }}
    aws_secret_access_key={{ fromSecret "openshift-storage" "migration-datastore" "AWS_SECRET_ACCESS_KEY" | base64dec }}
type: Opaque

The PolicyGenerator configuration file brings together all of the above and controls the execution workflow using Policy dependencies and remediation actions. Note that directories are used to segregate the three sets of YAML manifests so that they can be managed as separate Policies.

apiVersion: policy.open-cluster-management.io/v1
kind: PolicyGenerator
metadata:
  name: multicloudgateway
placementBindingDefaults:
  name: multicloudgateway
policyDefaults:
  namespace: policies
  complianceType: musthave
  remediationAction: enforce
  policySets:
  - multicloudgateway
policies:
- name: multicloudgateway-config
  manifests:
  - path: <DIRECTORY TO MANIFEST FILES>
- name: multicloudgateway-status
  remediationAction: inform
  manifests:
  - path: <DIRECTORY TO MANIFEST FILES>
- name: multicloudgateway-config-copy
  dependencies:
  - name: multicloudgateway-status
  manifests:
  - path: <DIRECTORY TO MANIFEST FILES>
policySets:
- name: multicloudgateway
  placement:
    placementName: multicloudgateway
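
For reference, the PolicyGenerator configuration file is consumed as a kustomize generator by OpenShift GitOps. A minimal kustomization.yaml placed alongside it in the Git repository might look like the following (the file name policy-generator-config.yaml is illustrative):

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
generators:
- policy-generator-config.yaml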

The following Placement resource referenced in the PolicyGenerator file ensures that this set of Policies will be evaluated on the hub cluster only.

apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Placement
metadata:
  name: multicloudgateway
  namespace: policies
spec:
  predicates:
  - requiredClusterSelector:
      labelSelector:
        matchExpressions:
        - {key: name, operator: In, values: ["local-cluster"]}

Configuration on the Managed Clusters

The next set of Policies are evaluated on the managed clusters in AWS and GCP. These deploy the stateful application using data volumes created from the Ceph RBD storage class, which abstracts the underlying cloud platform storage and presents a consistent CSI across all cloud platforms. Note the use of Policy template functions to map native cloud platform storage classes to a cloud-agnostic storage class.

apiVersion: odf.openshift.io/v1alpha1
kind: StorageSystem
metadata:
  name: ocs-storagecluster-storagesystem
  namespace: openshift-storage
spec:
  kind: storagecluster.ocs.openshift.io/v1
  name: ocs-storagecluster
  namespace: openshift-storage
---
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  annotations:
    uninstall.ocs.openshift.io/cleanup-policy: delete
    uninstall.ocs.openshift.io/mode: graceful
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  arbiter: {}
  encryption:
    kms: {}
  externalStorage: {}
  managedResources:
    cephBlockPools: {}
    cephCluster: {}
    cephConfig: {}
    cephDashboard: {}
    cephFilesystems: {}
    cephObjectStoreUsers: {}
    cephObjectStores: {}
    cephToolbox: {}
  mirroring: {}
  nodeTopologies: {}
  storageDeviceSets:
  - config: {}
    count: 1
    dataPVCTemplate:
      metadata: {}
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 0.5Ti
        storageClassName: '{{- if eq "AWS" (fromClusterClaim "platform.open-cluster-management.io") -}} gp3-csi {{- else if eq "GCP" (fromClusterClaim "platform.open-cluster-management.io") -}} standard-csi {{- end }}'
        volumeMode: Block
      status: {}
    name: 'ocs-deviceset-{{- if eq "AWS" (fromClusterClaim "platform.open-cluster-management.io") -}} gp3-csi {{- else if eq "GCP" (fromClusterClaim "platform.open-cluster-management.io") -}} standard-csi {{- end }}'
    placement: {}
    portable: true
    preparePlacement: {}
    replica: 3
    resources: {}
---
apiVersion: snapshot.storage.k8s.io/v1
deletionPolicy: Retain
driver: openshift-storage.rbd.csi.ceph.com
kind: VolumeSnapshotClass
metadata:
  labels:
    velero.io/csi-volumesnapshot-class: "true"
  name: ocs-storagecluster-rbdplugin-snapclass

Similar to above, a blocking wait is introduced into the execution workflow to check on the readiness of storage resources before proceeding further.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ocs-operator
  namespace: openshift-storage
status:
  conditions:
  - status: "True"
    type: Available
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: odf-operator-controller-manager
  namespace: openshift-storage
status:
  conditions:
  - status: "True"
    type: Available
---
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
status:
  phase: Ready

The next set of YAML configures the OADP data movers on each managed cluster to perform the heavy lifting of data from user application volumes to the remote hybrid cloud bucket. Note that the data mover in OADP 1.2 is still in Technology Preview and requires VolSync to move the data off the volumes. For an overview of this technology please refer to this blog.

apiVersion: v1
kind: Secret
metadata:
  name: dm-restic-secret
  namespace: openshift-adp
type: Opaque
data:
  RESTIC_PASSWORD: <PRIVATE KEY ENCODED IN BASE64>
---
apiVersion: v1
kind: Secret
metadata:
  name: cloud-credentials
  namespace: openshift-adp
type: Opaque
data: '{{hub copySecretData "policies" "migration-datastore" hub}}'
---
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: velero-dpa
  namespace: openshift-adp
spec:
  features:
    dataMover:
      credentialName: dm-restic-secret
      enable: true
  configuration:
    velero:
      defaultPlugins:
      - openshift
      - aws
      - csi
      - vsm
    restic:
      enable: true
  backupLocations:
  - velero:
      config:
        profile: default
        region: noobaa
        s3Url: 'https://{{hub fromConfigMap "policies" "migration-datastore" "s3Url" hub}}'
        s3ForcePathStyle: "true"
        insecureSkipTLSVerify: "true"
      provider: aws
      default: true
      credential:
        key: cloud
        name: cloud-credentials
      objectStorage:
        bucket: '{{hub fromConfigMap "policies" "migration-datastore" "bucketName" hub}}'
        prefix: velero

This set of Policies transfers information about the hybrid object bucket previously staged on the hub cluster to each of the managed clusters by using {{hub .. hub}} delimiters, which perform a secure data transfer. For more details about this mechanism please refer to the documentation.

Once the DataProtectionApplication has been processed by its controller, it generates a BackupStorageLocation resource, which in our case points to the Multicloud Object Gateway and whose availability can be validated via Policy.

apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: velero-dpa-1
  namespace: openshift-adp
status:
  phase: Available
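
As a quick manual check on the managed cluster (assuming the oc CLI is logged in), the same condition can be inspected directly; the command should print Available once the location has been validated:

$ oc -n openshift-adp get backupstoragelocation velero-dpa-1 -o jsonpath='{.status.phase}'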

Another PolicyGenerator configuration file brings together all of the above and controls the execution workflow using Policy dependencies and remediation actions. Note that directories are used to segregate the four sets of YAML manifests so that they can be managed as separate Policies.

apiVersion: policy.open-cluster-management.io/v1
kind: PolicyGenerator
metadata:
  name: dataprotectionapplication
placementBindingDefaults:
  name: dataprotectionapplication
policyDefaults:
  namespace: policies
  complianceType: musthave
  remediationAction: enforce
  policySets:
  - dataprotectionapplication
policies:
- name: storagecluster-config
  manifests:
  - path: <DIRECTORY TO MANIFEST FILES>
- name: storagecluster-status
  remediationAction: inform
  manifests:
  - path: <DIRECTORY TO MANIFEST FILES>
- name: dataprotectionapplication-config
  dependencies:
  - name: storagecluster-status
  manifests:
  - path: <DIRECTORY TO MANIFEST FILES>
- name: dataprotectionapplication-status
  remediationAction: inform
  manifests:
  - path: <DIRECTORY TO MANIFEST FILES>
policySets:
- name: dataprotectionapplication
  placement:
    placementName: dataprotectionapplication

The following Placement resource referenced in the PolicyGenerator file ensures that this set of Policies will be evaluated on the managed clusters only.

apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Placement
metadata:
  name: dataprotectionapplication
  namespace: policies
spec:
  predicates:
  - requiredClusterSelector:
      labelSelector:
        matchExpressions:
        - {key: name, operator: NotIn, values: ["local-cluster"]}

Finally, our stateful application (Hello OpenShift!) needs to be deployed to the OpenShift cluster running in AWS. It writes a datestamp record into a filesystem mounted on a data volume provisioned by the Ceph RBD storage class.

apiVersion: v1
kind: Namespace
metadata:
  name: hello-openshift
spec: {}
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
  namespace: hello-openshift
  labels:
    app: hello-openshift
spec:
  storageClassName: ocs-storagecluster-ceph-rbd
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-openshift
  namespace: hello-openshift
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-openshift
  template:
    metadata:
      labels:
        app: hello-openshift
    spec:
      containers:
      - name: hello-openshift
        image: registry.access.redhat.com/ubi8/ubi
        command: ["sh", "-c"]
        args: ["echo $(date) Hello OpenShift! >> /data/hello-openshift.txt && sleep inf"]
        volumeMounts:
        - name: data
          mountPath: /data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: data-pvc

Another PolicyGenerator configuration file is used to deploy the application (alternatively consider using OpenShift GitOps ApplicationSets).

apiVersion: policy.open-cluster-management.io/v1
kind: PolicyGenerator
metadata:
  name: hello-openshift
placementBindingDefaults:
  name: hello-openshift
policyDefaults:
  namespace: policies
  complianceType: musthave
  remediationAction: enforce
  policySets:
  - hello-openshift
policies:
- name: hello-openshift-deploy
  manifests:
  - path: <DIRECTORY TO MANIFEST FILES>
policySets:
- name: hello-openshift
  placement:
    placementName: hello-openshift

The following Placement resource referenced in the PolicyGenerator file ensures that this set of Policies will be evaluated on a managed cluster on AWS only.

apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Placement
metadata:
  name: hello-openshift
  namespace: policies
spec:
  predicates:
  - requiredClusterSelector:
      labelSelector:
        matchExpressions:
        - {key: name, operator: NotIn, values: ["local-cluster"]}
      claimSelector:
        matchExpressions:
        - {key: platform.open-cluster-management.io, operator: In, values: ["AWS"]}
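
As an aside, the ApplicationSet alternative mentioned above could take a form similar to the following minimal sketch. This assumes the managed clusters have been connected to OpenShift GitOps through the RHACM GitOpsCluster integration; the repository URL, path and Placement reference are illustrative placeholders:

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: hello-openshift
  namespace: openshift-gitops
spec:
  generators:
  - clusterDecisionResource:
      # Placement decisions are surfaced to Argo CD by the RHACM GitOps integration
      configMapRef: acm-placement
      labelSelector:
        matchLabels:
          cluster.open-cluster-management.io/placement: hello-openshift
      requeueAfterSeconds: 180
  template:
    metadata:
      name: 'hello-openshift-{{name}}'
    spec:
      project: default
      source:
        repoURL: <GIT REPOSITORY URL>
        targetRevision: main
        path: <PATH TO MANIFEST FILES>
      destination:
        server: '{{server}}'
        namespace: hello-openshift
      syncPolicy:
        automated:
          prune: true
        syncOptions:
        - CreateNamespace=true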

Log in to the managed cluster and confirm the data written to the cloud-agnostic volume provisioned by the Ceph RBD storage class.

$ oc -n hello-openshift get pod,pvc
NAME                                  READY   STATUS    RESTARTS   AGE
pod/hello-openshift-c86f7f48b-bb2sj   1/1     Running   0          2m3s

NAME                             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
persistentvolumeclaim/data-pvc   Bound    pvc-5a127dbf-06db-402a-ac70-cf196b54799d   1Gi        RWO            ocs-storagecluster-ceph-rbd   2m3s

$ oc -n hello-openshift rsh hello-openshift-c86f7f48b-bb2sj cat /data/hello-openshift.txt
Mon Sep 4 03:36:05 UTC 2023 Hello OpenShift!

The following Policy will trigger a backup of the namespace in which the application has been deployed. Given the configuration now in place, this results in the OADP data movers uploading the data to the hybrid object bucket, which in turn mirrors the data across both AWS and GCP.

apiVersion: velero.io/v1
kind: Backup
metadata:
  name: hello-openshift
  labels:
    velero.io/storage-location: default
  namespace: openshift-adp
spec:
  hooks: {}
  includedNamespaces:
  - hello-openshift
  storageLocation: velero-dpa-1
  ttl: 720h0m0s

The following YAML will validate the success of the backup which will take a few moments to complete.

apiVersion: velero.io/v1
kind: Backup
metadata:
  name: hello-openshift
  namespace: openshift-adp
status:
  phase: Completed
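
The backup phase can also be followed manually on the managed cluster while waiting for the Policy to report compliance (a simple check, assuming the oc CLI is available):

$ oc -n openshift-adp get backup hello-openshift -o jsonpath='{.status.phase}'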

The following PolicyGenerator configuration file is used to manage the backup workflow.

apiVersion: policy.open-cluster-management.io/v1
kind: PolicyGenerator
metadata:
  name: backup
placementBindingDefaults:
  name: backup
policyDefaults:
  namespace: policies
  complianceType: musthave
  remediationAction: enforce
  policySets:
  - backup
policies:
- name: backup-config
  manifests:
  - path: <DIRECTORY TO MANIFEST FILES>
- name: backup-status
  remediationAction: inform
  manifests:
  - path: <DIRECTORY TO MANIFEST FILES>
policySets:
- name: backup
  placement:
    placementName: backup

The following Placement resource referenced in the PolicyGenerator file ensures that this set of Policies will be evaluated on a managed cluster in AWS only.

apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Placement
metadata:
  name: backup
  namespace: policies
spec:
  predicates:
  - requiredClusterSelector:
      labelSelector:
        matchExpressions:
        - {key: name, operator: NotIn, values: ["local-cluster"]}
      claimSelector:
        matchExpressions:
        - {key: platform.open-cluster-management.io, operator: In, values: ["AWS"]}

To restore the data on a managed cluster running in GCP the following Policy is used.

apiVersion: velero.io/v1
kind: Restore
metadata:
  name: hello-openshift
  namespace: openshift-adp
spec:
  backupName: hello-openshift
  excludedResources:
  - nodes
  - events
  - events.events.k8s.io
  - backups.velero.io
  - restores.velero.io
  - resticrepositories.velero.io
  restorePVs: true

Similar to validation of the backup, the outcome of the restore operation can be validated via Policy too.

apiVersion: velero.io/v1
kind: Restore
metadata:
  name: hello-openshift
  namespace: openshift-adp
status:
  phase: Completed

The following PolicyGenerator configuration file is used to manage the restore workflow.

apiVersion: policy.open-cluster-management.io/v1
kind: PolicyGenerator
metadata:
  name: restore
placementBindingDefaults:
  name: restore
policyDefaults:
  namespace: policies
  complianceType: musthave
  remediationAction: enforce
  policySets:
  - restore
policies:
- name: restore-config
  manifests:
  - path: <DIRECTORY TO MANIFEST FILES>
- name: restore-status
  remediationAction: inform
  manifests:
  - path: <DIRECTORY TO MANIFEST FILES>
policySets:
- name: restore
  placement:
    placementName: restore

The following Placement resource referenced in the PolicyGenerator file ensures that this set of Policies will be evaluated on a managed cluster in GCP only.

apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Placement
metadata:
  name: restore
  namespace: policies
spec:
  predicates:
  - requiredClusterSelector:
      labelSelector:
        matchExpressions:
        - {key: name, operator: NotIn, values: ["local-cluster"]}
      claimSelector:
        matchExpressions:
        - {key: platform.open-cluster-management.io, operator: In, values: ["GCP"]}

Log in to the managed cluster and confirm that the restored data has been written to the cloud-agnostic volume provisioned by the Ceph RBD storage class. There will be an additional datestamp because the container is restarted on the new cluster.

$ oc -n hello-openshift get pod,pvc
NAME                                  READY   STATUS    RESTARTS   AGE
pod/hello-openshift-c86f7f48b-8fr8k   1/1     Running   0          31s

NAME                             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
persistentvolumeclaim/data-pvc   Bound    pvc-14d6edb4-e097-4078-a2fc-e2010e9516e6   1Gi        RWO            ocs-storagecluster-ceph-rbd   32s

$ oc -n hello-openshift rsh hello-openshift-c86f7f48b-8fr8k cat /data/hello-openshift.txt
Mon Sep 4 03:36:05 UTC 2023 Hello OpenShift!
Mon Sep 4 03:49:55 UTC 2023 Hello OpenShift!

To productionize the above, consider replacing the Backup with a Schedule resource, which periodically (every five minutes in the example below) creates a backup that is written to the hybrid cloud bucket.

apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: hello-openshift
  namespace: openshift-adp
spec:
  schedule: '*/5 * * * *'
  template:
    hooks: {}
    includedNamespaces:
    - hello-openshift
    storageLocation: velero-dpa-1
    ttl: 720h0m0s

The corresponding resource status validation can also be performed via Policy.

apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: hello-openshift
  namespace: openshift-adp
status:
  phase: Enabled

Note that this only confirms that the Schedule is in effect; it says nothing about the outcome of individual backups. Checking those requires raw object template processing to iterate over an ever-growing list of backups and filter for failures (indicated by a backup whose status is not "Completed"). Such occurrences should trigger policy violation alerts that observability tools, including Alertmanager, can act on, so we raise the severity of this Policy to critical. For more details on raw object template processing please refer to the documentation.

apiVersion: policy.open-cluster-management.io/v1
kind: ConfigurationPolicy
metadata:
  name: failed-scheduled-backups
spec:
  remediationAction: inform
  severity: critical
  object-templates-raw: |
    {{- range $backup := (lookup "velero.io/v1" "Backup" "openshift-adp" "").items }}
    {{- if not (eq $backup.status.phase "Completed") }}
    - complianceType: mustnothave
      objectDefinition:
        apiVersion: velero.io/v1
        kind: Backup
        metadata:
          name: {{ $backup.metadata.name }}
          namespace: {{ $backup.metadata.namespace }}
    {{- end }}
    {{- end }}

As a final note, it is recommended to periodically test the integrity of a random backup by performing a point-in-time restore, so that confidence in the backup process is well established before a real business continuity event occurs.
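
A lightweight way to do this without disturbing the running application is to restore a backup into a scratch namespace using the namespaceMapping field of the Restore resource. The following is a minimal sketch; the validation namespace name is illustrative:

apiVersion: velero.io/v1
kind: Restore
metadata:
  name: hello-openshift-validation
  namespace: openshift-adp
spec:
  backupName: hello-openshift
  # Restore into a scratch namespace so the production namespace is left untouched
  namespaceMapping:
    hello-openshift: hello-openshift-validation
  excludedResources:
  - nodes
  - events
  - events.events.k8s.io
  - backups.velero.io
  - restores.velero.io
  - resticrepositories.velero.io
  restorePVs: true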

Summary

For organizations operating in industries deemed to be of systemic importance to society, it is imperative to adopt a multi-cloud architecture that protects their IT systems against the catastrophic failure of a single cloud platform. The toolset included with OpenShift Platform Plus enables organizations to deliver on this, allowing applications to readily fail over from one cloud platform to another without needing to be rearchitected.

