Configuring Cluster Observability Operator (COO) in ARO and Enabling remote writing of metrics to Azure Monitor Workspace
This content is authored by Red Hat experts, but has not yet been tested on every supported configuration.
The Cluster Observability Operator (COO) is an optional OpenShift Container Platform Operator that enables administrators to create standalone monitoring stacks that are independently configurable for use by different services and users.
Deploying COO helps you address monitoring requirements that are hard to achieve using the default monitoring stack. COO is ideal for users who need high customizability, scalability, and long-term data retention, especially in complex, multi-tenant enterprise environments.
This guide walks through an example of using COO to set up a Prometheus instance that persists metrics, can be scaled to two replicas for higher availability, and remote-writes metrics to an Azure Monitor workspace.
Prerequisites
- You have access to the cluster as a user with the cluster-admin role.
- You have installed the operator by following the OpenShift documentation for Installing the Cluster Observability Operator.
Set up Azure Monitor Workspace/Grafana
Create Azure Monitor Workspace
Note: In this tutorial, all of the Azure resources will be created in the same resource group as the cluster
- Follow the Microsoft documentation to Create an Azure Monitor Workspace
Create and Link Azure Managed Grafana
- Follow the Microsoft documentation to Create an Azure Managed Grafana Workspace
- Link the Grafana instance to your Azure Monitor workspace by following the Microsoft documentation, Link a Grafana workspace
Create a Service Principal for use with Azure Monitor Workspace
SERVICE_PRINCIPAL_CLIENT_SECRET="$(az ad sp create-for-rbac --name cfung-azure-monitor-workspace --query 'password' -otsv)"
SERVICE_PRINCIPAL_CLIENT_ID="$(az ad sp list --display-name cfung-azure-monitor-workspace --query '[0].appId' -otsv)"
Give the Service Principal permissions to publish metrics to your Azure Monitor Workspace
Go to the Azure portal. On the resource menu for your Azure Monitor workspace, select Overview. For Data collection rule, select the link.
On the resource menu for the data collection rule, select Access control (IAM). Select + Add, and then select Add role assignment.
Select the Monitoring Metrics Publisher role, and then select Next.
Select User, group, or service principal, and then choose + Select members. Select the service principal that you created in the previous step, and then choose Select.
To complete the role assignment, select Review + assign.
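The portal steps above can probably also be done from the command line. The following is a minimal sketch, not a tested procedure: the function name `assign_metrics_publisher` is hypothetical, and it assumes that `az monitor account show` reports the workspace's data collection rule under `defaultIngestionSettings.dataCollectionRuleResourceId`. Verify that query path against your own workspace before relying on it.

```shell
# Sketch only: grant "Monitoring Metrics Publisher" on the data collection rule
# that belongs to an Azure Monitor workspace.
# Arguments: workspace name, resource group, service principal client ID.
assign_metrics_publisher() {
  local workspace_name="$1" resource_group="$2" client_id="$3"
  local dcr_id

  # Assumption: the workspace resource exposes its DCR under this property.
  dcr_id="$(az monitor account show \
    --name "$workspace_name" \
    --resource-group "$resource_group" \
    --query 'defaultIngestionSettings.dataCollectionRuleResourceId' \
    -o tsv)" || return 1

  if [ -z "$dcr_id" ]; then
    echo "could not determine the data collection rule ID" >&2
    return 1
  fi

  # Scope the role to the data collection rule, as in the portal steps.
  az role assignment create \
    --assignee "$client_id" \
    --role "Monitoring Metrics Publisher" \
    --scope "$dcr_id"
}

# Example call (placeholders, not real names):
# assign_metrics_publisher "<workspace-name>" "<resource-group>" "$SERVICE_PRINCIPAL_CLIENT_ID"
```

Scoping the assignment to the data collection rule, rather than the whole resource group, mirrors what the portal steps do.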
Configuring the Cluster Observability Operator to monitor services
Create a project for COO monitoring resources
oc new-project centralized-monitoring
Create a secret that the COO Prometheus instance will use to ship metrics to Azure Monitor
oc create secret generic azuremonitor-secret \
  -n centralized-monitoring \
  --from-literal clientid="${SERVICE_PRINCIPAL_CLIENT_ID}" \
  --from-literal clientsecret="${SERVICE_PRINCIPAL_CLIENT_SECRET}"
Create 2 additional projects and label them. These will be used to deploy 2 sample applications
oc new-project test1
oc new-project test2
oc label namespace test1 monitored=enabled
oc label namespace test2 monitored=enabled
Deploy Sample app and ServiceMonitor in namespace test1
Deploy a sample application in project test1
cat <<EOF | oc apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: prometheus-coo-example-app
  name: prometheus-coo-example-app
  namespace: test1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus-coo-example-app
  template:
    metadata:
      labels:
        app: prometheus-coo-example-app
    spec:
      containers:
      - image: ghcr.io/rhobs/prometheus-example-app:0.4.2
        name: prometheus-coo-example-app
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: prometheus-coo-example-app
  name: prometheus-coo-example-app
  namespace: test1
spec:
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 8080
    name: web
  selector:
    app: prometheus-coo-example-app
  type: ClusterIP
EOF
Verify that the pod is running
oc -n test1 get pod
Create a ServiceMonitor object to specify how the service created above is to be monitored
cat <<EOF | oc apply -f -
apiVersion: monitoring.rhobs/v1
kind: ServiceMonitor
metadata:
  name: prometheus-coo-example-monitor
  labels:
    monitoredby: prometheus-coo
  namespace: test1
spec:
  endpoints:
  - port: web
    scheme: http
    interval: 30s
  selector:
    matchLabels:
      app: prometheus-coo-example-app
EOF
Deploy Sample app and ServiceMonitor in namespace test2
Deploy a sample application in project test2 using the manifest below
cat <<EOF | oc apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coo-example-app
  namespace: test2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: coo-example-app
  template:
    metadata:
      labels:
        app: coo-example-app
    spec:
      containers:
      - name: httpserver
        image: registry.access.redhat.com/ubi9/python-311:1
        command:
        - bash
        - -c
        - |2-
          pip install prometheus_client &&
          python - <<EOF
          from http.server import BaseHTTPRequestHandler, HTTPServer
          from prometheus_client import start_http_server, Counter
          class HTTPRequestHandler(BaseHTTPRequestHandler):
              def do_GET(self):
                  if self.path == '/':
                      self.send_response(200)
                      self.end_headers()
                      self.wfile.write(b'<html>Hello!</html>\n')
                      respCtr.labels(response='200').inc()
                  else:
                      self.send_error(404)
                      respCtr.labels(response='404').inc()
          start_http_server(9000)
          respCtr = Counter('coo_responses','Responses',["response"])
          HTTPServer(("", 8080), HTTPRequestHandler).serve_forever()
          EOF
---
kind: Service
apiVersion: v1
metadata:
  name: coo-example-app
  namespace: test2
  labels:
    app: coo-example-app
spec:
  ports:
  - name: http
    port: 8080
  - name: metrics
    port: 9000
  selector:
    app: coo-example-app
EOF
Verify that the pod is running
oc -n test2 get pod
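Instead of polling `oc get pod` by hand, the rollout of both sample Deployments can be waited on. This is a small sketch using `oc rollout status`; the function name `wait_for_sample_apps` and the default timeout of 120s are illustrative choices, not part of the original guide.

```shell
# Sketch only: wait for both sample Deployments from this guide to roll out.
# Optional argument: timeout passed to `oc rollout status` (default 120s).
wait_for_sample_apps() {
  local timeout="${1:-120s}" target namespace name

  # Each entry is <namespace>/<deployment name>, as created earlier in this guide.
  for target in test1/prometheus-coo-example-app test2/coo-example-app; do
    namespace="${target%%/*}"
    name="${target#*/}"
    # Stop at the first Deployment that fails to roll out in time.
    oc rollout status "deployment/${name}" -n "$namespace" --timeout="$timeout" || return 1
  done
}

# Example call:
# wait_for_sample_apps 180s
```

Note that a completed rollout only means the pods have started. The test2 application still installs `prometheus_client` at startup, so its metrics endpoint may take a little longer to respond.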
Create a ServiceMonitor object to specify how the service created above is to be monitored
cat <<EOF | oc apply -f -
apiVersion: monitoring.rhobs/v1
kind: ServiceMonitor
metadata:
  name: coo-example-app
  labels:
    monitoredby: prometheus-coo
  namespace: test2
spec:
  endpoints:
  - port: metrics
    interval: 30s
  selector:
    matchLabels:
      app: coo-example-app
EOF
Create a Cluster Observability Operator MonitoringStack
Use the custom resource below to create the COO MonitoringStack (the equivalent of a Prometheus stack) in the centralized-monitoring project. The MonitoringStack monitors the applications created above. The configuration creates a single Prometheus replica, retains metrics for 5 days, adds persistent storage to the Prometheus instance, and enables remote writing of metrics to your Azure Monitor workspace.
Replace the {{INGESTION-URL}} value below with the Metrics ingestion endpoint from the Overview page of the Azure Monitor workspace.
TENANT_ID="$(az account get-access-token --query tenant --output tsv)"
cat <<EOF | oc apply -f -
apiVersion: monitoring.rhobs/v1alpha1
kind: MonitoringStack
metadata:
  name: prometheus-coo
  namespace: centralized-monitoring
  labels:
    coo: prometheus-coo
spec:
  logLevel: debug
  retention: 5d
  namespaceSelector:
    matchLabels:
      monitored: enabled
  resourceSelector:
    matchLabels:
      monitoredby: prometheus-coo
  prometheusConfig:
    persistentVolumeClaim:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
      storageClassName: managed-csi
      volumeMode: Filesystem
    remoteWrite:
    - url: "{{INGESTION-URL}}"
      oauth2:
        clientId:
          secret:
            name: azuremonitor-secret
            key: clientid
        clientSecret:
          name: azuremonitor-secret
          key: clientsecret
        tokenUrl: "https://login.microsoftonline.com/$TENANT_ID/oauth2/v2.0/token"
        scopes:
        - "https://monitor.azure.com/.default"
    replicas: 1 # For higher availability, set 2 replicas
    scrapeInterval: 30s
EOF
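As an alternative to editing the {{INGESTION-URL}} placeholder by hand, the substitution can be scripted. The sketch below is one possible approach: `render_ingestion_url` is a hypothetical helper, and it assumes the manifest has been saved to a file (here called `monitoringstack.yaml`, also hypothetical) with the tenant ID already filled in.

```shell
# Sketch only: read a manifest on stdin and write it to stdout with the
# {{INGESTION-URL}} placeholder replaced by the URL given as the first argument.
render_ingestion_url() {
  local url="$1" escaped

  # Refuse anything that is not an https URL, to catch copy/paste mistakes.
  case "$url" in
    https://*) ;;
    *) echo "ingestion URL must start with https://" >&2; return 1 ;;
  esac

  # Escape characters that are special in a sed replacement:
  # backslash, ampersand, and the | delimiter used below.
  escaped="$(printf '%s' "$url" | sed -e 's/[\\&|]/\\&/g')"

  sed -e "s|{{INGESTION-URL}}|${escaped}|g"
}

# Example call (placeholders, not real values):
# render_ingestion_url "<metrics-ingestion-endpoint>" < monitoringstack.yaml | oc apply -f -
```

Piping the rendered manifest straight into `oc apply -f -` keeps the real endpoint out of the saved file, which may be convenient if the manifest is kept in version control.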
Confirm your Prometheus instance is running successfully by running the following command:
oc get pods -n centralized-monitoring
Example output
NAME                            READY   STATUS    RESTARTS   AGE
alertmanager-prometheus-coo-0   2/2     Running   0          2m42s
alertmanager-prometheus-coo-1   2/2     Running   0          2m42s
prometheus-prometheus-coo-0     3/3     Running   0          2m43s
Note that 2 alertmanager instances are also created, which you can configure to set up custom alerting. We will not be covering that in this guide.
To disable Alertmanager, set the following in the spec section of the COO MonitoringStack custom resource:
spec:
  alertmanagerConfig:
    disabled: true
Validating the Monitoring Stack
Generate Metrics
Expose your applications by creating routes
oc expose svc prometheus-coo-example-app -n test1
oc expose svc coo-example-app -n test2
Access the routes from your browser or terminal to generate metrics
Run the commands below to get the route URLs
oc get route prometheus-coo-example-app -n test1
oc get route coo-example-app -n test2
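If you prefer to generate traffic from a terminal rather than a browser, the route host can be read with a JSONPath query and requested in a loop. This is a minimal sketch; `generate_traffic` is a hypothetical helper, and it assumes the routes created by `oc expose` are served over plain HTTP.

```shell
# Sketch only: send a number of GET requests to a route's root path.
# Arguments: namespace, route name, request count (default 10).
generate_traffic() {
  local namespace="$1" route="$2" count="${3:-10}"
  local host i=0

  # Read the hostname assigned to the route.
  host="$(oc get route "$route" -n "$namespace" -o jsonpath='{.spec.host}')" || return 1
  if [ -z "$host" ]; then
    echo "route $route has no host" >&2
    return 1
  fi

  # Discard the response bodies; only the request counters matter here.
  while [ "$i" -lt "$count" ]; do
    curl -s -o /dev/null "http://${host}/"
    i=$((i + 1))
  done

  echo "sent ${count} requests to http://${host}/"
}

# Example calls:
# generate_traffic test1 prometheus-coo-example-app 20
# generate_traffic test2 coo-example-app 20
```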
Use the HOST/PORT output to access the applications and generate metrics.
Execute a query on the Prometheus pod to return the total HTTP request metric
oc -n centralized-monitoring exec -c prometheus prometheus-prometheus-coo-0 -- curl -s 'http://localhost:9090/api/v1/query?query=http_requests_total'
Example output
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "http_requests_total",
          "code": "200",
          "endpoint": "web",
          "instance": "10.131.0.18:8080",
          "job": "prometheus-coo-example-app",
          "method": "get",
          "namespace": "test1",
          "pod": "prometheus-coo-example-app-6d57b4d844-255cr",
          "service": "prometheus-coo-example-app"
        },
        "value": [
          1744876084.1,
          "7"
        ]
      },
      {
        "metric": {
          "__name__": "http_requests_total",
          "code": "404",
          "endpoint": "web",
          "instance": "10.131.0.18:8080",
          "job": "prometheus-coo-example-app",
          "method": "get",
          "namespace": "test1",
          "pod": "prometheus-coo-example-app-6d57b4d844-255cr",
          "service": "prometheus-coo-example-app"
        },
        "value": [
          1744876084.1,
          "0"
        ]
      }
    ]
  }
}
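Before looking for the metrics in Azure Monitor, it can help to confirm that remote write is not failing on the Prometheus side. The sketch below reads the Prometheus pod's own /metrics endpoint and sums the `prometheus_remote_storage_samples_failed_total` counter across remote write queues. The function name `remote_write_failed_samples` is hypothetical, and the pod and namespace names are the ones used in this guide.

```shell
# Sketch only: sum the failed remote write samples reported by Prometheus.
# Prints the total, or a message if the counter is not present at all.
remote_write_failed_samples() {
  oc -n centralized-monitoring exec -c prometheus prometheus-prometheus-coo-0 -- \
    curl -s http://localhost:9090/metrics |
    awk '
      # Match sample lines only; "# HELP" and "# TYPE" lines start with "#".
      /^prometheus_remote_storage_samples_failed_total/ { sum += $NF; seen = 1 }
      END {
        if (seen) print sum
        else print "no remote write metrics found"
      }'
}

# Example call:
# remote_write_failed_samples
```

A total that stays at 0 while traffic is being generated suggests samples are being accepted. A growing total is a cue to check the Prometheus pod logs, which this guide sets to debug level.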
View Metrics in Azure Monitor
Navigate to your Azure Monitor workspace and run the following PromQL query in the Prometheus explorer
coo_responses_total
View Metrics in Grafana
In the Azure portal, navigate to your Azure Managed Grafana workspace. In the Overview tab's Essentials section, select the Endpoint URL to access your Grafana instance. Single sign-on via Microsoft Entra ID has been configured for you automatically.
Go to the Explore tab to begin viewing metrics in Grafana.