ROSA - Federating Metrics to AWS Prometheus

Last edited: April 4, 2025
Published: April 3, 2025
Authors: Kevin Collins,; Paul Czarkowski

Tags:

This content is authored by Red Hat experts, but has not yet been tested on every supported configuration.

Federating Metrics from ROSA is a bit tricky as the cluster metrics require pulling from its /federated endpoint while the user workload metrics require using the prometheus remoteWrite configuration.

This guide will walk you through using the MOBB Helm Chart to deploy the necessary agents to federate the metrics into AWS Prometheus and then use Grafana to visualize those metrics.

As a bonus it will set up a CloudWatch datasource to view any metrics or logs you have in Cloud Watch.

Make sure to use a region where Amazon Prometheus service is supported

Prerequisites

Set up environment

Create environment variables

export CLUSTER=my-cluster
export REGION=us-east-2
export PROM_NAMESPACE=custom-metrics
export PROM_SA=aws-prometheus-proxy
export SCRATCH_DIR=/tmp/scratch
export OIDC_PROVIDER=$(oc get authentication.config.openshift.io cluster -o json | jq -r .spec.serviceAccountIssuer| sed -e "s/^https:\/\///")
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
export AWS_PAGER=""
mkdir -p $SCRATCH_DIR

Create namespace
```
oc new-project $PROM_NAMESPACE
```

Deploy Operators

Add the MOBB chart repository to your Helm

helm repo add mobb https://rh-mobb.github.io/helm-charts/

Update your repositories
```
helm repo update
```

Use the mobb/operatorhub chart to deploy the needed operators

helm upgrade -n $PROM_NAMESPACE custom-metrics-operators \
  mobb/operatorhub --install \
  --values https://raw.githubusercontent.com/rh-mobb/helm-charts/main/charts/rosa-aws-prometheus/files/operatorhub.yaml

Wait for the Operator to install

oc rollout status deployment grafana-operator-controller-manager-v5

You should see the following after a few minutes

deployment "grafana-operator-controller-manager-v5" successfully rolled out

Deploy and Configure the AWS Sigv4 Proxy and the Grafana Agent

Create a Policy for access to AWS CloudWatch

cat <<EOF > $SCRATCH_DIR/PermissionPolicyCloudWatch.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowReadingMetricsFromCloudWatch",
            "Effect": "Allow",
            "Action": [
                "cloudwatch:DescribeAlarmsForMetric",
                "cloudwatch:DescribeAlarmHistory",
                "cloudwatch:DescribeAlarms",
                "cloudwatch:ListMetrics",
                "cloudwatch:GetMetricStatistics",
                "cloudwatch:GetMetricData"
            ],
            "Resource": "*"
        },
        {
            "Sid": "AllowReadingLogsFromCloudWatch",
            "Effect": "Allow",
            "Action": [
                "logs:DescribeLogGroups",
                "logs:GetLogGroupFields",
                "logs:StartQuery",
                "logs:StopQuery",
                "logs:GetQueryResults",
                "logs:GetLogEvents"
            ],
            "Resource": "*"
        },
        {
            "Sid": "AllowReadingTagsInstancesRegionsFromEC2",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeTags",
                "ec2:DescribeInstances",
                "ec2:DescribeRegions"
            ],
            "Resource": "*"
        },
        {
            "Sid": "AllowReadingResourcesForTags",
            "Effect": "Allow",
            "Action": "tag:GetResources",
            "Resource": "*"
        }
    ]
}
EOF

Apply the Policy

CW_POLICY=$(aws iam create-policy --policy-name $PROM_SA-cw \
  --policy-document file://$SCRATCH_DIR/PermissionPolicyCloudWatch.json \
  --query 'Policy.Arn' --output text)
echo $CW_POLICY

Create a Trust Policy

cat <<EOF > $SCRATCH_DIR/TrustPolicy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:sub": [
            "system:serviceaccount:${PROM_NAMESPACE}:${PROM_SA}",
            "system:serviceaccount:${PROM_NAMESPACE}:grafana-sa"
          ]
        }
      }
    }
  ]
}
EOF

Create Role for AWS Prometheus and CloudWatch

PROM_ROLE=$(aws iam create-role \
  --role-name "prometheus-$CLUSTER" \
  --assume-role-policy-document file://$SCRATCH_DIR/TrustPolicy.json \
  --query "Role.Arn" --output text)
echo $PROM_ROLE

Attach the Policies to the Role

Note: this policy is very permissive, you may want to restrict access for production use cases.

aws iam attach-role-policy \
  --role-name "prometheus-$CLUSTER" \
   --policy-arn arn:aws:iam::aws:policy/AmazonPrometheusFullAccess


aws iam attach-role-policy \
  --role-name "prometheus-$CLUSTER" \
  --policy-arn $CW_POLICY

Create an AWS Prometheus Workspace

PROM_WS=$(aws amp create-workspace --alias $CLUSTER \
  --query "workspaceId" --output text)
echo $PROM_WS

Deploy AWS Prometheus Proxy Helm Chart

helm upgrade --install -n $PROM_NAMESPACE --set "aws.region=$REGION" \
--set "aws.roleArn=$PROM_ROLE" --set "fullnameOverride=$PROM_SA" \
--set "aws.workspaceId=$PROM_WS" \
--set "rosa.clusterName=$CLUSTER" \
--set "grafana-cr.serviceAccountAnnotations.eks\.amazonaws\.com/role-arn=$PROM_ROLE" \
--set "grafana-cr.sigv4Proxy.region=$REGION" \
 aws-prometheus-proxy mobb/rosa-aws-prometheus

Configure remoteWrite for user workloads

cat << EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    prometheus:
      remoteWrite:
        - url: "http://aws-prometheus-proxy-alloy.$PROM_NAMESPACE.svc.cluster.local:9999/api/v1/metrics/write"
EOF

Verify Metrics are being collected

Access Grafana and check for metrics

echo "https://$(oc get route -n custom-metrics grafana-route -o jsonpath='{.status.ingress[0].host}')"

Browse to the URL provided in the above command and log in with your OpenShift Credentials
Enable Admin by hitting sign in and user admin and password

Browse to /datasources and verify that cloudwatch and prometheus are present

If not, you may have hit a race condition that can be fixed by running the following then trying again

kubectl delete grafanadatasources.integreatly.org aws-prometheus-proxy-prometheus
helm upgrade --install -n $PROM_NAMESPACE --set "aws.region=$REGION" \
  --set "aws.roleArn=$PROM_ROLE" --set "fullnameOverride=$PROM_SA" \
  --set "aws.workspaceId=$PROM_WS" \
  --set "grafana-cr.serviceAccountAnnotations.eks\.amazonaws\.com/role-arn=$PROM_ROLE" \
  aws-prometheus-proxy mobb/rosa-aws-prometheus

Browse to /dashboards and select the aws-prometheus-proxy -> NodeExporter / Use Method / Cluster dashboard

Cleanup

Delete the aws-prometheus-proxy Helm Release

helm delete -n custom-metrics aws-prometheus-proxy

Delete the custom-metrics-operators Helm Release

helm delete -n custom-metrics custom-metrics-operators

Delete the custom-metrics namespace

kubectl delete namespace custom-metrics

Detach AWS Role Policies

aws iam detach-role-policy \
  --role-name "prometheus-$CLUSTER" \
  --policy-arn arn:aws:iam::aws:policy/AmazonPrometheusFullAccess



aws iam detach-role-policy \
  --role-name "prometheus-$CLUSTER" \
  --policy-arn $CW_POLICY

Delete the custom Cloud Watch Policy

aws iam delete-policy --policy-arn $CW_POLICY

Delete the AWS Prometheus Role

aws iam delete-role --role-name "prometheus-$CLUSTER"

Delete AWS Prometheus Workspace

aws amp delete-workspace --workspace-id $PROM_WS

ROSA - Federating Metrics to AWS Prometheus

Prerequisites

Set up environment

Deploy Operators

Deploy and Configure the AWS Sigv4 Proxy and the Grafana Agent

Verify Metrics are being collected

Cleanup

Interested in contributing to these docs?

Products

Tools

Try, buy & sell

Communicate

About Red Hat

Subscribe to our newsletter, Red Hat Shares

Red Hat legal and privacy links

Red Hat legal and privacy links