Operator Metering is a chargeback and reporting tool that provides accountability for how resources are used across a Kubernetes cluster. Cluster admins can schedule reports based on historical usage data per pod, per namespace, or cluster-wide.

There are many out-of-the-box report queries made available when you install the metering operator. For example, if admins want to measure the CPU (or memory) usage of cluster nodes (or pods), all they need to do is install the metering operator and write a Report custom resource to produce a report on a monthly, hourly, or other schedule.
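
Once the operator is installed, you can list the built-in report queries to see what is available out of the box (we will use this same command later when we drill into the node-level queries):

$ oc get reportqueries -n openshift-metering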

Use Cases

There are a few requirements I’m hearing consistently from customers. 

  1. The worker nodes in an OpenShift cluster for non-prod environments may not be running all the time (for example, an administrator wants to switch off a few worker nodes based on available capacity or during weekends). Therefore, they would like to measure CPU (or memory) usage of a node on a monthly basis so that the infrastructure team could charge back the users based on actual node utilization.
  2. Customers have a deployment model where they deploy a dedicated OpenShift cluster for specific teams. They could install the metering operator and get the node usage; however, they want the chargeback report to also state which team (or line of business, LOB) the worker nodes are used for. This use case also extends to a shared dedicated cluster whose nodes are “labeled” such that each label identifies which LOB’s or team’s workload runs on those nodes.
  3. In a shared cluster, the operations team would like to charge teams back based on the pods’ running time (or perhaps on CPU or memory consumption). Again, they want the report to state which team (or LOB) the pods belong to.

These requirements lead to creating a few custom resources in the cluster, and in this article we will learn how to do that easily. Please note that installing the metering operator is beyond the scope of this article; you may refer to the installation documentation here. To learn more about how to use the out-of-the-box reports, refer to the documentation here.

How Does Metering Work?

Before we move on to creating new custom resources for the use cases mentioned in the previous section, let’s decode a bit of how OpenShift Metering works. The metering operator creates a total of six custom resource types once it is installed. Of the six, the following need a bit more explanation.

  1. ReportDataSources (rds): An rds defines what data is available and can be used by ReportQuery or Report custom resources. It also allows data to be fetched from several sources. In OpenShift, the data is pulled from Prometheus as well as from a ReportQuery (rq) custom resource.
  2. ReportQuery (rq): An rq contains the SQL queries used to analyze the data stored by an rds. If an rq object is referenced by a Report object, the rq also controls what is reported when the report is run. If referenced by an rds object, the rq instructs metering to create a view within the Presto tables (created as part of the metering installation) based on the rendered query.
  3. Report: This custom resource causes reports to be generated using the configured ReportQuery resource. It is the primary resource an end user of the metering operator interacts with. A Report can be configured to run on a schedule.
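
To see which resource types the operator registered in your cluster, you can query its API group (assuming the CRDs live under metering.openshift.io, which matches the apiVersion used for the Report object later in this article):

$ oc api-resources --api-group=metering.openshift.io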

There are many rds and rq objects available out of the box. Since we are focusing on node-level metering, I will show the ones we need to understand in order to write our own customized queries. Run the following commands while in the “openshift-metering” project:

$ oc project openshift-metering
$ oc get reportdatasources | grep node
node-allocatable-cpu-cores
node-allocatable-memory-bytes
node-capacity-cpu-cores
node-capacity-memory-bytes
node-cpu-allocatable-raw
node-cpu-capacity-raw
node-memory-allocatable-raw
node-memory-capacity-raw

We want to focus on two of these rds, “node-capacity-cpu-cores” and “node-cpu-capacity-raw”, since we are measuring CPU consumption. Let’s start with node-capacity-cpu-cores and run the following command to see how it collects data from Prometheus:

$ oc get reportdatasource/node-capacity-cpu-cores -o yaml

<showing only the relevant snippet below>

spec:
  prometheusMetricsImporter:
    query: |
      kube_node_status_capacity_cpu_cores * on(node) group_left(provider_id) max(kube_node_info) by (node, provider_id)

You can see the Prometheus query used to fetch data from Prometheus and store it in the Presto tables. Let’s run the same query in OpenShift’s metrics console and check the result. I have an OpenShift cluster with two worker nodes (16 cores each) and three master nodes (8 cores each). The last column, “value”, records the number of cores assigned to each node.
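
If you prefer the command line over the console, you can run the same query against the cluster’s Prometheus HTTP API (the prometheus-k8s route in the openshift-monitoring namespace is an assumption based on a typical OpenShift 4 monitoring stack):

$ PROM=$(oc get route prometheus-k8s -n openshift-monitoring -o jsonpath='{.spec.host}')
$ curl -sk -G -H "Authorization: Bearer $(oc whoami -t)" \
    "https://$PROM/api/v1/query" \
    --data-urlencode "query=kube_node_status_capacity_cpu_cores * on(node) group_left(provider_id) max(kube_node_info) by (node, provider_id)"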


So, the data gets collected and stored in the Presto tables. Let’s now focus on a couple of reportquery (rq) custom resources: 

$ oc project openshift-metering
$ oc get reportqueries | grep node-cpu
node-cpu-allocatable
node-cpu-allocatable-raw
node-cpu-capacity
node-cpu-capacity-raw
node-cpu-utilization

We are keen to focus on the “node-cpu-capacity” and “node-cpu-capacity-raw” rqs here. If you describe these report queries, you will see that they compute data (how long a node has been up, how many CPUs are assigned, and so forth) and aggregate it.
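
For instance, you can dump one of these report queries and inspect the SQL it renders:

$ oc get reportquery node-cpu-capacity -n openshift-metering -o yaml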

In fact, the chain below shows how the two rds and the two rqs are connected; each resource consumes the one that follows it:

node-cpu-capacity (rq) → node-cpu-capacity-raw (rds) → node-cpu-capacity-raw (rq) → node-capacity-cpu-cores (rds)

Customizing Reports

Let’s focus on writing our customized rds and rq. First, we need to change the Prometheus query to include whether a node is functioning as a master or worker node, and then to include an appropriate node label that identifies which team the node belongs to. The “kube_node_role” Prometheus metric captures the node role; whether a node is a master or worker appears in the metric’s “role” label.

The “kube_node_labels” Prometheus metric captures all the labels applied to a node. Each label is captured as “label_<label>”. So, for example, if I apply a label with the key “node_lob” to a node, the Prometheus metric will capture it as “label_node_lob”.
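
For example, here is how you might apply such a label to a node (the node name and LOB value below are placeholders; adjust them for your cluster):

$ oc label node worker-node-1 node_lob=engineering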

Now, all we need to do is combine the original query with these two additional Prometheus metrics to get the relevant data. Here’s what the resulting query looks like:

((kube_node_status_capacity_cpu_cores * on(node) group_left(provider_id) max(kube_node_info) by (node, provider_id)) * on(node) group_left (role) kube_node_role{role='worker'}) * on(node) group_right(provider_id, role) kube_node_labels

Let’s run this query in OpenShift’s metrics console and verify that we get both the label (node_lob) and the role information. Below, the output includes label_node_lob as well as role (the role column is not visible because the query outputs many columns, but it is captured):

So, we will write four custom resources. For simplicity, I have uploaded those custom resources here:

  1. rds-custom-node-capacity-cpu-cores.yaml: defines the Prometheus query (a sketch of this file is shown after this list)
  2. rq-custom-node-cpu-capacity-raw.yaml: refers to the above rds; computes raw data
  3. rds-custom-node-cpu-capacity-raw.yaml: refers to the above rq and creates a view in Presto
  4. rq-custom-node-cpu-capacity-with-cpus-labels.yaml: refers to the rds mentioned in No. 3 above and computes data based on the input start and end dates. This is also the file where we retrieve the role and node_lob columns.
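
To give a sense of the structure, here is a minimal sketch of what the first file might look like. It mirrors the spec layout of the built-in node-capacity-cpu-cores rds we inspected earlier and embeds the modified Prometheus query; the metadata name here is illustrative, and the files uploaded with this article remain the source of truth:

apiVersion: metering.openshift.io/v1
kind: ReportDataSource
metadata:
  # illustrative name; the uploaded file is authoritative
  name: custom-node-capacity-cpu-cores
  namespace: openshift-metering
spec:
  prometheusMetricsImporter:
    # the combined query from above: capacity + node role + node labels
    query: |
      ((kube_node_status_capacity_cpu_cores * on(node) group_left(provider_id) max(kube_node_info) by (node, provider_id)) * on(node) group_left (role) kube_node_role{role='worker'}) * on(node) group_right(provider_id, role) kube_node_labels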

Once you have written these YAML files, switch to the openshift-metering project and run the following commands:

$ oc project openshift-metering
$ oc create -f rds-custom-node-capacity-cpu-cores.yaml
$ oc create -f rq-custom-node-cpu-capacity-raw.yaml
$ oc create -f rds-custom-node-cpu-capacity-raw.yaml
$ oc create -f rq-custom-node-cpu-capacity-with-cpus-labels.yaml

Finally, all you need to do now is write a Report custom object that refers to the last rq object created above. You could write one as shown below. This report runs immediately and covers data between September 15 and September 30.

$ cat report-immediate.yaml
apiVersion: metering.openshift.io/v1
kind: Report
metadata:
  name: custom-role-node-cpu-capacity-labels-immediate
  namespace: openshift-metering
spec:
  query: custom-role-node-cpu-capacity-labels
  reportingStart: "2020-09-15T00:00:00Z"
  reportingEnd: "2020-09-30T00:00:00Z"
  runImmediately: true

$ oc create -f report-immediate.yaml
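
Reports can take a little while to finish. You can check progress by listing the Report objects and inspecting the status of the one we just created (the exact status fields vary by metering version):

$ oc get reports -n openshift-metering
$ oc get report custom-role-node-cpu-capacity-labels-immediate -n openshift-metering -o yaml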

Once the report has run, you can download the data (in CSV or JSON format) via this URL (change the domain name accordingly):

https://metering-openshift-metering.DOMAIN_NAME/api/v1/reports/get?name=custom-role-node-cpu-capacity-labels-immediate&namespace=openshift-metering&format=csv
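
For example, assuming the reporting API expects your OpenShift bearer token (change DOMAIN_NAME as above), you could fetch the CSV from the command line:

$ curl -k -H "Authorization: Bearer $(oc whoami -t)" \
  "https://metering-openshift-metering.DOMAIN_NAME/api/v1/reports/get?name=custom-role-node-cpu-capacity-labels-immediate&namespace=openshift-metering&format=csv" \
  -o report.csv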

The CSV snapshot below shows the captured data, which includes both the role and node_lob columns. The “node_capacity_cpu_core_seconds” column must be divided by “node_capacity_cpu_cores” to arrive at the node’s run time in seconds:
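
As a quick illustration of that arithmetic (using the 16-core worker nodes from the earlier example, with an illustrative figure): if such a node reports a node_capacity_cpu_core_seconds value of 1,382,400 for the reporting period, its run time is 1,382,400 / 16 = 86,400 seconds, meaning the node was up for a full 24 hours.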

Summary

The metering operator is pretty cool and can run on any OpenShift cluster, wherever it is deployed. It provides an extensible framework so that customers can write their own custom resources to create reports per their needs. All the code used above is available here.