Background

Open Cluster Management (OCM) is a community-driven project focused on multicluster and multicloud scenarios for Kubernetes applications. In OCM, the multicluster scheduling capabilities are provided by Placement. As we discussed in the previous article, Using the Open Cluster Management Placement for Multicluster Scheduling, you can use Placement to filter clusters by label or claim selector. Placement also provides some default prioritizers, which can be used to sort clusters and select the most suitable ones.

Two of the default prioritizers are ResourceAllocatableCPU and ResourceAllocatableMemory. They provide the capability to sort clusters based on the allocatable CPU and memory. However, when considering resource-based scheduling, the limitation is that "AllocatableCPU" and "AllocatableMemory" are static values that don't change even if the cluster is running out of resources. In some cases, the prioritizer also needs extra data to calculate the score of a managed cluster. For example, you might want to schedule based on resource monitoring data from the cluster. For these reasons, we need a more extensible way to support scheduling based on customized scores.

The following features introduced in this article are based on Open Cluster Management v0.7.0 and also delivered in Red Hat Advanced Cluster Management for Kubernetes 2.5.

What is Placement extensible scheduling?

OCM Placement introduces the AddOnPlacementScore API to support scheduling based on customized scores. This API stores the customized scores, which Placement can then consume. For more details on the definition of AddOnPlacementScore, see types_addonplacementscore.go. See the following AddOnPlacementScore example:

apiVersion: cluster.open-cluster-management.io/v1alpha1
kind: AddOnPlacementScore
metadata:
  name: default
  namespace: cluster1
status:
  conditions:
  - lastTransitionTime: "2021-10-28T08:31:39Z"
    message: AddOnPlacementScore updated successfully
    reason: AddOnPlacementScoreUpdated
    status: "True"
    type: AddOnPlacementScoreUpdated
  validUntil: "2021-10-29T18:31:39Z"
  scores:
  - name: "cpuAvailable"
    value: 66
  - name: "memAvailable"
    value: 55
  • conditions: Contains the different condition statuses for this AddOnPlacementScore.
  • validUntil: Defines the valid time of the scores. After this time, the scores are considered to be invalid by placement. Nil means no expiration. The controller owning this resource should keep the scores up-to-date.
  • scores: Contains a list of score names and values of this managed cluster. In the above example, the API contains a list of customized scores: cpuAvailable and memAvailable.

All the customized score information is stored in status, as we don't expect users to update it.

  • As a score provider, a third-party controller could run on either the hub or managed cluster to maintain the lifecycle of AddOnPlacementScore and update the score in status.
  • As a user, you need to know the resource name default and the customized score names cpuAvailable and memAvailable to specify them in the placement YAML to select clusters. For example, the following placement selects the top three clusters with the highest cpuAvailable score:
apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Placement
metadata:
  name: placement
  namespace: ns1
spec:
  numberOfClusters: 3
  prioritizerPolicy:
    mode: Exact
    configurations:
    - scoreCoordinate:
        type: AddOn
        addOn:
          resourceName: default
          scoreName: cpuAvailable
      weight: 1
  • In Placement, if the user defines the scoreCoordinate type as AddOn, the Placement controller gets the AddOnPlacementScore resource named "default" in each cluster's namespace, reads the score "cpuAvailable" from the score list, and uses that score to sort clusters.

You can refer to the enhancements to learn more about the design. In the design, lifecycle maintenance (create, update, and delete) of the AddOnPlacementScore custom resource is not covered, as we expect the customized score provider itself to manage it. In this article, we use an example to show you how to implement a third-party controller to update your own scores and extend the multicluster scheduling capability with them.

How to implement a customized score provider

The example code is in the resource-usage-collect GitHub repository. It provides the score of the cluster's available CPU and memory, which can reflect the cluster’s real-time resource utilization. It is developed with OCM addon-framework and can be installed as an add-on plugin to update customized scores in AddOnPlacementScore. See Add-on Developer Guide to learn more about how to develop an addon.

The resource-usage-collect add-on follows the hub-agent architecture, as shown below.

(Figure: placement extensible scheduling with the resource-usage-collect add-on's hub-agent architecture)

The resource-usage-collect add-on contains a controller and an agent.

  • The resource-usage-collect-controller runs on the hub cluster. It is responsible for creating the ManifestWork for resource-usage-collect-agent in each cluster namespace.
  • On each managed cluster, the work agent watches the ManifestWork and installs the resource-usage-collect-agent. The resource-usage-collect-agent is the core part of this add-on: it creates an AddOnPlacementScore for its cluster on the hub cluster and refreshes the scores and validUntil every 60 seconds.

When the AddOnPlacementScore is ready, you can specify the customized score in a Placement to select clusters.

The workflow and logic of the resource-usage-collect add-on are easy to understand. The following steps will help you get started:

Prepare an OCM environment with 2 ManagedClusters

  1. Run the following command to prepare an environment with the setup dev environment by kind script:
curl -sSL https://raw.githubusercontent.com/open-cluster-management-io/OCM/main/solutions/setup-dev-environment/local-up.sh | bash
  2. Run the following command to confirm that two ManagedClusters and a default ManagedClusterSet were created:
$ clusteradm get clusters
NAME       ACCEPTED   AVAILABLE   CLUSTERSET   CPU   MEMORY       KUBERNETES VERSION
cluster1   true       True        default      24    49265496Ki   v1.23.4
cluster2   true       True        default      24    49265496Ki   v1.23.4

$ clusteradm get clustersets
NAME      BOUND NAMESPACES   STATUS
default                      2 ManagedClusters selected
  3. Run the following commands to bind the default ManagedClusterSet to the default namespace:
clusteradm clusterset bind default --namespace default
$ clusteradm get clustersets
NAME      BOUND NAMESPACES   STATUS
default   default            2 ManagedClusters selected

Install the resource-usage-collect add-on

  1. Run the following command to clone the source code:
git clone git@github.com:JiahaoWei-RH/resource-usage-collect.git 
cd resource-usage-collect
  2. Run the following command to prepare the image:
# get imagebuilder first
go get github.com/openshift/imagebuilder/cmd/imagebuilder@v1.2.1
export PATH=$PATH:$(go env GOPATH)/bin
# build image
make images
  3. Run the following command to deploy the resource-usage-collect add-on:
make deploy
  4. Run the following commands to verify the installation:

On the hub cluster, verify that the resource-usage-collect-controller pod is running.

$ kubectl get pods -n open-cluster-management | grep resource-usage-collect-controller
resource-usage-collect-controller-55c58bbc5-t45dh 1/1 Running 0 71s

On the hub cluster, verify that an AddOnPlacementScore is generated for each managed cluster.

$ kubectl get addonplacementscore -A
NAMESPACE   NAME                   AGE
cluster1    resource-usage-score   3m23s
cluster2    resource-usage-score   3m24s

The AddOnPlacementScore status should contain a list of scores, as follows:

$ kubectl get addonplacementscore -n cluster1 resource-usage-score -oyaml
apiVersion: cluster.open-cluster-management.io/v1alpha1
kind: AddOnPlacementScore
metadata:
  creationTimestamp: "2022-08-08T06:46:04Z"
  generation: 1
  name: resource-usage-score
  namespace: cluster1
  resourceVersion: "3907"
  uid: 6c4280e4-38be-4d45-9c73-c18c84799781
status:
  scores:
  - name: cpuAvailable
    value: 12
  - name: memAvailable
    value: 4

If the AddOnPlacementScore is not created, or there are no scores in its status, go into the managed cluster and check whether the resource-usage-collect-agent pod is running by using the following command:

$ kubectl get pods -n default | grep resource-usage-collect-agent
resource-usage-collect-agent-5b85cbf848-g5kqm 1/1 Running 0 2m

Select clusters with the customized scores

If everything is running correctly, you can try to create a Placement and select clusters with the customized scores.

  1. Create a Placement to select one cluster with the highest cpuAvailable score.
cat << EOF | kubectl apply -f -
apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Placement
metadata:
  name: placement1
  namespace: default
spec:
  numberOfClusters: 1
  clusterSets:
  - default
  prioritizerPolicy:
    mode: Exact
    configurations:
    - scoreCoordinate:
        type: AddOn
        addOn:
          resourceName: resource-usage-score
          scoreName: cpuAvailable
      weight: 1
EOF
  2. Verify the Placement decision.
$ kubectl describe placementdecision -n default | grep Status -A 3
Status:
  Decisions:
    Cluster Name:  cluster1
    Reason:

Cluster1 is selected by the PlacementDecision.

Run the following commands to get the customized score in AddOnPlacementScore and the cluster score set by Placement. You can see that the cpuAvailable score is 12 in AddOnPlacementScore. This value is also the cluster score in the Placement events, which indicates that the Placement is using the customized score to select clusters.

$ kubectl get addonplacementscore -A -o=jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.status.scores}{"\n"}{end}'
cluster1 [{"name":"cpuAvailable","value":12},{"name":"memAvailable","value":4}]
cluster2 [{"name":"cpuAvailable","value":12},{"name":"memAvailable","value":4}]
$ kubectl describe placement -n default placement1 | grep Events -A 10
Events:
  Type    Reason          Age   From                 Message
  ----    ------          ----  ----                 -------
  Normal  DecisionCreate  50s   placementController  Decision placement1-decision-1 is created with placement placement1 in namespace default
  Normal  DecisionUpdate  50s   placementController  Decision placement1-decision-1 is updated with placement placement1 in namespace default
  Normal  ScoreUpdate     50s   placementController  cluster1:12 cluster2:12

Now you know how to install the resource-usage-collect add-on and consume the customized score to select clusters. Next, let's take a deeper look at some key points when you consider implementing a customized score provider.

Where to run the customized score provider

The customized score provider could run on either the hub or the managed cluster. Based on your use case, you should be able to tell whether the controller should run on the hub cluster or on each managed cluster.

In our example, the customized score provider is developed with the addon-framework, which follows the hub-agent architecture. The resource-usage-collect-agent is the real score provider. It is installed on each managed cluster, retrieves the available CPU and memory of the managed cluster, calculates a score, and updates it in AddOnPlacementScore. The resource-usage-collect-controller just takes care of installing the agent.

In other cases, for example, if you want to use the metrics from Thanos to calculate a score for each cluster, then the customized score provider only needs to be placed on the hub, as Thanos has all the metrics collected from each managed cluster.
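
For instance, a hub-side provider could fetch per-cluster metrics with the Prometheus Go client, which the Thanos query API is compatible with. The following is a minimal sketch under stated assumptions: the thanos-query address and the PromQL expression are placeholders you would replace with your own, and error handling is kept to a minimum.

package main

import (
	"context"
	"fmt"
	"time"

	"github.com/prometheus/client_golang/api"
	promv1 "github.com/prometheus/client_golang/api/prometheus/v1"
	"github.com/prometheus/common/model"
)

// clusterCPUUsage fetches one metric value for a managed cluster from a
// Thanos query endpoint. The metric name is hypothetical; use whatever
// recording rule your monitoring stack provides.
func clusterCPUUsage(ctx context.Context, cluster string) (float64, error) {
	client, err := api.NewClient(api.Config{Address: "http://thanos-query:9090"}) // placeholder address
	if err != nil {
		return 0, err
	}
	query := fmt.Sprintf(`cluster:cpu_usage:ratio{cluster=%q}`, cluster) // placeholder PromQL
	result, warnings, err := promv1.NewAPI(client).Query(ctx, query, time.Now())
	if err != nil {
		return 0, err
	}
	if len(warnings) > 0 {
		fmt.Println("query warnings:", warnings)
	}
	vector, ok := result.(model.Vector)
	if !ok || len(vector) == 0 {
		return 0, fmt.Errorf("no samples for cluster %s", cluster)
	}
	return float64(vector[0].Value), nil
}

func main() {
	usage, err := clusterCPUUsage(context.Background(), "cluster1")
	fmt.Println(usage, err)
}

The hub-side controller would then normalize the returned value to a score and update the AddOnPlacementScore in each cluster's namespace on the hub.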

How to maintain the AddOnPlacementScore CR lifecycle

In our example, the code to maintain the AddOnPlacementScore CR is in pkg/addon/agent/agent.go.

  • When should the score be created?

    The AddOnPlacementScore CR can be created together with the ManagedCluster, or created on demand to reduce the number of objects on the hub.

    In our example, the add-on creates an AddOnPlacementScore for each managed cluster if it does not exist, and calculates a score when creating the CR for the first time.

  • When should the score be updated?

    We recommend that you set validUntil when updating the score so that the Placement controller can tell whether the score is still valid in case the provider fails to update it for a long time.

    The score could be updated when your monitoring data changes, or when you need to update it before it expires.

    In our example, in addition to recalculating and updating the score every 60 seconds, an update is also triggered when the node or pod resources in the managed cluster change (see the sketch after this list).
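
The following is a minimal sketch of such an updater, assuming the generated clientset from open-cluster-management.io/api; calculateScores is a hypothetical helper standing in for the logic in pkg/addon/agent/calculate.go. A real agent would call syncScore every 60 seconds and also on node or pod changes.

package agent

import (
	"context"
	"time"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	clusterclient "open-cluster-management.io/api/client/cluster/clientset/versioned"
	clusterv1alpha1 "open-cluster-management.io/api/cluster/v1alpha1"
)

const scoreCRName = "resource-usage-score"

// calculateScores is a hypothetical stand-in for the agent's score calculation.
func calculateScores() (cpu, mem int32) { return 0, 0 }

// syncScore creates the AddOnPlacementScore CR in the cluster namespace on
// first run, then refreshes the scores and validUntil in its status.
func syncScore(ctx context.Context, client clusterclient.Interface, clusterNS string) error {
	scores := client.ClusterV1alpha1().AddOnPlacementScores(clusterNS)

	cr, err := scores.Get(ctx, scoreCRName, metav1.GetOptions{})
	if apierrors.IsNotFound(err) {
		cr, err = scores.Create(ctx, &clusterv1alpha1.AddOnPlacementScore{
			ObjectMeta: metav1.ObjectMeta{Name: scoreCRName, Namespace: clusterNS},
		}, metav1.CreateOptions{})
	}
	if err != nil {
		return err
	}

	cpu, mem := calculateScores()

	// Set validUntil so Placement can treat the scores as invalid if the
	// agent stops updating them.
	validUntil := metav1.NewTime(time.Now().Add(10 * time.Minute))
	cr.Status.ValidUntil = &validUntil
	cr.Status.Scores = []clusterv1alpha1.AddOnPlacementScoreItem{
		{Name: "cpuAvailable", Value: cpu},
		{Name: "memAvailable", Value: mem},
	}
	_, err = scores.UpdateStatus(ctx, cr, metav1.UpdateOptions{})
	return err
}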

How to calculate the score

The code to calculate the score is in pkg/addon/agent/calculate.go. A valid score must be in the range -100 to 100, so you need to normalize the scores before updating them in AddOnPlacementScore.

When normalizing the score, you might run into the following issues:

  • The score provider knows the max and min value of the customized scores.

    In this case, it is easy to achieve a smooth mapping by using a formula. If the actual value is X, and X is in the interval [min, max], then score = 200 * (X - min) / (max - min) - 100. For example, with min = 0, max = 100, and X = 66, the score is 200 * 66 / 100 - 100 = 32.

  • The score provider doesn't know the max and min value of the customized scores.

    In this case, you need to set a max and min value by yourself, as without a max and min value, it is not possible to map a single value X to the range [-100, 100].

When X is greater than this max value, the cluster can be considered healthy enough to deploy applications, and the score can be set as 100. And if X is less than the min value, the score can be set as -100.

if X >= max
    score = 100
if X <= min
    score = -100

In our example, the resource-usage-collect-agent running on each managed cluster doesn't have a holistic view of the CPU/memory usage of all the clusters, so we manually set the max values as MAXCPUCOUNT and MAXMEMCOUNT in the code and set the min value as 0. The score calculation formula can then be simplified as follows: score = X / max * 100.
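
To make the two cases concrete, here is a minimal sketch of both normalization approaches in Go; the function names are illustrative and not taken from the example repository.

package score

// normalize maps a value from the known interval [min, max] onto the
// [-100, 100] range that Placement requires, clamping values that fall
// outside the bounds.
func normalize(value, min, max float64) int32 {
	switch {
	case value >= max:
		return 100
	case value <= min:
		return -100
	default:
		return int32(200*(value-min)/(max-min) - 100)
	}
}

// normalizeFromZero follows the simplified formula used in our example:
// with min fixed at 0 and a manually chosen max, it maps [0, max] onto
// [0, 100] instead of the full [-100, 100] range.
func normalizeFromZero(value, max float64) int32 {
	if value >= max {
		return 100
	}
	if value <= 0 {
		return 0
	}
	return int32(value / max * 100)
}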

Summary

In this article, we introduced what Placement extensible scheduling is and used an example to show how to implement a customized score provider. This article also listed three key points the developer needs to consider when implementing a third-party score provider. After reading this article, you should have a clear view of how Placement extensible scheduling can help you extend the multicluster scheduling capabilities.

All the features introduced in this article are based on Open Cluster Management v0.7.0 and are also delivered in Red Hat Advanced Cluster Management for Kubernetes 2.5. The latest features will continue to be updated in Extend the multicluster scheduling capabilities with placement.

Feel free to ask questions in the Open-cluster-management-io GitHub community or contact us by using Slack.