To understand the Secondary Scheduler Operator, we first need to understand the Kubernetes scheduler. The Kubernetes scheduler is a stable, enterprise-grade component of Kubernetes that decides where to place incoming pods through a two-step operation of filtering and scoring. It works well for applications whose pods can be scheduled one at a time, for example a web application. Customers now see the benefits of Kubernetes, such as DevOps practices and portability, and want to move specialized workloads such as HPC and Telco applications into containers and run them on Kubernetes. The default scheduler that works well for web servers does not work well for these specialized applications, because they have special requirements of the scheduler. For example:

  1. Coscheduling: start, execute, and finish all pods at the same time.
  2. Topology-aware scheduling: schedule pods based on node topology.
  3. Load-aware scheduling: schedule pods based on the load of the nodes.

So there is a need for new schedulers to run specialized workloads.

OpenShift is Red Hat’s enterprise Kubernetes distribution, which means OpenShift ships the same scheduler as upstream Kubernetes: stable and enterprise grade, but best suited for applications whose pods can be scheduled one at a time. At Red Hat we always listen to our customers and partners, and we have heard that they need a way to bring in their own scheduler, one that can best run their applications.

The Secondary Scheduler Operator allows customers or partners to bring their own scheduler into OpenShift and run their applications with that customized scheduler. OpenShift 4.x has been re-architected as a self-hosted platform that uses the same OpenShift constructs as any workload running on OpenShift. To safeguard the control plane components from a custom scheduler brought in by the end user (customer), we decided to have a secondary scheduler that has no impact on the control plane components.

Architecturally, the default scheduler is responsible for scheduling all workloads, including the control plane components. However, if customers choose to bring their own scheduler, they can leverage the Secondary Scheduler Operator to manage the workloads of their choice, while the control plane components continue to use the default scheduler shipped with OpenShift.
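Workloads opt into the secondary scheduler simply by naming it in their pod spec. The following is a minimal sketch; the Deployment name, labels, and image are hypothetical, while schedulerName matches the profile name configured later in this post:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-workload                         # hypothetical workload name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-workload
  template:
    metadata:
      labels:
        app: my-workload
    spec:
      schedulerName: secondary-scheduler    # opt this workload into the secondary scheduler
      containers:
      - name: app
        image: quay.io/example/my-workload:latest   # placeholder image

Any pod that does not set schedulerName continues to be scheduled by the default scheduler.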

The following diagram explains the overall flow of adding your own scheduler as a payload via the Secondary Scheduler Operator provided by Red Hat.

There is a separation of responsibility when a customer installs their own scheduler via this operator: customers are responsible for maintaining their custom scheduler, and Red Hat is responsible for the operator.

Example

In this example, we will walk you through how easy it is to install the load-aware scheduling plugin via the Secondary Scheduler Operator.

Step 1: Install the Secondary Scheduler Operator

  1. Create the namespace openshift-secondary-scheduler-operator in which to install the Secondary Scheduler Operator (see the command after this list).
  2. Open the OperatorHub console and search for the Secondary Scheduler Operator.
  3. Click Install to install the Secondary Scheduler Operator.
  4. Choose openshift-secondary-scheduler-operator as the project namespace.
  5. The operator is now ready to use.
$ oc create ns openshift-secondary-scheduler-operator
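If you prefer the CLI to the console, the same installation can be expressed as an OperatorGroup and a Subscription. This is a minimal sketch; the channel, package name, and catalog source are assumptions and should be verified against the entry you see in OperatorHub:

apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: openshift-secondary-scheduler-operator
  namespace: openshift-secondary-scheduler-operator
spec:
  targetNamespaces:
  - openshift-secondary-scheduler-operator
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: openshift-secondary-scheduler-operator
  namespace: openshift-secondary-scheduler-operator
spec:
  channel: stable                                   # assumed channel name
  name: openshift-secondary-scheduler-operator      # assumed package name in the catalog
  source: redhat-operators
  sourceNamespace: openshift-marketplace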

Step 2: Configure the Secondary Scheduler Operator to install the Trimaran scheduler as a secondary scheduler

  1. Create a config.yaml that defines the KubeSchedulerConfiguration for the Trimaran scheduler running the TargetLoadPacking plugin (shown after this list). The schedulerName should be set to secondary-scheduler.
  2. Run the commands shown after this list to obtain ${PROM_URL} and ${PROM_TOKEN}, and replace the placeholders in config.yaml with the real Prometheus endpoint and token.
  3. Create a ConfigMap named secondary-scheduler-config from the Trimaran KubeSchedulerConfiguration in the openshift-secondary-scheduler-operator namespace.
  4. Click Create SecondaryScheduler to create an instance of SecondaryScheduler and configure the SecondaryScheduler YAML accordingly (a sketch is shown after the commands below).
    1. Configure schedulerConfig to point to the secondary-scheduler-config ConfigMap.
    2. Configure schedulerImage to the scheduler-plugins image that includes the Trimaran plugins.
    3. Then click Create to install a secondary scheduler instance that runs the Trimaran scheduler.
  5. Finally, the operator will create a secondary-scheduler deployment in the openshift-secondary-scheduler-operator namespace. Check the logs to verify that the Trimaran scheduler is running successfully (example commands are shown after the configuration below).
apiVersion: kubescheduler.config.k8s.io/v1beta1
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: false
profiles:
- schedulerName: secondary-scheduler
  plugins:
    score:
      disabled:
        - name: NodeResourcesBalancedAllocation
        - name: NodeResourcesLeastAllocated
      enabled:
        - name: TargetLoadPacking
  pluginConfig:
    - name: TargetLoadPacking
      args:
        defaultRequests:
          cpu: "2000m"
        defaultRequestsMultiplier: "1"
        targetUtilization: 70
        metricProvider:
          type: Prometheus
          address: ${PROM_URL}     # replace with the Prometheus endpoint obtained below
          token: ${PROM_TOKEN}     # replace with the Prometheus token obtained below
$ export PROM_HOST=`oc get routes prometheus-k8s -n openshift-monitoring -ojson | jq ".status.ingress" | jq ".[0].host" | sed 's/"//g'`
$ export PROM_URL="https://${PROM_HOST}"
$ TOKEN_NAME=`oc get secret -n openshift-monitoring | awk '{print $1}' | grep prometheus-k8s-token -m 1`
$ PROM_TOKEN=`oc describe secret $TOKEN_NAME -n openshift-monitoring | grep "token:" | cut -d: -f2 | sed 's/^ *//g'`
$ echo $PROM_URL
$ echo $PROM_TOKEN
$ oc create -n openshift-secondary-scheduler-operator configmap secondary-scheduler-config --from-file=config.yaml
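For step 4, the SecondaryScheduler instance you create through the console corresponds roughly to a custom resource like the one below. This is a minimal sketch: the resource name, API version, and the scheduler-plugins image tag are assumptions and should be verified against the operator's documentation and the image you intend to use.

apiVersion: operator.openshift.io/v1
kind: SecondaryScheduler
metadata:
  name: cluster                                     # assumed resource name; check the operator docs
  namespace: openshift-secondary-scheduler-operator
spec:
  schedulerConfig: secondary-scheduler-config       # the ConfigMap created in step 3
  schedulerImage: 'k8s.gcr.io/scheduler-plugins/kube-scheduler:v0.23.10'   # example scheduler-plugins image with the Trimaran plugins

Once the instance is created, a quick way to do the log check from step 5 (assuming the deployment is named secondary-scheduler, as described above) is:

$ oc get pods -n openshift-secondary-scheduler-operator
$ oc logs deployment/secondary-scheduler -n openshift-secondary-scheduler-operator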

Resources