Cloud Experts Documentation

Configure Node Pool Scale-to-Zero on ROSA HCP

This content is authored by Red Hat experts, but has not yet been tested on every supported configuration.

ROSA HCP supports setting min_replicas=0 on node pools with autoscaling enabled. This allows the cluster autoscaler to scale worker nodes down to zero when no workloads require them, and to scale back up automatically when pending pods need nodes from that pool. This is useful for cost optimization on development, testing, or burst-capacity node pools.

In this guide, you will:

  • Create a node pool configured to scale to zero
  • Deploy a test workload to trigger scale-up from zero
  • Observe automatic scale-down after the workload is removed
  • Learn how to troubleshoot common scale-down blockers

Note: Scale-to-zero is only available on ROSA HCP clusters. It is not supported on ROSA Classic.

Important: ROSA HCP currently requires a minimum of 2 non-tainted worker nodes per cluster at all times. The OCM API enforces this by requiring the sum of min_replica across all non-tainted node pools to be at least 2. You cannot scale all node pools to zero — scale-to-zero is intended for additional workload pools (e.g. burst capacity, dev/test), while your base pools maintain the minimum worker capacity for system pods (ingress, monitoring, registry). See Minimum Replica Constraint for details.

Prerequisites

  • A ROSA HCP cluster running OpenShift 4.x
  • oc CLI logged in to the cluster
  • rosa CLI logged in (rosa login)
  • ocm CLI logged in (ocm login)
  • jq installed

Set Up Environment Variables

  1. Set the cluster name and retrieve the cluster ID.

  2. List the available subnets and choose one for the node pool.

    Set the subnet ID for the availability zone you want the node pool in:

Create a Node Pool with Scale-to-Zero

At the time of writing, scale-to-zero cannot be configured via the ROSA CLI or the Red Hat console. You must use the OCM API or Terraform.

Option A: OCM API
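
As an illustration of what the request body for this option might contain, the sketch below assembles one in Python and prints it as JSON. Only the singular min_replica / max_replica names come from this guide. The other field names (id, subnet, aws_node_pool, labels, taints), the pool name, the instance type, and the workload=burst label and taint are assumptions made for the example and should be checked against the OCM API reference before use.

```python
import json


def build_node_pool_body(pool_id, subnet_id, instance_type, max_replica,
                         taint_key="workload", taint_value="burst"):
    """Return a candidate request body for an autoscaling node pool with min_replica=0.

    Only min_replica / max_replica are confirmed by this guide; every other
    field name here is an assumption to verify against the OCM API reference.
    """
    if max_replica < 1:
        raise ValueError("max_replica must be at least 1 so the pool can scale up")
    return {
        "id": pool_id,
        "subnet": subnet_id,
        "aws_node_pool": {"instance_type": instance_type},
        # Singular names, as used by the OCM API for HCP node pools.
        "autoscaling": {"min_replica": 0, "max_replica": max_replica},
        # Hypothetical label that workloads can target with a nodeSelector.
        "labels": {taint_key: taint_value},
        # The taint keeps system pods off this pool.
        "taints": [
            {"key": taint_key, "value": taint_value, "effect": "NoSchedule"}
        ],
    }


# Placeholder values; substitute your own pool name, subnet, and instance type.
body = build_node_pool_body("scale-to-zero", "subnet-EXAMPLE", "m5.xlarge", 3)
print(json.dumps(body, indent=2))
```

The printed JSON could then be saved to a file and sent with whichever OCM client you use.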

  1. Create a new node pool with min_replica=0.

    Note: The OCM API uses singular min_replica / max_replica for HCP node pools. The taint prevents system pods from landing on this node pool, which avoids scale-down blockers (explained in Troubleshooting).

  2. Verify the node pool was created and has 0 current replicas.

    You should see:

Option B: Terraform

  1. Create a main.tf file with the following content.

    Note: The Terraform RHCS provider uses plural min_replicas / max_replicas.

  2. Apply the configuration.

Enable Scale-to-Zero on an Existing Node Pool

You can also enable scale-to-zero on an existing node pool by patching it via the OCM API:

Note: This will only be accepted if the minimum replica constraint is still satisfied after the change.
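
A patch body for this change might look like the following sketch. It assumes the patch only needs the autoscaling object with the singular field names; whether other fields (for example, a fixed replica count on a pool that was not autoscaled before) must also be changed is not covered by this guide. The helper name is hypothetical.

```python
import json


def build_autoscaling_patch(min_replica, max_replica):
    """Return a candidate patch body that sets the autoscaling bounds of a node pool.

    Assumption: the patch carries only the 'autoscaling' object.
    """
    if min_replica < 0:
        raise ValueError("min_replica cannot be negative")
    if max_replica < max(min_replica, 1):
        raise ValueError("max_replica must be >= min_replica and at least 1")
    return {"autoscaling": {"min_replica": min_replica, "max_replica": max_replica}}


# Example: allow an existing pool to scale between 0 and 3 nodes.
print(json.dumps(build_autoscaling_patch(0, 3)))
```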

Deploy a Test Workload to Trigger Scale-Up

With the node pool at 0 nodes, deploy a workload that targets it. The cluster autoscaler will detect unschedulable pods and provision a new node.
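
To make the targeting concrete, here is a minimal sketch that generates a Deployment manifest whose nodeSelector and toleration match a node pool's label and taint. The workload=burst key and value, the deployment and namespace names, the container image, and the resource requests are all assumptions for illustration. Use the label and taint you set on your own node pool.

```python
import json


def build_test_deployment(name, namespace, key, value, replicas=2):
    """Build a Deployment that can only run on nodes labeled and tainted key=value."""
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {
                    # Pin the pods to the scale-to-zero pool.
                    "nodeSelector": {key: value},
                    # Tolerate the pool's NoSchedule taint.
                    "tolerations": [{
                        "key": key,
                        "operator": "Equal",
                        "value": value,
                        "effect": "NoSchedule",
                    }],
                    "containers": [{
                        "name": name,
                        # Placeholder image and command; any long-running container works.
                        "image": "registry.access.redhat.com/ubi9/ubi-minimal",
                        "command": ["sleep", "infinity"],
                        "resources": {"requests": {"cpu": "100m", "memory": "64Mi"}},
                    }],
                },
            },
        },
    }


manifest = build_test_deployment("scale-test", "scale-to-zero-test", "workload", "burst")
print(json.dumps(manifest, indent=2))
```

Because oc apply -f accepts JSON as well as YAML, the printed manifest can be saved to a file and applied as-is.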

  1. Create a test namespace.

  2. Deploy an application with a nodeSelector and tolerations matching the node pool.

  3. Watch the pods. They will initially be Pending because no nodes match the selector.

    You should see:

  4. Check the cluster autoscaler status to confirm scale-up has been triggered.

    You should see:

    You can also check cluster events for the scale-up trigger:

    You should see an event like:

  5. In a separate terminal, watch for the new node to appear.

    After a few minutes, you should see a new node become Ready and the pods transition to Running.

  6. Confirm the pods are running on the new node.

Observe Scale-Down to Zero

When the workload is removed, the cluster autoscaler will detect the node as idle and eventually remove it.

  1. Delete the test deployment.

  2. Watch the node being removed.

    The scale-down process has two phases:

    Phase            | Duration     | Description
    Idle assessment  | ~15 minutes  | Autoscaler continuously observes the node as unneeded before triggering removal
    Drain + removal  | ~2 minutes   | Pod eviction, node drain, and EC2 instance termination

    The total time from workload deletion to node removal is typically ~17 minutes.

    Note: This was verified by deleting a workload from a node that had been running for over 15 minutes (ensuring any delay_after_add cooldown had fully expired). The ~15-minute idle assessment is a ROSA HCP platform-managed default and cannot be changed by the user.

  3. Verify the node pool has scaled back to 0.

Cluster Autoscaler Configuration

  1. View the current autoscaler configuration.

  2. On ROSA HCP, the scale-down behavior uses platform-managed defaults:

    Parameter        | Observed Default | Description
    Idle assessment  | ~15 minutes      | How long a node must be continuously idle before removal is triggered
    Drain + removal  | ~2 minutes       | Time to evict pods and terminate the EC2 instance

    Note: On ROSA HCP, the scale_down parameters (such as unneeded_time, delay_after_add, utilization_threshold) cannot be customized. The OCM API rejects scale_down configuration changes with "Attribute 'scale_down' is not allowed". These settings are only configurable on ROSA Classic clusters.
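
The observed defaults above can be turned into a rough estimate of when a node will disappear. The sketch below assumes the node was already running long enough for any delay_after_add cooldown to have expired, and it treats the approximate ~15 and ~2 minute figures as exact. The function name and the example timestamp are illustrative.

```python
from datetime import datetime, timedelta

# Approximate values observed in this guide, not configurable on ROSA HCP.
IDLE_ASSESSMENT = timedelta(minutes=15)
DRAIN_AND_REMOVAL = timedelta(minutes=2)


def estimate_node_removal(workload_deleted_at):
    """Return (drain_start, node_removed) estimates for an idle node."""
    drain_start = workload_deleted_at + IDLE_ASSESSMENT
    return drain_start, drain_start + DRAIN_AND_REMOVAL


deleted = datetime(2025, 1, 1, 12, 0)
drain_start, removed = estimate_node_removal(deleted)
print(drain_start.time(), removed.time())  # 12:15:00 12:17:00
```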

Minimum Replica Constraint

The OCM API enforces that the sum of min_replica across all non-tainted node pools must be at least 2. This guarantees that system pods (ingress, monitoring, image registry) always have worker nodes available.

For example, given a cluster with three non-tainted pools (compute-0, compute-1, compute-2) each at min_replica=1:

  1. Setting compute-0 to min_replica=0 succeeds — the remaining pools still guarantee 2 replicas (1+1=2).

  2. Then setting compute-1 to min_replica=0 fails: only compute-2 at min_replica=1 would remain, so the sum would be 1, which is less than 2.

Key points:

  • Tainted node pools (e.g. pools with NoSchedule taints) are excluded from the count — system pods cannot schedule on them.
  • The constraint checks the sum of min_replica values, not the current number of running nodes.
  • You can set as many tainted pools to min_replica=0 as you want; only non-tainted pools are subject to this rule.

Pool configuration                 | Non-tainted min_replica sum | Accepted?
3 pools at min=1                   | 3                           | Yes
1 pool at min=0, 2 pools at min=1  | 2                           | Yes
2 pools at min=0, 1 pool at min=1  | 1                           | No
3 pools at min=0                   | 0                           | No
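
The rule can be checked locally before sending a change to the API. The following is a minimal sketch of the constraint as described in this section, not the OCM implementation. The NodePool class and function names are hypothetical.

```python
from dataclasses import dataclass

MIN_NON_TAINTED_REPLICAS = 2  # minimum stated in this guide


@dataclass
class NodePool:
    name: str
    min_replica: int
    tainted: bool = False


def non_tainted_min_sum(pools):
    """Sum min_replica over non-tainted pools; tainted pools are excluded."""
    return sum(p.min_replica for p in pools if not p.tainted)


def is_accepted(pools):
    """Apply the documented rule: the non-tainted sum must be at least 2."""
    return non_tainted_min_sum(pools) >= MIN_NON_TAINTED_REPLICAS


# Two base pools plus a tainted burst pool that scales to zero.
pools = [
    NodePool("compute-0", 1),
    NodePool("compute-1", 1),
    NodePool("burst", 0, tainted=True),
]
print(non_tainted_min_sum(pools), is_accepted(pools))  # 2 True
```

The check uses the configured min_replica values only, never the number of nodes currently running.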

Troubleshooting

Check Why a Node Is Not Scaling Down

  1. Check the cluster autoscaler status.

    Look for ScaleDown status:

    • CandidatesPresent: the autoscaler has identified nodes to scale down and is waiting out the idle timer.
    • NoCandidates: something is preventing scale-down. Investigate further below.

  2. Check for pods with the cluster-autoscaler.kubernetes.io/safe-to-evict=false annotation, which blocks scale-down.

    If any pods appear, they are preventing the autoscaler from draining their node. Common culprits include operators like OpenShift Pipelines (Tekton), which sets this annotation on its controller pods.

    To override the annotation on a specific pod:

    Note: If the pod is managed by an operator, the annotation will be reset when the pod is recreated. Consider uninstalling the operator or configuring it to not set this annotation.
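
One way to find these pods is to filter a JSON pod listing (for example, the output of oc get pods -A -o json) for the annotation. The sketch below assumes the standard cluster autoscaler annotation key cluster-autoscaler.kubernetes.io/safe-to-evict and uses a small inline sample instead of live cluster data. The function name and the sample pods are made up.

```python
SAFE_TO_EVICT = "cluster-autoscaler.kubernetes.io/safe-to-evict"


def find_eviction_blockers(pod_list):
    """Return (namespace, name, node) for pods annotated safe-to-evict=false."""
    blockers = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        annotations = meta.get("annotations") or {}
        if annotations.get(SAFE_TO_EVICT) == "false":
            node = (pod.get("spec") or {}).get("nodeName")
            blockers.append((meta.get("namespace"), meta.get("name"), node))
    return blockers


# Inline sample in the shape of a Kubernetes PodList; names are invented.
sample = {
    "items": [
        {"metadata": {"namespace": "demo", "name": "web-1"},
         "spec": {"nodeName": "node-a"}},
        {"metadata": {"namespace": "ci", "name": "controller-0",
                      "annotations": {SAFE_TO_EVICT: "false"}},
         "spec": {"nodeName": "node-b"}},
    ]
}
print(find_eviction_blockers(sample))  # [('ci', 'controller-0', 'node-b')]
```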

System Pods and PodDisruptionBudgets

System pods with PodDisruptionBudgets (PDBs) such as router-default, image-registry, and alertmanager generally do not block scale-down. The autoscaler respects PDBs during its simulation and will proceed if the pods can be safely rescheduled to other nodes.

Anti-Affinity Cascade Locks

If you have multiple node pools that can scale to zero and many system pods with hard pod anti-affinity rules (requiredDuringSchedulingIgnoredDuringExecution), a circular dependency can form. When nodes scale down, evicted system pods land on remaining nodes — including workload-designated pools — and their anti-affinity rules can prevent further scale-down.

To avoid this:

  • Use taints and tolerations on workload pools to prevent system pods from landing on them (as shown in this guide).
  • Keep at least one general-purpose pool with min_replicas >= 1 that is large enough to host all system pods.
  • The minimum replica constraint still applies — ensure your non-tainted pools still guarantee at least 2 replicas.
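
To see which pods could take part in such a cascade, you can scan a JSON pod listing for hard anti-affinity rules. This is a sketch over the standard Kubernetes pod spec fields. The function names and the sample pods are hypothetical.

```python
def has_hard_anti_affinity(pod):
    """True if the pod declares a required (hard) pod anti-affinity rule."""
    affinity = (pod.get("spec") or {}).get("affinity") or {}
    anti = affinity.get("podAntiAffinity") or {}
    return bool(anti.get("requiredDuringSchedulingIgnoredDuringExecution"))


def pods_with_hard_anti_affinity(pod_list):
    """Return 'namespace/name' for every pod with a hard anti-affinity rule."""
    return [
        f"{p['metadata']['namespace']}/{p['metadata']['name']}"
        for p in pod_list.get("items", [])
        if has_hard_anti_affinity(p)
    ]


# Inline sample in the shape of a Kubernetes PodList; names are invented.
sample = {
    "items": [
        {"metadata": {"namespace": "demo", "name": "router-1"},
         "spec": {"affinity": {"podAntiAffinity": {
             "requiredDuringSchedulingIgnoredDuringExecution": [
                 {"topologyKey": "kubernetes.io/hostname",
                  "labelSelector": {"matchLabels": {"app": "router"}}}
             ]}}}},
        {"metadata": {"namespace": "demo", "name": "web-1"}, "spec": {}},
    ]
}
print(pods_with_hard_anti_affinity(sample))  # ['demo/router-1']
```

Pods that only use preferredDuringSchedulingIgnoredDuringExecution are not reported, because soft rules do not block rescheduling.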

Cleanup

  1. Delete the test namespace.

  2. Delete the node pool.

    Using OCM API:

    Or using Terraform:
