# Configure Node Pool Scale-to-Zero on ROSA HCP
This content is authored by Red Hat experts, but has not yet been tested on every supported configuration.
ROSA HCP supports setting `min_replicas=0` on node pools with autoscaling enabled. This allows the cluster autoscaler to scale worker nodes down to zero when no workloads require them, and scale back up automatically when pods are scheduled. This is useful for cost optimization on development, testing, or burst-capacity node pools.
In this guide, you will:
- Create a node pool configured to scale to zero
- Deploy a test workload to trigger scale-up from zero
- Observe automatic scale-down after the workload is removed
- Learn how to troubleshoot common scale-down blockers
Note: Scale-to-zero is only available on ROSA HCP clusters. It is not supported on ROSA Classic.
Important: ROSA HCP currently requires a minimum of 2 non-tainted worker nodes per cluster at all times. The OCM API enforces this by requiring the sum of `min_replica` across all non-tainted node pools to be at least 2. You cannot scale all node pools to zero — scale-to-zero is intended for additional workload pools (e.g. burst capacity, dev/test), while your base pools maintain the minimum worker capacity for system pods (ingress, monitoring, registry). See Minimum Replica Constraint for details.
## Prerequisites

- A ROSA HCP cluster running OpenShift 4.x
- `oc` CLI logged in to the cluster
- `rosa` CLI logged in (`rosa login`)
- `ocm` CLI logged in (`ocm login`)
- `jq` installed
## Set Up Environment Variables
1. Set the cluster name and retrieve the cluster ID.
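For example (replace the cluster name with your own):

```shell
export CLUSTER_NAME="my-hcp-cluster"
export CLUSTER_ID=$(rosa describe cluster -c "${CLUSTER_NAME}" -o json | jq -r '.id')
echo "Cluster ID: ${CLUSTER_ID}"
```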
2. List the available subnets and choose one for the node pool.

   Set the subnet ID for the availability zone you want the node pool in:
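One way to do this, assuming the cluster JSON exposes the subnet list under `.aws.subnet_ids` (substitute a real ID from the output for the placeholder):

```shell
rosa describe cluster -c "${CLUSTER_NAME}" -o json | jq -r '.aws.subnet_ids[]'

# Placeholder value; substitute one of the IDs printed above
export SUBNET_ID="subnet-xxxxxxxxxxxxxxxxx"
```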
## Create a Node Pool with Scale-to-Zero

Scale-to-zero cannot currently be configured via the ROSA CLI or the Red Hat console; you must use the OCM API or Terraform.

### Option A: OCM API
1. Create a new node pool with `min_replica=0`.

   Note: The OCM API uses singular `min_replica`/`max_replica` for HCP node pools. The taint prevents system pods from landing on this node pool, which avoids scale-down blockers (explained in Troubleshooting below).
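A sketch of the creation request; the pool name `scale-zero`, the `m5.xlarge` instance type, and the `pool=scale-zero` label/taint pair are illustrative choices (the label is used later to target the pool with a `nodeSelector`):

```shell
cat <<EOF | ocm post "/api/clusters_mgmt/v1/clusters/${CLUSTER_ID}/node_pools"
{
  "id": "scale-zero",
  "subnet": "${SUBNET_ID}",
  "aws_node_pool": {
    "instance_type": "m5.xlarge"
  },
  "autoscaling": {
    "min_replica": 0,
    "max_replica": 3
  },
  "labels": {
    "pool": "scale-zero"
  },
  "taints": [
    {
      "key": "pool",
      "value": "scale-zero",
      "effect": "NoSchedule"
    }
  ]
}
EOF
```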
2. Verify the node pool was created and has 0 current replicas.
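One way to check, assuming the pool was named `scale-zero`; the commented output is illustrative:

```shell
ocm get "/api/clusters_mgmt/v1/clusters/${CLUSTER_ID}/node_pools/scale-zero" \
  | jq '{id, autoscaling, current: .status.current_replicas}'

# Illustrative output:
# {
#   "id": "scale-zero",
#   "autoscaling": { "min_replica": 0, "max_replica": 3 },
#   "current": 0
# }
```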
### Option B: Terraform
1. Create a `main.tf` file with the following content.

   Note: The Terraform RHCS provider uses plural `min_replicas`/`max_replicas`.
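A sketch of such a configuration, assuming the `terraform-redhat/rhcs` provider's `rhcs_hcp_machine_pool` resource; verify attribute names (especially the taint fields) against the provider docs for your version. The `cluster_id` and `subnet_id` variables are assumptions:

```terraform
terraform {
  required_providers {
    rhcs = {
      source = "terraform-redhat/rhcs"
    }
  }
}

variable "cluster_id" { type = string }
variable "subnet_id" { type = string }

resource "rhcs_hcp_machine_pool" "scale_zero" {
  cluster   = var.cluster_id
  name      = "scale-zero"
  subnet_id = var.subnet_id

  aws_node_pool = {
    instance_type = "m5.xlarge"
  }

  # Note the plural min_replicas/max_replicas, unlike the OCM API
  autoscaling = {
    enabled      = true
    min_replicas = 0
    max_replicas = 3
  }

  taints = [
    {
      key           = "pool"
      value         = "scale-zero"
      schedule_type = "NoSchedule"
    }
  ]
}
```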
2. Apply the configuration.
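From the directory containing `main.tf`:

```shell
terraform init
terraform apply
```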
## Enable Scale-to-Zero on an Existing Node Pool
You can also enable scale-to-zero on an existing node pool by patching it via the OCM API:
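A sketch of the patch, substituting your node pool ID for the placeholder; the `min_replica`/`max_replica` values are illustrative:

```shell
echo '{"autoscaling": {"min_replica": 0, "max_replica": 3}}' \
  | ocm patch "/api/clusters_mgmt/v1/clusters/${CLUSTER_ID}/node_pools/<node-pool-id>"
```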
Note: This will only be accepted if the minimum replica constraint is still satisfied after the change.
## Deploy a Test Workload to Trigger Scale-Up
With the node pool at 0 nodes, deploy a workload that targets it. The cluster autoscaler will detect unschedulable pods and provision a new node.
1. Create a test namespace.
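For example (the namespace name `scale-zero-test` is an illustrative choice used in the following steps):

```shell
oc new-project scale-zero-test
```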
2. Deploy an application with a `nodeSelector` and `tolerations` matching the node pool.
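A minimal sketch, assuming the `pool=scale-zero` label and taint from the creation example; the pause image and resource requests are illustrative:

```shell
cat <<EOF | oc apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: scale-zero-test
  namespace: scale-zero-test
spec:
  replicas: 2
  selector:
    matchLabels:
      app: scale-zero-test
  template:
    metadata:
      labels:
        app: scale-zero-test
    spec:
      # Target only the scale-to-zero pool
      nodeSelector:
        pool: scale-zero
      # Tolerate the pool's NoSchedule taint
      tolerations:
        - key: pool
          value: scale-zero
          effect: NoSchedule
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
EOF
```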
3. Watch the pods. They will initially be `Pending` because no nodes match the selector.
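For example (the commented output is illustrative):

```shell
oc get pods -n scale-zero-test -w

# NAME                               READY   STATUS    RESTARTS   AGE
# scale-zero-test-6c9f7d9b4d-abcde   0/1     Pending   0          15s
# scale-zero-test-6c9f7d9b4d-fghij   0/1     Pending   0          15s
```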
4. Check the cluster autoscaler status to confirm scale-up has been triggered.
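On ROSA HCP the autoscaler itself runs on the Red Hat-managed control plane, so one observable signal from your side is the node pool's replica status in OCM. A sketch, assuming the pool was named `scale-zero`:

```shell
ocm get "/api/clusters_mgmt/v1/clusters/${CLUSTER_ID}/node_pools/scale-zero" | jq '.status'
```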
   You can also check cluster events for the scale-up trigger:
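For example (the event message is illustrative):

```shell
oc get events -n scale-zero-test --field-selector reason=TriggeredScaleUp

# LAST SEEN   TYPE     REASON             OBJECT                                 MESSAGE
# 30s         Normal   TriggeredScaleUp   pod/scale-zero-test-6c9f7d9b4d-abcde   pod triggered scale-up ...
```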
5. In a separate terminal, watch for the new node to appear.
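Assuming the `pool=scale-zero` node label from the creation example:

```shell
oc get nodes -l pool=scale-zero -w
```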
   After a few minutes, you should see a new node become `Ready` and the pods transition to `Running`.

6. Confirm the pods are running on the new node.
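For example:

```shell
oc get pods -n scale-zero-test -o wide
```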
## Observe Scale-Down to Zero
When the workload is removed, the cluster autoscaler will detect the node as idle and eventually remove it.
1. Delete the test deployment.
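For example:

```shell
oc delete deployment scale-zero-test -n scale-zero-test
```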
2. Watch the node being removed.
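Assuming the `pool=scale-zero` node label; the node will eventually be cordoned, drained, and deleted:

```shell
oc get nodes -l pool=scale-zero -w
```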
   The scale-down process has two phases:

   | Phase | Duration | Description |
   |---|---|---|
   | Idle assessment | ~15 minutes | Autoscaler continuously observes the node as unneeded before triggering removal |
   | Drain + removal | ~2 minutes | Pod eviction, node drain, and EC2 instance termination |

   The total time from workload deletion to node removal is typically ~17 minutes.

   Note: This was verified by deleting a workload from a node that had been running for over 15 minutes (ensuring any `delay_after_add` cooldown had fully expired). The ~15-minute idle assessment is a ROSA HCP platform-managed default and cannot be changed by the user.

3. Verify the node pool has scaled back to 0.
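For example, assuming the pool was named `scale-zero`:

```shell
ocm get "/api/clusters_mgmt/v1/clusters/${CLUSTER_ID}/node_pools/scale-zero" \
  | jq '.status.current_replicas'
```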
## Cluster Autoscaler Configuration
1. View the current autoscaler configuration.
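A sketch, assuming the cluster autoscaler is exposed at the `autoscaler` sub-resource of the cluster in the OCM API:

```shell
ocm get "/api/clusters_mgmt/v1/clusters/${CLUSTER_ID}/autoscaler" | jq .
```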
2. On ROSA HCP, the scale-down behavior uses platform-managed defaults:

   | Parameter | Observed Default | Description |
   |---|---|---|
   | Idle assessment | ~15 minutes | How long a node must be continuously idle before removal is triggered |
   | Drain + removal | ~2 minutes | Time to evict pods and terminate the EC2 instance |

   Note: On ROSA HCP, the `scale_down` parameters (such as `unneeded_time`, `delay_after_add`, `utilization_threshold`) cannot be customized. The OCM API rejects `scale_down` configuration changes with `"Attribute 'scale_down' is not allowed"`. These settings are only configurable on ROSA Classic clusters.
## Minimum Replica Constraint
The OCM API enforces that the sum of `min_replica` across all non-tainted node pools must be at least 2. This guarantees that system pods (ingress, monitoring, image registry) always have worker nodes available.
For example, given a cluster with three non-tainted pools (`compute-0`, `compute-1`, `compute-2`) each at `min_replica=1`:
- Setting `compute-0` to `min_replica=0` succeeds — the remaining pools still guarantee 2 replicas (1+1=2).
- Setting `compute-1` to `min_replica=0` fails — only `compute-2` at `min_replica=1` would remain, which is less than 2.
Key points:

- Tainted node pools (e.g. pools with `NoSchedule` taints) are excluded from the count — system pods cannot schedule on them.
- The constraint checks the sum of `min_replica` values, not the current number of running nodes.
- You can set as many tainted pools to `min_replica=0` as you want — only non-tainted pools are subject to this rule.
| Pool Config | Non-tainted min_replica sum | Accepted? |
|---|---|---|
| 3 pools at min=1 | 3 | Yes |
| 1 pool at min=0, 2 pools at min=1 | 2 | Yes |
| 2 pools at min=0, 1 pool at min=1 | 1 | No |
| 3 pools at min=0 | 0 | No |
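The rule can be pre-checked locally before patching. A sketch using `jq`, where the JSON is a hand-written stand-in for the node pool listing returned by the OCM API:

```shell
# Illustrative node pool listing (shape mirrors the OCM API response)
pools='{"items":[
  {"id":"compute-0","autoscaling":{"min_replica":0}},
  {"id":"compute-1","autoscaling":{"min_replica":1}},
  {"id":"compute-2","autoscaling":{"min_replica":1}},
  {"id":"scale-zero","autoscaling":{"min_replica":0},"taints":[{"key":"pool","effect":"NoSchedule"}]}
]}'

# Sum min_replica over non-tainted pools only; >= 2 means the config is accepted
sum=$(echo "$pools" | jq '[.items[] | select((.taints // []) | length == 0) | .autoscaling.min_replica] | add')
echo "non-tainted min_replica sum: ${sum}"
```

Here the tainted `scale-zero` pool is excluded, so the sum is 0+1+1 = 2 and the configuration is accepted.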
## Troubleshooting

### Check Why a Node Is Not Scaling Down
1. Check the cluster autoscaler status.

   Look for the `ScaleDown` status:

   - `CandidatesPresent` — the autoscaler has identified nodes to scale down and is waiting out the idle timer.
   - `NoCandidates` — something is preventing scale-down. Investigate further below.
2. Check for pods with the `cluster-autoscaler.kubernetes.io/safe-to-evict=false` annotation, which blocks scale-down.

   If any pods appear, they are preventing the autoscaler from draining their node. Common culprits include operators like OpenShift Pipelines (Tekton), which sets this annotation on its controller pods.
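One way to list such pods cluster-wide with `jq`:

```shell
oc get pods -A -o json | jq -r '
  .items[]
  | select(.metadata.annotations["cluster-autoscaler.kubernetes.io/safe-to-evict"] == "false")
  | "\(.metadata.namespace)/\(.metadata.name)"'
```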
   To override the annotation on a specific pod:
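For example (placeholders for the pod and namespace):

```shell
oc annotate pod <pod-name> -n <namespace> \
  cluster-autoscaler.kubernetes.io/safe-to-evict=true --overwrite
```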
   Note: If the pod is managed by an operator, the annotation will be reset when the pod is recreated. Consider uninstalling the operator or configuring it to not set this annotation.
### System Pods and PodDisruptionBudgets
System pods with PodDisruptionBudgets (PDBs) such as router-default, image-registry, and alertmanager generally do not block scale-down. The autoscaler respects PDBs during its simulation and will proceed if the pods can be safely rescheduled to other nodes.
### Anti-Affinity Cascade Locks
If you have multiple node pools that can scale to zero and many system pods with hard pod anti-affinity rules (requiredDuringSchedulingIgnoredDuringExecution), a circular dependency can form. When nodes scale down, evicted system pods land on remaining nodes — including workload-designated pools — and their anti-affinity rules can prevent further scale-down.
To avoid this:
- Use taints and tolerations on workload pools to prevent system pods from landing on them (as shown in this guide).
- Keep at least one general-purpose pool with `min_replicas >= 1` that is large enough to host all system pods.
- The minimum replica constraint still applies — ensure your non-tainted pools still guarantee at least 2 replicas.
## Cleanup
1. Delete the test namespace.
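For example:

```shell
oc delete project scale-zero-test
```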
2. Delete the node pool.
   Using OCM API:
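Assuming the pool was named `scale-zero`:

```shell
ocm delete "/api/clusters_mgmt/v1/clusters/${CLUSTER_ID}/node_pools/scale-zero"
```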
   Or using Terraform:
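From the directory containing `main.tf`:

```shell
terraform destroy
```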