# Configure Node Pool Scale-to-Zero on ROSA HCP
This content is authored by Red Hat experts, but has not yet been tested on every supported configuration.
ROSA HCP supports setting `min_replicas=0` on node pools with autoscaling enabled. This allows the cluster autoscaler to scale worker nodes down to zero when no workloads require them, and scale back up automatically when pods are scheduled. This is useful for cost optimization on development, testing, or burst-capacity node pools.
In this guide, you will:
- Create a node pool configured to scale to zero
- Deploy a test workload to trigger scale-up from zero
- Observe automatic scale-down after the workload is removed
- Learn how to troubleshoot common scale-down blockers
Note: Scale-to-zero is only available on ROSA HCP clusters. It is not supported on ROSA Classic.
Important: ROSA HCP currently requires a minimum of 2 non-tainted worker nodes per cluster at all times. The OCM API enforces this by requiring the sum of `min_replica` across all non-tainted node pools to be at least 2. You cannot scale all node pools to zero — scale-to-zero is intended for additional workload pools (e.g. burst capacity, dev/test), while your base pools maintain the minimum worker capacity for system pods (ingress, monitoring, registry). See Minimum Replica Constraint for details.
## Prerequisites

- A ROSA HCP cluster running OpenShift 4.x
- `oc` CLI logged in to the cluster
- `rosa` CLI logged in (`rosa login`)
- `ocm` CLI logged in (`ocm login`)
- `jq` installed
## Set Up Environment Variables
1. Set the cluster name and retrieve the cluster ID.
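For example (replace the cluster name with your own):

```shell
export CLUSTER_NAME="my-hcp-cluster"
export CLUSTER_ID=$(rosa describe cluster -c "${CLUSTER_NAME}" -o json | jq -r '.id')
echo "Cluster ID: ${CLUSTER_ID}"
```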
2. List the available subnets and choose one for the node pool.

   Set the subnet ID for the availability zone you want the node pool in:
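One way to do this, assuming the cluster JSON exposes the subnet list under `.aws.subnet_ids` (substitute a real ID from the output for the placeholder):

```shell
rosa describe cluster -c "${CLUSTER_NAME}" -o json | jq -r '.aws.subnet_ids[]'

# Placeholder value; substitute one of the IDs printed above
export SUBNET_ID="subnet-xxxxxxxxxxxxxxxxx"
```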
## Create a Node Pool with Scale-to-Zero

Scale-to-zero cannot currently be configured via the ROSA CLI or the Red Hat console; you must use the OCM API or Terraform.

### Option A: OCM API
1. Create a new node pool with `min_replica=0`.

   Note: The OCM API uses singular `min_replica`/`max_replica` for HCP node pools. The taint prevents system pods from landing on this node pool, which avoids scale-down blockers (explained in Troubleshooting below).
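A sketch of the creation request; the pool name `scale-zero`, the `m5.xlarge` instance type, and the `pool=scale-zero` label/taint pair are illustrative choices (the label is used later to target the pool with a `nodeSelector`):

```shell
cat <<EOF | ocm post "/api/clusters_mgmt/v1/clusters/${CLUSTER_ID}/node_pools"
{
  "id": "scale-zero",
  "subnet": "${SUBNET_ID}",
  "aws_node_pool": {
    "instance_type": "m5.xlarge"
  },
  "autoscaling": {
    "min_replica": 0,
    "max_replica": 3
  },
  "labels": {
    "pool": "scale-zero"
  },
  "taints": [
    {
      "key": "pool",
      "value": "scale-zero",
      "effect": "NoSchedule"
    }
  ]
}
EOF
```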
2. Verify the node pool was created and has 0 current replicas.
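One way to check, assuming the pool was named `scale-zero`; the commented output is illustrative:

```shell
ocm get "/api/clusters_mgmt/v1/clusters/${CLUSTER_ID}/node_pools/scale-zero" \
  | jq '{id, autoscaling, current: .status.current_replicas}'

# Illustrative output:
# {
#   "id": "scale-zero",
#   "autoscaling": { "min_replica": 0, "max_replica": 3 },
#   "current": 0
# }
```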
### Option B: Terraform
1. Create a `main.tf` file with the following content.

   Note: The Terraform RHCS provider uses plural `min_replicas`/`max_replicas`.
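A sketch of such a configuration, assuming the `terraform-redhat/rhcs` provider's `rhcs_hcp_machine_pool` resource; verify attribute names (especially the taint fields) against the provider docs for your version. The `cluster_id` and `subnet_id` variables are assumptions:

```terraform
terraform {
  required_providers {
    rhcs = {
      source = "terraform-redhat/rhcs"
    }
  }
}

variable "cluster_id" { type = string }
variable "subnet_id" { type = string }

resource "rhcs_hcp_machine_pool" "scale_zero" {
  cluster   = var.cluster_id
  name      = "scale-zero"
  subnet_id = var.subnet_id

  aws_node_pool = {
    instance_type = "m5.xlarge"
  }

  # Note the plural min_replicas/max_replicas, unlike the OCM API
  autoscaling = {
    enabled      = true
    min_replicas = 0
    max_replicas = 3
  }

  taints = [
    {
      key           = "pool"
      value         = "scale-zero"
      schedule_type = "NoSchedule"
    }
  ]
}
```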
2. Apply the configuration.
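From the directory containing `main.tf`:

```shell
terraform init
terraform apply
```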
## Enable Scale-to-Zero on an Existing Node Pool
You can also enable scale-to-zero on an existing node pool by patching it via the OCM API:
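A sketch of the patch, substituting your node pool ID for the placeholder; the `min_replica`/`max_replica` values are illustrative:

```shell
echo '{"autoscaling": {"min_replica": 0, "max_replica": 3}}' \
  | ocm patch "/api/clusters_mgmt/v1/clusters/${CLUSTER_ID}/node_pools/<node-pool-id>"
```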
Note: This will only be accepted if the minimum replica constraint is still satisfied after the change.
## Deploy a Test Workload to Trigger Scale-Up
With the node pool at 0 nodes, deploy a workload that targets it. The cluster autoscaler will detect unschedulable pods and provision a new node.
1. Create a test namespace.
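For example (the namespace name `scale-zero-test` is an illustrative choice used in the following steps):

```shell
oc new-project scale-zero-test
```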
2. Deploy an application with a `nodeSelector` and `tolerations` matching the node pool.
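A minimal sketch, assuming the `pool=scale-zero` label and taint from the creation example; the pause image and resource requests are illustrative:

```shell
cat <<EOF | oc apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: scale-zero-test
  namespace: scale-zero-test
spec:
  replicas: 2
  selector:
    matchLabels:
      app: scale-zero-test
  template:
    metadata:
      labels:
        app: scale-zero-test
    spec:
      # Target only the scale-to-zero pool
      nodeSelector:
        pool: scale-zero
      # Tolerate the pool's NoSchedule taint
      tolerations:
        - key: pool
          value: scale-zero
          effect: NoSchedule
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
EOF
```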
3. Watch the pods. They will initially be `Pending` because no nodes match the selector.
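For example (the commented output is illustrative):

```shell
oc get pods -n scale-zero-test -w

# NAME                               READY   STATUS    RESTARTS   AGE
# scale-zero-test-6c9f7d9b4d-abcde   0/1     Pending   0          15s
# scale-zero-test-6c9f7d9b4d-fghij   0/1     Pending   0          15s
```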
4. Check the cluster autoscaler status to confirm scale-up has been triggered.
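On ROSA HCP the autoscaler itself runs on the Red Hat-managed control plane, so one observable signal from your side is the node pool's replica status in OCM. A sketch, assuming the pool was named `scale-zero`:

```shell
ocm get "/api/clusters_mgmt/v1/clusters/${CLUSTER_ID}/node_pools/scale-zero" | jq '.status'
```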
   You can also check cluster events for the scale-up trigger:
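For example (the event message is illustrative):

```shell
oc get events -n scale-zero-test --field-selector reason=TriggeredScaleUp

# LAST SEEN   TYPE     REASON             OBJECT                                 MESSAGE
# 30s         Normal   TriggeredScaleUp   pod/scale-zero-test-6c9f7d9b4d-abcde   pod triggered scale-up ...
```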
5. In a separate terminal, watch for the new node to appear.
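Assuming the `pool=scale-zero` node label from the creation example:

```shell
oc get nodes -l pool=scale-zero -w
```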
   After a few minutes, you should see a new node become `Ready` and the pods transition to `Running`.

6. Confirm the pods are running on the new node.
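For example:

```shell
oc get pods -n scale-zero-test -o wide
```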
## Observe Scale-Down to Zero
When the workload is removed, the cluster autoscaler will detect the node as idle and eventually remove it.
1. Delete the test deployment.
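For example:

```shell
oc delete deployment scale-zero-test -n scale-zero-test
```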
2. Watch the node being removed.
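Assuming the `pool=scale-zero` node label; the node will eventually be cordoned, drained, and deleted:

```shell
oc get nodes -l pool=scale-zero -w
```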
   The scale-down process has two phases:

   | Phase | Duration | Description |
   |---|---|---|
   | Idle assessment | ~15 minutes | Autoscaler continuously observes the node as unneeded before triggering removal |
   | Drain + removal | ~2 minutes | Pod eviction, node drain, and EC2 instance termination |

   The total time from workload deletion to node removal is typically ~17 minutes.

   Note: This was verified by deleting a workload from a node that had been running for over 15 minutes (ensuring any `delay_after_add` cooldown had fully expired). The ~15-minute idle assessment is a ROSA HCP platform-managed default and cannot be changed by the user.

3. Verify the node pool has scaled back to 0.
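For example, assuming the pool was named `scale-zero`:

```shell
ocm get "/api/clusters_mgmt/v1/clusters/${CLUSTER_ID}/node_pools/scale-zero" \
  | jq '.status.current_replicas'
```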
## Cluster Autoscaler Configuration
1. View the current autoscaler configuration.
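A sketch, assuming the cluster autoscaler is exposed at the `autoscaler` sub-resource of the cluster in the OCM API:

```shell
ocm get "/api/clusters_mgmt/v1/clusters/${CLUSTER_ID}/autoscaler" | jq .
```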
2. On ROSA HCP, the scale-down behavior uses platform-managed defaults:

   | Parameter | Observed Default | Description |
   |---|---|---|
   | Idle assessment | ~15 minutes | How long a node must be continuously idle before removal is triggered |
   | Drain + removal | ~2 minutes | Time to evict pods and terminate the EC2 instance |

   Note: On ROSA HCP, the `scale_down` parameters (such as `unneeded_time`, `delay_after_add`, `utilization_threshold`) cannot be customized. The OCM API rejects `scale_down` configuration changes with `"Attribute 'scale_down' is not allowed"`. These settings are only configurable on ROSA Classic clusters.
## Minimum Replica Constraint
The OCM API enforces that the sum of `min_replica` across all non-tainted node pools must be at least 2. This guarantees that system pods (ingress, monitoring, image registry) always have worker nodes available.
For example, given a cluster with three non-tainted pools (`compute-0`, `compute-1`, `compute-2`) each at `min_replica=1`:
- Setting `compute-0` to `min_replica=0` succeeds — the remaining pools still guarantee 2 replicas (1+1=2).
- Setting `compute-1` to `min_replica=0` fails — only `compute-2` at `min_replica=1` would remain, which is less than 2.
Key points:

- Tainted node pools (e.g. pools with `NoSchedule` taints) are excluded from the count — system pods cannot schedule on them.
- The constraint checks the sum of `min_replica` values, not the current number of running nodes.
- You can set as many tainted pools to `min_replica=0` as you want — only non-tainted pools are subject to this rule.
| Pool Config | Non-tainted min_replica sum | Accepted? |
|---|---|---|
| 3 pools at min=1 | 3 | Yes |
| 1 pool at min=0, 2 pools at min=1 | 2 | Yes |
| 2 pools at min=0, 1 pool at min=1 | 1 | No |
| 3 pools at min=0 | 0 | No |
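The rule can be pre-checked locally before patching. A sketch using `jq`, where the JSON is a hand-written stand-in for the node pool listing returned by the OCM API:

```shell
# Illustrative node pool listing (shape mirrors the OCM API response)
pools='{"items":[
  {"id":"compute-0","autoscaling":{"min_replica":0}},
  {"id":"compute-1","autoscaling":{"min_replica":1}},
  {"id":"compute-2","autoscaling":{"min_replica":1}},
  {"id":"scale-zero","autoscaling":{"min_replica":0},"taints":[{"key":"pool","effect":"NoSchedule"}]}
]}'

# Sum min_replica over non-tainted pools only; >= 2 means the config is accepted
sum=$(echo "$pools" | jq '[.items[] | select((.taints // []) | length == 0) | .autoscaling.min_replica] | add')
echo "non-tainted min_replica sum: ${sum}"
```

Here the tainted `scale-zero` pool is excluded, so the sum is 0+1+1 = 2 and the configuration is accepted.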
## Troubleshooting

### Check Why a Node Is Not Scaling Down
1. Check the cluster autoscaler status.

   Look for the `ScaleDown` status:

   - `CandidatesPresent` — the autoscaler has identified nodes to scale down and is waiting out the idle timer.
   - `NoCandidates` — something is preventing scale-down. Investigate further below.
2. Check for pods with the `cluster-autoscaler.kubernetes.io/safe-to-evict=false` annotation, which blocks scale-down.

   If any pods appear, they are preventing the autoscaler from draining their node. Common culprits include operators like OpenShift Pipelines (Tekton), which sets this annotation on its controller pods.
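One way to list such pods cluster-wide with `jq`:

```shell
oc get pods -A -o json | jq -r '
  .items[]
  | select(.metadata.annotations["cluster-autoscaler.kubernetes.io/safe-to-evict"] == "false")
  | "\(.metadata.namespace)/\(.metadata.name)"'
```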
   To override the annotation on a specific pod:
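For example (placeholders for the pod and namespace):

```shell
oc annotate pod <pod-name> -n <namespace> \
  cluster-autoscaler.kubernetes.io/safe-to-evict=true --overwrite
```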
   Note: If the pod is managed by an operator, the annotation will be reset when the pod is recreated. Consider uninstalling the operator or configuring it to not set this annotation.
### System Pods and PodDisruptionBudgets
System pods with PodDisruptionBudgets (PDBs) such as router-default, image-registry, and alertmanager generally do not block scale-down. The autoscaler respects PDBs during its simulation and will proceed if the pods can be safely rescheduled to other nodes.
### Anti-Affinity Cascade Locks
If you have multiple node pools that can scale to zero and many system pods with hard pod anti-affinity rules (requiredDuringSchedulingIgnoredDuringExecution), a circular dependency can form. When nodes scale down, evicted system pods land on remaining nodes — including workload-designated pools — and their anti-affinity rules can prevent further scale-down.
To avoid this:
- Use taints and tolerations on workload pools to prevent system pods from landing on them (as shown in this guide).
- Keep at least one general-purpose pool with `min_replicas >= 1` that is large enough to host all system pods.
- The minimum replica constraint still applies — ensure your non-tainted pools still guarantee at least 2 replicas.
## Cleanup
1. Delete the test namespace.
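For example:

```shell
oc delete project scale-zero-test
```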
2. Delete the node pool.
   Using OCM API:
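Assuming the pool was named `scale-zero`:

```shell
ocm delete "/api/clusters_mgmt/v1/clusters/${CLUSTER_ID}/node_pools/scale-zero"
```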
   Or using Terraform:
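From the directory containing `main.tf`:

```shell
terraform destroy
```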