Getting Started with Red Hat Build of Karpenter (AutoNode) on ROSA

Last edited: June 13, 2026
Published: June 11, 2026
Authors: Kevin Collins, Kumudu Herath

This content is authored by Red Hat experts, but has not yet been tested on every supported configuration. This guide has been validated on OpenShift 4.22. Operator CRD names, API versions, and console paths may differ on other versions.

Red Hat build of Karpenter (AutoNode) brings workload-aware, just-in-time node provisioning to Red Hat OpenShift Service on AWS (ROSA) with Hosted Control Planes. Instead of managing static machine pools with pre-defined instance types, Karpenter evaluates the exact CPU, memory, and scheduling constraints of pending pods and provisions the optimal EC2 instance automatically — then consolidates underutilized nodes when they are no longer needed.

This guide walks through enabling AutoNode on a ROSA HCP cluster, configuring a NodePool and EC2NodeClass, and exploring use cases including right-sizing, Spot optimization, and consolidation.

Prerequisites

A ROSA HCP cluster running OpenShift 4.22 or later with AutoNode enabled
oc CLI authenticated to the cluster
rosa CLI configured
AWS CLI configured

Set Environment Variables

Set your cluster name once and reuse it throughout the guide:
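A minimal sketch of this step. The variable name CLUSTER_NAME and the value my-rosa-cluster are assumptions for illustration; substitute the name of your own cluster.

```shell
# Hypothetical variable name and placeholder value; replace with your cluster's name.
export CLUSTER_NAME="my-rosa-cluster"

# Confirm the value that later commands will pick up.
echo "Using cluster: ${CLUSTER_NAME}"
```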

Deploy a Karpenter-Enabled ROSA Cluster

Option 1 — Automated (Recommended)

Use the terraform-rosa Terraform module to deploy a fully configured ROSA HCP cluster with AutoNode enabled in a single command. Set karpenter = true alongside your cluster variables and Terraform handles the IAM role, trust policy, cluster wiring, and default NodePool/EC2NodeClass automatically.

Set the required environment variables:

Create a tfvars file:

Deploy:

Terraform will create the cluster, configure the Karpenter IAM role, and apply the default OpenshiftEC2NodeClass and NodePool automatically.

Option 2 — Manual

Follow the official Red Hat documentation to:

Create the Karpenter IAM policy and role
Tag the cluster security group with karpenter.sh/discovery
Enable AutoNode via rosa edit cluster --autonode=enabled --autonode-iam-role-arn=<role_arn>

Verify AutoNode is Active

Expected output:

Configure NodePool and EC2NodeClass

ROSA uses OpenshiftEC2NodeClass instead of the upstream EC2NodeClass. ROSA automatically manages subnet and security group selectors via karpenter.sh/discovery tags — no manual configuration is needed in the spec.

Note: If you deployed via terraform-rosa with karpenter = true, these resources are already applied. Skip to Use Case 1.

Apply the OpenshiftEC2NodeClass:

Apply the NodePool:
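As a rough sketch, a NodePool for this guide might look like the manifest below. Only the autonode: "true" label and the consolidateAfter: 30s setting come from this guide. The API version, the nodeClassRef group, kind, and name, the requirements, and the CPU limit are all assumptions based on upstream Karpenter conventions, so check them against the official Red Hat documentation for your version before applying.

```shell
# Write a sketch NodePool manifest to a local file (nothing is applied here).
cat > nodepool.yaml <<'EOF'
apiVersion: karpenter.sh/v1        # assumption: upstream Karpenter v1 API
kind: NodePool
metadata:
  name: default                    # hypothetical name
spec:
  template:
    metadata:
      labels:
        autonode: "true"           # label used later to target Karpenter nodes
    spec:
      nodeClassRef:                # assumption: verify group/kind/name for ROSA
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 30s          # matches the setting referenced in Cleanup
  limits:
    cpu: "100"                     # hypothetical cap on total provisioned CPU
EOF
```

After reviewing the file, it can be applied with oc apply -f nodepool.yaml.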

Verify both resources are ready:

Expected output:

NODES: 0 is correct — Karpenter provisions nodes on demand when pods are pending.

Create the Test Namespace

All workloads run in a dedicated namespace:

Use Case 1 — Basic Scale-Up

Deploy a workload that exceeds current capacity and watch Karpenter provision a right-sized node automatically.
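One way to generate such a workload is sketched below: a Deployment with 10 replicas that each request 1 CPU and 1Gi of memory, matching the sizing discussed in this use case. The namespace autonode-demo, the Deployment name scale-up, and the pause image are assumptions chosen for illustration.

```shell
# Hypothetical namespace name; override by exporting NAMESPACE first.
NAMESPACE="${NAMESPACE:-autonode-demo}"

# Write the Deployment manifest to a local file (nothing is applied here).
cat > scale-up.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: scale-up
  namespace: ${NAMESPACE}
spec:
  replicas: 10
  selector:
    matchLabels:
      app: scale-up
  template:
    metadata:
      labels:
        app: scale-up
    spec:
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9   # placeholder container that only idles
          resources:
            requests:
              cpu: "1"
              memory: 1Gi
EOF
```

Apply it with oc apply -f scale-up.yaml once the namespace exists.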

Watch Karpenter respond:

What to observe:

Pods enter Pending state — no capacity available on existing nodes
Within ~30 seconds, Karpenter detects pending pods and creates a NodeClaim
A new node joins the cluster (~2–4 minutes)
All pods schedule and move to Running

Karpenter evaluated the total pending resource requests (10 × 1 CPU / 1Gi) and provisioned a single right-sized instance through bin-packing rather than multiple smaller nodes.

Use Case 2 — Instance Type Flexibility (Right-Sizing)

Show how Karpenter selects different instance families for memory-heavy vs CPU-heavy workloads.

Important: Resource requests must be large enough that workloads cannot efficiently share a single node. Karpenter always optimizes for cost, so small requests are bin-packed onto one large instance instead of triggering specialized nodes. Setting topologySpreadConstraints on each workload forces its pods to spread across separate nodes.

Deploy a memory-heavy workload (12Gi per pod → drives r-family selection):
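A memory-heavy Deployment along these lines could be sketched as follows. The 12Gi request comes from this guide. The namespace, Deployment name, replica count, CPU request, and image are assumptions.

```shell
# Hypothetical namespace name; override by exporting NAMESPACE first.
NAMESPACE="${NAMESPACE:-autonode-demo}"

# Write the Deployment manifest to a local file (nothing is applied here).
cat > memory-heavy.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: memory-heavy
  namespace: ${NAMESPACE}
spec:
  replicas: 3                        # hypothetical replica count
  selector:
    matchLabels:
      app: memory-heavy
  template:
    metadata:
      labels:
        app: memory-heavy
    spec:
      # Spread pods one per node so each pod drives its own instance choice.
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: memory-heavy
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
          resources:
            requests:
              cpu: 500m              # kept small so memory dominates the request
              memory: 12Gi
EOF
```

Apply it with oc apply -f memory-heavy.yaml.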

Deploy a CPU-heavy workload (6 CPU per pod → drives c-family selection):
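A CPU-heavy counterpart might be sketched like this. The 6 CPU request comes from this guide. The namespace, Deployment name, replica count, memory request, and image are assumptions.

```shell
# Hypothetical namespace name; override by exporting NAMESPACE first.
NAMESPACE="${NAMESPACE:-autonode-demo}"

# Write the Deployment manifest to a local file (nothing is applied here).
cat > cpu-heavy.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cpu-heavy
  namespace: ${NAMESPACE}
spec:
  replicas: 3                        # hypothetical replica count
  selector:
    matchLabels:
      app: cpu-heavy
  template:
    metadata:
      labels:
        app: cpu-heavy
    spec:
      # Spread pods one per node so each pod drives its own instance choice.
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: cpu-heavy
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
          resources:
            requests:
              cpu: "6"
              memory: 1Gi            # kept small so CPU dominates the request
EOF
```

Apply it with oc apply -f cpu-heavy.yaml.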

After nodes provision (~3–4 minutes):

The memory workload lands on r-family instances; the CPU workload lands on c-family instances — no manual node group configuration required.

Use Case 3 — Spot Instance Optimization

Show cost savings through automatic Spot instance usage.

Spot instances can deliver 60–90% cost savings vs On-Demand. Karpenter monitors EC2 Spot markets across instance types and Availability Zones to find the cheapest available capacity, with automatic fallback to On-Demand when Spot is unavailable.
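A fault-tolerant workload can ask for Spot capacity through the well-known karpenter.sh/capacity-type node label. The sketch below assumes the NodePool allows spot. The namespace, Deployment name, replica count, resource requests, and image are all hypothetical.

```shell
# Hypothetical namespace name; override by exporting NAMESPACE first.
NAMESPACE="${NAMESPACE:-autonode-demo}"

# Write the Deployment manifest to a local file (nothing is applied here).
cat > spot-batch.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spot-batch
  namespace: ${NAMESPACE}
spec:
  replicas: 5                        # hypothetical replica count
  selector:
    matchLabels:
      app: spot-batch
  template:
    metadata:
      labels:
        app: spot-batch
    spec:
      # Require Spot capacity for these pods.
      nodeSelector:
        karpenter.sh/capacity-type: spot
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
          resources:
            requests:
              cpu: "1"
              memory: 1Gi
EOF
```

Note that a hard nodeSelector pins these pods to Spot, so they stay Pending instead of falling back to On-Demand when no Spot capacity exists. Omitting the selector leaves the capacity-type choice to the NodePool.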

Use Case 4 — Consolidation (Scale Down)

Show Karpenter automatically reclaiming unused capacity.

Within ~60 seconds Karpenter identifies underutilized nodes, cordons and drains them, reschedules remaining pods onto fewer nodes, and terminates the unused EC2 instances.

Use Case 5 — Coexistence with Machine Pools

Karpenter-managed nodes and existing ROSA machine pool nodes run side by side in the same cluster. You can use node selectors and affinity rules to direct specific workloads to either provisioner. This enables a gradual migration — existing workloads stay on managed machine pools while new workloads adopt Karpenter at your own pace.

View existing machine pools

Optionally enable Cluster Autoscaler on a machine pool

To compare Karpenter with traditional Cluster Autoscaler scaling, enable autoscaling on an existing machine pool:

Verify autoscaling is enabled:

Deploy a workload targeting Karpenter nodes

Karpenter-provisioned nodes carry the autonode: "true" label from the NodePool template. Use a nodeSelector to direct workloads exclusively to these nodes:
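A sketch of such a workload follows. The Deployment name karpenter-only and the autonode: "true" label come from this guide. The namespace, replica count, resource requests, and image are assumptions.

```shell
# Hypothetical namespace name; override by exporting NAMESPACE first.
NAMESPACE="${NAMESPACE:-autonode-demo}"

# Write the Deployment manifest to a local file (nothing is applied here).
cat > karpenter-only.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: karpenter-only
  namespace: ${NAMESPACE}
spec:
  replicas: 3                        # hypothetical replica count
  selector:
    matchLabels:
      app: karpenter-only
  template:
    metadata:
      labels:
        app: karpenter-only
    spec:
      # Schedule only onto nodes labeled by the NodePool template.
      nodeSelector:
        autonode: "true"
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
          resources:
            requests:
              cpu: "1"
              memory: 1Gi
EOF
```

Apply it with oc apply -f karpenter-only.yaml.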

Deploy a workload targeting machine pool nodes only

Use node affinity to ensure the workload never schedules on Karpenter-managed nodes. The replica count and CPU request are sized to exceed the available capacity of the existing machine pool nodes, which forces the Cluster Autoscaler to provision additional machine pool nodes:
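The affinity rule can be sketched as below, using a DoesNotExist match on the autonode label so the pods avoid every Karpenter-managed node. The Deployment name machinepool-only comes from this guide. The namespace, replica count, CPU request, and image are assumptions and should be tuned to exceed the free capacity of your machine pool.

```shell
# Hypothetical namespace name; override by exporting NAMESPACE first.
NAMESPACE="${NAMESPACE:-autonode-demo}"

# Write the Deployment manifest to a local file (nothing is applied here).
cat > machinepool-only.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: machinepool-only
  namespace: ${NAMESPACE}
spec:
  replicas: 8                        # hypothetical; size to exceed free capacity
  selector:
    matchLabels:
      app: machinepool-only
  template:
    metadata:
      labels:
        app: machinepool-only
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  # Exclude any node carrying the autonode label.
                  - key: autonode
                    operator: DoesNotExist
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
          resources:
            requests:
              cpu: "2"               # hypothetical CPU request
              memory: 1Gi
EOF
```

Apply it with oc apply -f machinepool-only.yaml.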

Verify the Cluster Autoscaler scaled the machine pool

Watch for new nodes and confirm they came from the machine pool (no autonode label) rather than Karpenter:

Once new nodes appear, verify the pods scheduled correctly:

Confirm the machine pool replica count increased:

What to observe: Pods targeting machine pool nodes go Pending because existing nodes are full. The Cluster Autoscaler detects the unschedulable pods, scales the machine pool up, and the new nodes carry standard machine pool labels — no autonode label, no karpenter.sh/nodepool label. This confirms the two provisioners are operating independently on the same cluster.

Verify workload placement side by side

Expected result — two distinct groups of nodes:

Node	`autonode`	Instance Type	Capacity Type	Provisioner
`ip-10-x-x-x`	`true`	`c7i-flex.2xlarge`	`spot`	Karpenter
`ip-10-x-x-x`	(none)	`m5.xlarge`	(none)	Cluster Autoscaler

Pods from karpenter-only will appear on nodes with autonode=true; pods from machinepool-only will appear on machine pool nodes with no autonode label.

Cleanup

After the namespace is deleted, both provisioners reclaim their nodes automatically. Karpenter starts removing its now-empty nodes about 30 seconds after the workloads are gone (the consolidateAfter: 30s setting in the NodePool). The Cluster Autoscaler scales the machine pool back down to its minimum replica count within a few minutes once the nodes are no longer needed.

Summary

Capability	What It Shows	Business Value
Right-sizing	CPU vs memory workloads get different instance families	No over-provisioning; pay only for what you need
Spot optimization	Batch workloads automatically use Spot	60–90% cost reduction for fault-tolerant workloads
Consolidation	Scale down → nodes disappear in ~60s	No stranded capacity; cluster continuously optimizes
Zero overhead	No Karpenter pods in `oc get pods -A`	Hosted control plane takes the operational burden
Coexistence	Machine pool + Karpenter nodes side by side	Gradual migration, no big-bang cutover required
400+ instance types	`oc get nodes -L node.kubernetes.io/instance-type` shows variety	No manual node group configuration per instance type