This content is authored by Red Hat experts, but has not yet been tested on every supported configuration.
Deploying and Running Ollama and Open WebUI in a ROSA Cluster with GPUs
Red Hat OpenShift Service on AWS (ROSA) provides a managed OpenShift environment that can leverage AWS GPU instances. This guide walks you through deploying Ollama and Open WebUI on ROSA using GPU instances for inference.
Prerequisites
- A Red Hat OpenShift Service on AWS (ROSA Classic or HCP) 4.14+ cluster
- The OpenShift CLI (`oc`) with admin access to the cluster
- The ROSA CLI (`rosa`)
Set up GPU-enabled Machine Pool
First, we need to check the availability of the instance type used here (g4dn.xlarge); it must be offered in the same region as the cluster. Note that you can also use Graviton-based (ARM64) instances such as the g5g family, but only on HCP 4.16+ clusters.
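One way to perform this check is with the AWS CLI, which can list the Availability Zones in a region where a given instance type is offered. This is a sketch; the region (`us-east-1` here) is an assumption and should be replaced with your cluster's region.

```shell
# List the Availability Zones in the region that offer g4dn.xlarge.
# Replace us-east-1 with your cluster's region.
aws ec2 describe-instance-type-offerings \
  --location-type availability-zone \
  --filters Name=instance-type,Values=g4dn.xlarge \
  --region us-east-1 \
  --output table
```

If the output table is empty, the instance type is not available in that region and you should pick a different region or instance type before creating the machine pool.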