How to run and deploy LLMs using Red Hat OpenShift AI on a Red Hat OpenShift Service on AWS cluster

Learn how to install the Red Hat® OpenShift® AI (RHOAI) operator and Jupyter notebook, create an Amazon S3 bucket, and run the LLM model on a Red Hat OpenShift Service on AWS (ROSA) cluster.

Disclaimer: this content is authored by Red Hat experts, but has not yet been tested on every supported configuration.


Installing Red Hat OpenShift AI and Jupyter notebook

20 mins

Before beginning this resource, ensure that you have completed the prerequisite steps. You will need access to your Red Hat® OpenShift® Service on AWS (ROSA) cluster and to the cluster console in order to proceed.

What will you learn?

  • How to install the Red Hat OpenShift AI (RHOAI) operator
  • How to install a Jupyter notebook

What do you need before starting?

  • Red Hat account
  • ROSA cluster
  • Console access

How to install Red Hat OpenShift AI and Jupyter notebook

From your cluster console, go to Operators > OperatorHub in the left-hand menu and search for “Red Hat OpenShift AI” to find the operator. Note: The most recent version of the operator at the time of writing is 2.9.0.

Choose the default option for the installation. The operator will be installed in the redhat-ods-operator namespace, which will be created automatically upon installation.
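If you prefer the CLI, the same installation can be sketched as a set of manifests applied with `oc apply -f`. This is a minimal sketch, not the console's exact output: the namespace and package name (`rhods-operator`) match recent RHOAI releases, but verify the channel and catalog names against your cluster's OperatorHub before applying.

```yaml
# Sketch of a CLI-based install of the RHOAI operator.
# Channel/package names are assumptions; confirm them with:
#   oc get packagemanifests -n openshift-marketplace | grep rhods
apiVersion: v1
kind: Namespace
metadata:
  name: redhat-ods-operator
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: rhods-operator
  namespace: redhat-ods-operator
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: rhods-operator
  namespace: redhat-ods-operator
spec:
  channel: stable
  name: rhods-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
```

An empty-spec OperatorGroup targets all namespaces, which is what the console's default install mode uses for this operator.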

Screenshot: RHOAI operator in the OperatorHub

Afterward, create a DataScienceCluster (DSC) instance. Once again, choose the default option for the installation. The name of the DSC instance will be default-dsc.
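The console's default DSC instance can also be expressed as a manifest. The sketch below is an assumption based on the CRD shipped with recent operator versions; component names and defaults may differ on yours, so compare against the YAML view in the console before using it.

```yaml
# Hypothetical sketch of the default-dsc instance; field names may
# vary by operator version. Only two components are shown here.
apiVersion: datasciencecluster.opendatahub.io/v1
kind: DataScienceCluster
metadata:
  name: default-dsc
spec:
  components:
    dashboard:
      managementState: Managed   # serves the RHOAI web console
    workbenches:
      managementState: Managed   # enables Jupyter notebook servers
```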

Screenshot: Prompt to create a DSC after RHOAI operator installation

Below is what you should see once the instance is created:

Screenshot: The created DSC instance displayed in the OpenShift console

Next, go to your cluster console and click the application launcher (9-box) icon in the upper right, next to the bell/notification icon. Then select Red Hat OpenShift AI to launch it in a new tab.
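You can also retrieve the dashboard URL from the CLI. The route and namespace names below are assumptions based on recent RHOAI releases; verify them on your cluster first.

```shell
# Print the RHOAI dashboard URL. The route name (rhods-dashboard) and
# namespace (redhat-ods-applications) are assumptions; if they differ,
# find them with: oc get routes -A | grep dashboard
oc get route rhods-dashboard -n redhat-ods-applications \
  -o jsonpath='https://{.spec.host}{"\n"}'
```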

Screenshot: RHOAI shortcut for ROSA within the 9-box application launcher menu

Once launched, look under Applications in the left-hand menu and select Enabled. Then launch Jupyter to see the notebook images available to install.

In this case, choose TensorFlow 2024.1. Leave the container size set to Small, which is the default. Finally, click the Start server button at the bottom. Note that if the server fails to start, you may need to scale up your worker nodes.
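If the server cannot schedule for lack of capacity, one way to add worker nodes on ROSA is the `rosa` CLI. A sketch, where `my-cluster`, the machine pool ID, and the replica count are placeholders for your own values:

```shell
# List machine pools to find the one backing your workers
# ("my-cluster" is a placeholder for your cluster name)
rosa list machinepools --cluster my-cluster

# Scale the chosen pool up; pick a replica count that fits your workload
rosa edit machinepool workers --cluster my-cluster --replicas 3
```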

Screenshot: RHOAI notebook images

The server startup will take several minutes. Once it is running, you'll see the main page of your Jupyter notebook (see the example below).

Screenshot: RHOAI server successful installation prompt

To start the next section, open a Python 3.9 notebook in a new tab.
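As a quick first cell, you can confirm that the notebook image provides the libraries you expect. This sketch uses only the standard library to probe for packages, so it runs in any image; the package names checked are examples.

```python
import importlib.util
import sys

# Report the interpreter version and whether the expected packages are
# importable in this image (the TensorFlow 2024.1 image should have them).
print("Python:", sys.version.split()[0])

for pkg in ("tensorflow", "numpy"):
    found = importlib.util.find_spec(pkg) is not None
    print(f"{pkg} available: {found}")
```

`find_spec` only checks whether the package can be located; it does not import it, so the cell stays fast even when TensorFlow is present.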

Screenshot: Jupyter notebook launcher in a new tab

With that, you’re ready to move on to the next resource, where you’ll learn how to create and grant access to an Amazon S3 bucket. 

Previous resource: Prerequisites
Next resource: Creating an S3 bucket

This learning path is for operations teams or system administrators.

Developers might want to check out how to create a natural language processing (NLP) application using Red Hat OpenShift AI on developers.redhat.com.
