Training a lightweight text classification model on Red Hat OpenShift AI on ROSA and storing model artifacts in Amazon S3
This content is authored by Red Hat experts, but has not yet been tested on every supported configuration. This guide has been validated on OpenShift 4.21. Operator CRD names, API versions, and console paths may differ on other versions.
Red Hat OpenShift AI has evolved significantly in recent releases. In this walkthrough, we install OpenShift AI on ROSA, disable unnecessary KServe-related dependencies for a notebook-only workflow, create an Amazon S3 connection, launch a workbench, train a lightweight text classification model, and upload the resulting model artifacts to Amazon S3.
Note that this article does not cover model serving. Starting with Red Hat OpenShift AI 2.25, the KServe Serverless deployment mode is deprecated. KServe RawDeployment remains available and uses fewer dependencies.
0. Prerequisites
Before you start, make sure you have:
- a ROSA cluster with cluster-admin level access
- the OpenShift CLI (oc) installed
- a default dynamic storage class on the cluster (typically already present on ROSA)
- an Amazon S3 bucket and AWS credentials, or permission to create them
- a machine pool with enough worker capacity for OpenShift AI
In our validation, a 2-worker m5.xlarge setup was not enough for this walkthrough because some OpenShift AI components could not be scheduled and the dashboard stayed in a Not Ready state. For this reason, plan for at least 3 worker nodes for OpenShift AI, enable autoscaling, or use a dedicated machine pool if your existing workers are already heavily used.
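If you decide to add a dedicated machine pool, a sketch with the rosa CLI looks like the following. The pool name and instance type are illustrative choices, not requirements; replace the cluster name placeholder with your own.

```shell
# Hypothetical example: add a dedicated 3-node machine pool for OpenShift AI.
# The pool name and instance type are illustrative; size to your workload.
rosa create machinepool \
  --cluster <cluster-name> \
  --name rhoai-workers \
  --instance-type m5.2xlarge \
  --replicas 3
```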
Verify that your cluster has healthy worker nodes and a default storage class:
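A quick check from the CLI:

```shell
# Worker nodes should report Ready
oc get nodes -l node-role.kubernetes.io/worker

# One storage class should be annotated "(default)",
# for example gp3-csi on ROSA
oc get storageclass
```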
1. Install the Red Hat OpenShift AI Operator
From the OpenShift web console:
- Go to Ecosystem -> Software Catalog (formerly known as OperatorHub)
- Search for Red Hat OpenShift AI
- Install the operator
- Create a DataScienceCluster when prompted (if it throws an error, wait a few minutes, refresh the page, and try again)
Alternatively, create the DataScienceCluster from the CLI:
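A minimal manifest is sketched below. The resource name and the exact set of component fields are assumptions; as noted earlier, CRD field names can differ between OpenShift AI versions, so compare against oc explain datasciencecluster.spec.components on your cluster.

```shell
# Minimal DataScienceCluster for a notebook-only workflow (illustrative)
cat <<'EOF' | oc apply -f -
apiVersion: datasciencecluster.opendatahub.io/v1
kind: DataScienceCluster
metadata:
  name: default-dsc
spec:
  components:
    dashboard:
      managementState: Managed
    workbenches:
      managementState: Managed
    kserve:
      managementState: Removed
    modelmeshserving:
      managementState: Removed
EOF
```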
On recent OpenShift AI releases, the default installation can stall before the DataScienceCluster becomes ready because it expects Service Mesh-related components, even though this walkthrough only uses workbenches and Amazon S3.
2. Remove the unnecessary serving dependencies
This guide does not use model serving, so remove the unnecessary serving-related dependencies first.
Patch the DSCI to remove Service Mesh:
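A sketch of the patch, assuming the default DSCInitialization is named default-dsci (confirm with oc get dscinitialization; field names may vary across versions):

```shell
# Tell the operator not to manage Service Mesh integration
oc patch dscinitialization default-dsci --type merge \
  -p '{"spec":{"serviceMesh":{"managementState":"Removed"}}}'
```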
Then patch the DSC to remove KServe:
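Assuming the DataScienceCluster is named default-dsc (confirm with oc get datasciencecluster), the equivalent patch is:

```shell
# Disable the KServe component for this notebook-only workflow
oc patch datasciencecluster default-dsc --type merge \
  -p '{"spec":{"components":{"kserve":{"managementState":"Removed"}}}}'
```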
Verify that the cluster becomes ready; the DataScienceCluster status should eventually report a Ready phase.
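A minimal check, again assuming the DataScienceCluster is named default-dsc (the status layout can vary slightly by version):

```shell
# Poll the DataScienceCluster phase; repeat until it reports Ready
oc get datasciencecluster default-dsc -o jsonpath='{.status.phase}'
```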
3. Create the Amazon S3 bucket
Create an S3 bucket in the AWS console or by using the AWS CLI. If you already have an S3 bucket and credentials available, you can skip this step and use your existing bucket instead.
For this walkthrough, I used rhoai-test-s3-bucket in ca-central-1, the same AWS region as the ROSA cluster. Using the same region keeps the example simple and avoids unnecessary cross-region configuration.
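From the AWS CLI, bucket creation looks like this. Note that for any region other than us-east-1, a LocationConstraint is required:

```shell
# Create the bucket in the same region as the ROSA cluster
aws s3api create-bucket \
  --bucket rhoai-test-s3-bucket \
  --region ca-central-1 \
  --create-bucket-configuration LocationConstraint=ca-central-1
```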
The regional Amazon S3 endpoint for ca-central-1 is https://s3.ca-central-1.amazonaws.com.
4. Create a data science project
After OpenShift AI is installed and the DataScienceCluster is ready, open the OpenShift AI dashboard.
- Click Data science projects
- Click Create project
- Enter a project name, for example project-s3
- Optionally add a description
- Click Create
5. Create a workbench and add the S3 connection
Inside the project you just created:
- Open the Workbenches tab
- Click Create workbench
- Enter a workbench name
- Select a Jupyter image
- Choose a modest size for this CPU-only validation
- In the Connections section, click Create connection
- Choose S3 compatible object storage - v1
- Enter the connection details (the values used in this walkthrough are listed below)
- Click Create to create the connection
- Click Create workbench
- Wait for the workbench to become Running
- Click the workbench name to launch it in a new tab
For AWS S3 in ca-central-1, the values used in this walkthrough were:
- Connection name: rhoai-test-s3-connection
- Endpoint: https://s3.ca-central-1.amazonaws.com
- Region: ca-central-1
- Bucket: rhoai-test-s3-bucket
You will also need the AWS access key and secret key for the bucket.
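Once the workbench starts with the connection attached, the connection details are exposed to the pod as environment variables. The variable names below follow the usual OpenShift AI S3 connection convention, but treat them as an assumption and verify them in your own workbench. A quick notebook cell to confirm they are present:

```python
import os

# Environment variables an OpenShift AI S3 connection typically injects
# into the workbench pod (verify the exact names in your environment).
S3_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_S3_ENDPOINT",
    "AWS_DEFAULT_REGION",
    "AWS_S3_BUCKET",
)

def connection_status(names=S3_VARS):
    """Return {variable_name: True/False} for each expected variable."""
    return {name: bool(os.environ.get(name)) for name in names}

for name, present in connection_status().items():
    print(f"{name}: {'set' if present else 'MISSING'}")
```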
6. Install the Python packages
Open a notebook cell in the workbench and run:
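The exact package set below is an assumption for this walkthrough; if your workbench image does not already include PyTorch, add torch to the list.

```shell
pip install transformers datasets accelerate boto3
```

In a Jupyter cell, prefix the command with ! or use the %pip magic.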
After the installation completes, restart the kernel before continuing.
7. Train a lightweight model and upload the artifacts to S3
In this walkthrough, we use distilbert-base-uncased, which is a safe and mainstream choice for a lightweight demo. This example also keeps the dataset intentionally small and uses one epoch so the walkthrough finishes in a reasonable time on CPU.
Before running the next cell, set the AWS_S3_BUCKET environment variable in your workbench if you want to use a bucket name other than the example default.
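The cell below is one possible shape for the training and upload step, consistent with the output described in the next section. The dataset (IMDB), sample counts, and hyperparameters are illustrative assumptions, the S3 credentials are read from the connection's environment variables, and TrainingArguments option names can differ slightly between transformers versions (eval_strategy was previously evaluation_strategy).

```python
import os
import boto3
import numpy as np
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "distilbert-base-uncased"
BUCKET = os.environ.get("AWS_S3_BUCKET", "rhoai-test-s3-bucket")

# Small slices keep CPU training short; the dataset and sizes are
# illustrative assumptions for this walkthrough
raw = load_dataset("imdb")
train_ds = raw["train"].shuffle(seed=42).select(range(200))
eval_ds = raw["test"].shuffle(seed=42).select(range(50))

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

train_ds = train_ds.map(tokenize, batched=True)
eval_ds = eval_ds.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=2)

def accuracy(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": (np.argmax(logits, axis=-1) == labels).mean()}

args = TrainingArguments(
    output_dir="./model",
    num_train_epochs=1,
    per_device_train_batch_size=32,
    eval_strategy="epoch",   # evaluation_strategy on older transformers
    report_to="none",
)

trainer = Trainer(model=model, args=args, train_dataset=train_ds,
                  eval_dataset=eval_ds, compute_metrics=accuracy)
trainer.train()

# Save the artifacts locally, then upload every file to the model/ prefix
trainer.save_model("./model")
tokenizer.save_pretrained("./model")

s3 = boto3.client(
    "s3",
    endpoint_url=os.environ.get("AWS_S3_ENDPOINT"),
    aws_access_key_id=os.environ.get("AWS_ACCESS_KEY_ID"),
    aws_secret_access_key=os.environ.get("AWS_SECRET_ACCESS_KEY"),
    region_name=os.environ.get("AWS_DEFAULT_REGION"),
)
for root, _dirs, files in os.walk("./model"):
    for name in files:
        path = os.path.join(root, name)
        key = "model/" + os.path.relpath(path, "./model")
        s3.upload_file(path, BUCKET, key)
print(f"Uploaded files from ./model to s3://{BUCKET}/model/")
```

With 200 training samples and a batch size of 32, one epoch is 7 optimizer steps, which matches the [7/7] progress line discussed in the next section.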
If training feels slow, feel free to reduce the dataset further:
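For example, assuming the raw IMDB splits were loaded with load_dataset("imdb") into a variable named raw (a hypothetical name), you can shrink the slices further:

```python
# Hypothetical smaller slices; adjust the counts to taste
train_ds = raw["train"].shuffle(seed=42).select(range(64))
eval_ds = raw["test"].shuffle(seed=42).select(range(16))
```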
8. Understanding the notebook output
Your notebook output may include warnings and progress messages like those described below.
Note that some of the notebook output is expected and does not indicate a failure.
The Hugging Face warning simply means the model is being downloaded without an authentication token. The UNEXPECTED and MISSING entries in the DistilBERT load report are also expected when loading a base pretrained model for a sequence classification task.
The tokenization progress confirms that the dataset was processed successfully, and the pin_memory warning only indicates that the workbench is using CPU rather than GPU.
A line such as [7/7 05:01, Epoch 1/1] shows that all training steps completed and that the model finished 1 epoch, or 1 full pass through the dataset.
The metrics table reports the training loss, validation loss, and accuracy for this validation run. Because this walkthrough intentionally uses a very small dataset and a single epoch, the goal is to validate the workflow rather than optimize model quality.
Finally, output such as Uploaded files from ./model to s3://rhoai-test-s3-bucket/model/ confirms that the model artifacts were saved locally and uploaded successfully to Amazon S3.
To verify that, open the S3 bucket in the AWS console and confirm that the model/ prefix contains artifacts such as:
- config.json
- model.safetensors
- tokenizer_config.json
- tokenizer.json
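You can also list the prefix from the AWS CLI:

```shell
# List the uploaded model artifacts
aws s3 ls s3://rhoai-test-s3-bucket/model/
```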
That confirms the end-to-end workflow from OpenShift AI workbench to Amazon S3.