Creating Agentic AI to deploy an ARO cluster using Terraform with Red Hat OpenShift AI on ROSA and Amazon Bedrock
This content is authored by Red Hat experts, but has not yet been tested on every supported configuration.
1. Introduction
Agentic AI can be defined as systems that are capable of interpreting natural language instructions, in this case users’ prompts, making decisions based on those prompts, and then autonomously executing tasks on behalf of users. In this guide, we will create one that is intelligent enough not only to understand and parse users’ prompts, but also to act on them by deploying (and destroying) an Azure Red Hat OpenShift (ARO) cluster using Terraform.
Terraform is an automation tool, sometimes referred to as an Infrastructure as Code (IaC) tool, that allows us to provision infrastructure using declarative configuration files. The agentic AI in this guide will provision those clusters based on our MOBB Terraform repository for ARO. It runs on Red Hat OpenShift AI (RHOAI), our platform for managing the AI/ML project lifecycle, running on a Red Hat OpenShift Service on AWS (ROSA) cluster. In addition, we will be using Anthropic’s Claude 3 Sonnet model via Amazon Bedrock.
In short, the objective of this guide is to introduce you to what I’d like to call Prompt-based Infrastructure or perhaps, Text-to-Terraform. Specifically, the agentic AI we are creating will be able to deploy (and destroy) an ARO cluster based on users’ prompts, covering whether the cluster is private or public, which region, the type and number of worker nodes, which cluster version, and so forth. I will specify the prompts’ parameters in the relevant sections and highlight the differences between the default parameters in this guide and those in the Terraform repository.
Note that since a real deployment could be costly, I set up a simulator with a mock toggle that you can set to True for mock results or to False for a real cluster deployment.
As usual, before we move forward, kindly note the disclaimers below.
Disclaimers: Note that this guide references Terraform repositories that are actively maintained by the MOBB team and may change over time. Always check the repository documentation for the latest syntax, variables, and best practices before deployment. In addition, when using this agentic AI, please be aware that while the system is designed to interpret natural language instructions and autonomously execute infrastructure configurations, it is not infallible. The agentic AI may occasionally misinterpret requirements or generate suboptimal configurations. It is your responsibility to review all generated Terraform configurations before applying them to your cloud environment. Neither the author of this implementation nor the service providers can be held responsible for any unexpected infrastructure deployments, service disruptions, or cloud usage charges resulting from configurations executed by the agentic AI. Lastly, please note that user interfaces may change over time as the products evolve. Some screenshots and instructions may not exactly match what you see.
2. Prerequisites
- I tested this on a ROSA HCP 4.18.14 cluster with `m5.8xlarge` instance size for the worker nodes.
- Amazon Bedrock
- You could use any model of your choice via Amazon Bedrock, but in this guide, we’ll use Anthropic Claude 3 Sonnet. If you have not already, please proceed to your AWS Console and make sure that the model (or the model of your choice) is enabled and that your account has the right permissions for Amazon Bedrock.
- RHOAI operator
- You can install it using the console per Section 3 in this tutorial or using the CLI per Section 3 in this tutorial.
- Once you have the operator installed, be sure to install a `DataScienceCluster` instance, wait a few minutes for the changes to take effect, and then launch the RHOAI dashboard for the next step.
- I tested this tutorial using RHOAI version 2.19.0.
3. Setup
First, we will create the setup file. To do so, you will need your Azure credentials: your Azure Client ID, Azure Client Secret, Azure Tenant ID, and Azure Subscription ID.
You can retrieve these credentials via the Azure Portal or via the az CLI, the latter being the easier option. To do so via the CLI, first run:
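```bash
# Assumes you are already logged in via 'az login'
az account show
```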
From the output, use the `id` value as your Azure Subscription ID and the `tenantId` value as your Azure Tenant ID.
As for the other credentials, you can either use the existing Service Principal or create a new one. For the latter, run the following (give it a proper name and replace the subscription ID):
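```bash
# Replace the name and the subscription ID with your own values
az ad sp create-for-rbac --name <service-principal-name> \
  --role Contributor \
  --scopes /subscriptions/<subscription-id>
```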
From the output, take the `appId` as your Azure Client ID and the `password` as your Azure Client Secret. Keep all these credentials handy for the next step.
The setup here is essentially an environment bootstrapping module: it handles dependency installation (Terraform, Azure CLI, boto3), configures the Azure service principal credentials, validates AWS IAM permissions for Bedrock access, and ensures the execution environment is properly initialized.
On the RHOAI dashboard, launch a Jupyter notebook instance. In this example, we will be using the TensorFlow 2025.1 image with the Medium container size for the notebook. This might take a few minutes to provision.
And once the notebook is ready, go to the File tab on the upper left, choose New, and select Python File. Copy the lines below, then save and rename the file as setup_aro_bedrock.py. Replace the env vars with your Azure credentials.
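For a feel of the file's shape, here is a minimal sketch. The `ARM_*` names are the standard environment variables read by Terraform's azurerm provider; the helper functions and the Bedrock connectivity check are illustrative assumptions, not the exact script.

```python
# setup_aro_bedrock.py -- minimal bootstrapping sketch (illustrative)
import os
import subprocess
import sys

# Azure service principal credentials -- replace with your own values.
# ARM_* are the standard variable names read by Terraform's azurerm provider.
os.environ["ARM_CLIENT_ID"] = "<your-azure-client-id>"
os.environ["ARM_CLIENT_SECRET"] = "<your-azure-client-secret>"
os.environ["ARM_TENANT_ID"] = "<your-azure-tenant-id>"
os.environ["ARM_SUBSCRIPTION_ID"] = "<your-azure-subscription-id>"


def install_python_dependencies():
    """Install the Python packages the agent needs (boto3 for Bedrock)."""
    subprocess.run([sys.executable, "-m", "pip", "install", "--quiet", "boto3"],
                   check=True)


def check_terraform():
    """Verify that the terraform binary is available on the PATH."""
    result = subprocess.run(["terraform", "version"],
                            capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError("Terraform not found; install it before proceeding.")
    print(result.stdout.splitlines()[0])


def check_bedrock_access():
    """Confirm the AWS credentials in the environment can reach Bedrock."""
    import boto3
    bedrock = boto3.client("bedrock",
                           region_name=os.environ.get("AWS_REGION", "us-east-1"))
    models = bedrock.list_foundation_models()
    print(f"Bedrock reachable; {len(models['modelSummaries'])} models visible.")


if __name__ == "__main__":
    install_python_dependencies()
    check_terraform()
    check_bedrock_access()
    print("Environment bootstrapped.")
```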
4. Bedrock parser
Next, let’s create the parser, which acts as the natural language processing interface: it uses a foundation model on Amazon Bedrock, in this case Claude 3 Sonnet, to extract structured parameters from unstructured text. That way, the agent understands our prompts intelligently and converts them into technical parameters the system can use.
Here we also set the default parameters used when the user’s prompt does not specify them, such as cluster name, region, worker node size, worker node count, version, and private/public. Note that some of these defaults differ slightly from the Terraform repository: for example, the default region here is westus and the default cluster name is agentic-aro, so feel free to adjust these parameters accordingly. Also note that it will spin up the latest version if the user does not specify one in the prompt.
Create a new Python file called parser_bedrock.py, copy the code below into it, and save it.
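As a rough illustration of this parser's shape, a minimal version might look like the sketch below. The model ID is the Bedrock identifier for Claude 3 Sonnet; the system prompt, the JSON keys, and the worker node size default are illustrative assumptions that you should align with the Terraform repository's variables.

```python
# parser_bedrock.py -- illustrative sketch of the prompt parser
import json
import boto3

# Model ID for Anthropic Claude 3 Sonnet on Amazon Bedrock
MODEL_ID = "anthropic.claude-3-sonnet-20240229-v1:0"

# Defaults applied when the user's prompt does not specify a parameter
DEFAULTS = {
    "cluster_name": "agentic-aro",
    "region": "westus",
    "worker_node_size": "Standard_D4s_v3",  # illustrative default
    "worker_node_count": 3,
    "version": None,          # None -> latest available version
    "private": False,
}

SYSTEM_PROMPT = (
    "Extract ARO cluster parameters from the user's request and reply with "
    "JSON only, using these keys: cluster_name, region, worker_node_size, "
    "worker_node_count, version, private, action (deploy or destroy). "
    "Omit any key the user did not mention."
)


def parse_prompt(user_prompt: str) -> dict:
    """Send the user's natural-language prompt to Claude 3 Sonnet and
    return the extracted parameters merged over the defaults."""
    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 512,
        "system": SYSTEM_PROMPT,
        "messages": [{"role": "user", "content": user_prompt}],
    })
    response = client.invoke_model(modelId=MODEL_ID, body=body)
    completion = json.loads(response["body"].read())
    # Assumes the model returns bare JSON; production code should
    # validate and handle malformed responses.
    extracted = json.loads(completion["content"][0]["text"])
    return {**DEFAULTS, **extracted}
```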
5. Deployment agent
Now, we will create the ARO deployment agent, which is essentially the wrapper that dynamically generates Terraform configurations for the ARO deployment. It also manages the complete lifecycle, including state management, Azure authentication, and resource provisioning.
Copy the lines below and save them as aro.py.
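A minimal sketch of such a wrapper follows, assuming the MOBB repository lives at https://github.com/rh-mobb/terraform-aro and using illustrative tfvars variable names; check the repository's variables.tf for the actual names.

```python
# aro.py -- illustrative sketch of the ARO deployment agent
import json
import os
import subprocess

# Assumed location of the MOBB Terraform repository for ARO
REPO_URL = "https://github.com/rh-mobb/terraform-aro.git"


class AroDeploymentAgent:
    """Wraps the Terraform lifecycle (init/apply/destroy) for one cluster.
    Each instance uses its own working directory so that Terraform state
    from concurrent deployments never collides."""

    def __init__(self, workdir: str = "tf-workdir"):
        self.workdir = workdir

    def _terraform(self, *args: str) -> None:
        """Run a terraform subcommand inside the working directory."""
        subprocess.run(["terraform", *args], cwd=self.workdir, check=True)

    def prepare(self, params: dict) -> None:
        """Clone the repository (once) and render the parsed parameters
        into terraform.tfvars.json. The variable names below are
        illustrative; match them to the repository's variables.tf."""
        if not os.path.isdir(self.workdir):
            subprocess.run(["git", "clone", REPO_URL, self.workdir], check=True)
        tfvars = {
            "cluster_name": params["cluster_name"],
            "location": params["region"],
            "worker_node_size": params["worker_node_size"],
            "worker_node_count": params["worker_node_count"],
            "private_cluster": params["private"],
        }
        if params.get("version"):
            tfvars["version"] = params["version"]
        with open(os.path.join(self.workdir, "terraform.tfvars.json"), "w") as f:
            json.dump(tfvars, f, indent=2)

    def deploy(self) -> None:
        """Provision the cluster; Azure credentials come from the ARM_*
        environment variables exported by setup_aro_bedrock.py."""
        self._terraform("init")
        self._terraform("apply", "-auto-approve")

    def destroy(self) -> None:
        """Tear the cluster down using the state in this working directory."""
        self._terraform("destroy", "-auto-approve")
```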
6. Simulator
Next, we will create the orchestrator, or the brain of the agent. This simulator is a high-level orchestration layer implementing a state machine for deployment/deletion operations, and it handles both mock and real deployments (and destructions) depending on what we ask for.
Copy the lines below and save them as aro_bedrock_simulator.py.
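A rough sketch of this layer, assuming the parse_prompt and AroDeploymentAgent interfaces from the previous sketches, might look like this:

```python
# aro_bedrock_simulator.py -- illustrative sketch of the orchestration layer
from parser_bedrock import parse_prompt
from aro import AroDeploymentAgent


def run(prompt: str, mock: bool = True, workdir: str = "tf-workdir") -> dict:
    """Parse the prompt, then either simulate (mock=True) or execute
    (mock=False) the requested deploy/destroy action. A fuller version
    would also track intermediate states (parsing, planning, applying)."""
    params = parse_prompt(prompt)
    action = params.get("action", "deploy")
    if mock:
        print(f"[MOCK] Would {action} a cluster with these parameters:")
        for key, value in params.items():
            print(f"  {key}: {value}")
        return params
    agent = AroDeploymentAgent(workdir)
    if action == "destroy":
        agent.destroy()
    else:
        agent.prepare(params)
        agent.deploy()
    return params
```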
7. Notebook integration
Finally, let’s create the notebook where we will run our prompts. Note that here you will need your AWS credentials, namely your AWS Access Key ID and AWS Secret Access Key, to access Amazon Bedrock.
On your notebook console, go to the File tab on the upper left, choose New, and select Notebook. Copy the lines below into one cell and save it as aro_notebook.ipynb. Replace the credential placeholders with your AWS credentials.
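Assuming the modules above, the single cell might look roughly like the following:

```python
# aro_notebook.ipynb -- single-cell sketch (illustrative)
import os

# AWS credentials for Amazon Bedrock. Hardcoded here for demo purposes only;
# see the Future research section for a more secure approach.
os.environ["AWS_ACCESS_KEY_ID"] = "<your-aws-access-key-id>"
os.environ["AWS_SECRET_ACCESS_KEY"] = "<your-aws-secret-access-key>"
os.environ["AWS_REGION"] = "us-east-1"

from aro_bedrock_simulator import run

# Set mock=False to perform a real deployment
run("Deploy a public ARO cluster named demo-aro in eastus with three "
    "Standard_D8s_v3 worker nodes", mock=True)
```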
Here the mock toggle above is currently set to True, so if you run this notebook, it will only show you what a run would look like, as shown below.

And if you set it to False for a real deployment, it will look like the following.

Note that every notebook run keeps its Terraform state in its own folder/directory, so if you want to create another cluster at the same time, simply duplicate the notebook and run the new one.
8. Future research
There are many things that we can improve in this guide and/or in future guides. Since we’re doing a simple demonstration here, we’re hardcoding the credentials in the notebook, which does not follow security best practices, so it would be great to use IRSA instead. It would also be nice if we could spin up (and destroy) not only ARO clusters but also ROSA clusters. And on that note, it would also be interesting to create a similar agent on an ARO cluster using Azure OpenAI’s models. For a production use case, ideally you would want to integrate this with Slack, Microsoft Teams, or something similar to make it more user-friendly and more accessible to non-technical users.