With the release of Red Hat Advanced Cluster Management for Kubernetes (RHACM) version 2.3, Ansible integration is supported in the governance lifecycle of the product. This feature lets you configure an Ansible Tower job template to run when a policy violation occurs on one or more managed clusters. This blog provides an example of how this integration works and how you can use it in your environment.
The scenario explored in this blog post is a web application that runs on Red Hat OpenShift Container Platform (RHOCP) and requires an SSL/TLS certificate for HTTPS connections. The certificate is stored as an RHOCP secret and is mounted in the container for the Apache web server to use. Because certificates often expire in relatively short time frames, it is critical that you are notified before a certificate expires. With a certificate policy in RHACM, you can monitor the expiration of the SSL/TLS certificate and associate Ansible automation with it. Visit the blog, How to use the Certificate Policy Controller to Identify Risks in Red Hat Advanced Cluster Management for Kubernetes, to learn more about certificate policies.
In more detail, I share the configuration of a CertificatePolicy resource, where a policy violation is reported when the SSL/TLS certificate is set to expire within 30 days. The policy violation that is listed in the RHACM console initiates an Ansible Tower job to create a ServiceNow incident (i.e. ticket), notifying you of the impending SSL/TLS certificate expiration. From there, you can renew the SSL/TLS certificate of the web application and update the RHOCP secret with the renewed SSL/TLS certificate. Though this is not explored in this blog post, Ansible can also be used to perform the renewal and replacement of the certificate automatically.
Please note that many of these steps can be performed with either the command line interface (CLI) or from the console. This blog uses a mixture based on whichever is easier to demonstrate.
The following is required to perform the actions in this blog post:
- A recent version of OpenShift Container Platform 4 or later.
- An installation of Red Hat Advanced Cluster Management 2.3.
- The Ansible Automation Platform Resource Operator installed using the OperatorHub.
- A recent version of Ansible Tower 3.
- Access to ServiceNow to request a development instance for free.
Even if you don’t have access to the previously mentioned requirements, continue to follow along to learn how this all works.
Setting Up the Demo Application
As stated previously, let's walk through creating a demo web application that serves HTTPS using a custom SSL/TLS certificate. From the CLI, create a self-signed SSL/TLS certificate (not recommended for production). The created SSL/TLS certificate is expected to expire 25 days from now, which is necessary to initiate a RHACM policy violation later on. Your SSL/TLS certificate may be created using the following commands:
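# Self-signed (-x509), unencrypted key (-nodes); the -out path is assumed from the tls.crt key name used later.
mkdir -p tls
openssl req \
-x509 \
-newkey rsa:4096 \
-nodes \
-days 25 \
-subj "/C=US/ST=NC/L=Raleigh/O=Example/CN=www.example.com" \
-keyout tls/tls.key \
-out tls/tls.crt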
From the CLI, log in to an RHOCP cluster that is either the RHACM hub cluster or a cluster managed by it. For this example, I use the hub cluster. Then create an RHOCP namespace for the demo application to reside in. Run the following commands:
oc create ns acm-grc-ansible-example
Next, create an RHOCP secret to contain the self-signed SSL/TLS certificate previously generated. Notice that the tls.crt secret key name is used. This is because it is the default key name that RHACM checks when evaluating CertificatePolicy violations. To change the secret key name, see the Updating certificate policies documentation.
oc -n acm-grc-ansible-example create secret generic certs \
Now it’s time to deploy the example web application. Run the following command to create the RHOCP objects:
oc -n acm-grc-ansible-example apply -f \
After you run this command, the following RHOCP objects are created:
- An ImageStream that points to the quay.io/centos7/httpd-24-centos7 container image.
- A DeploymentConfig that creates a container from the ImageStream with the certificate mounted and using the previously created secret.
- A Service, which exposes port 8443 in the container as port 443.
- A Route that points to the Service.
Note: In a production use-case, the Route should be configured with a custom, fully-qualified domain name that matches the SSL/TLS certificate, but to simplify things, let's use the fully-qualified domain name that is generated by RHOCP.
To verify that the web application is deployed and using the self-signed certificate, run the following commands. The commands require a Linux or Mac system, but an alternative is explained later in this section. Note that it may take several seconds for RHOCP to get the demo web application running.
export ROUTE=$(oc -n acm-grc-ansible-example get route | tail -1 | tr -s ' ' | cut -f 2 -d ' ')
echo | openssl s_client -showcerts -connect "$ROUTE:443" 2>/dev/null | openssl x509 -inform pem -noout -text
Here is a snippet of what should be displayed:
Issuer: C = US, ST = NC, L = Raleigh, O = Example, CN = www.example.com
Not Before: Jul 15 16:33:29 2021 GMT
Not After : Aug 9 16:33:29 2021 GMT
Subject: C = US, ST = NC, L = Raleigh, O = Example, CN = www.example.com
Alternatively, you can visit the URL of the RHOCP Route that is created in the acm-grc-ansible-example RHOCP namespace with your web browser and examine the certificate.
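As a rough local counterpart to the expiration check that the certificate policy performs later in this post, the following sketch uses openssl x509 -checkend to ask whether a certificate expires within a given number of days. It generates a throwaway 25-day certificate in a temporary directory so that it runs without a cluster. The expires_within_days helper name and the 2048-bit key size are assumptions made for illustration and are not part of the original example.

```shell
set -eu

# Hypothetical helper: succeeds (exit 0) when the certificate file in $1
# expires within $2 days, and fails (exit 1) otherwise.
expires_within_days() {
  # -checkend exits 0 when the certificate is still valid after N seconds,
  # so the result is inverted here.
  ! openssl x509 -in "$1" -noout -checkend "$(( $2 * 24 * 60 * 60 ))" >/dev/null
}

workdir="$(mktemp -d)"
trap 'rm -rf "$workdir"' EXIT

# Throwaway self-signed certificate that is valid for 25 days.
openssl req -x509 -newkey rsa:2048 -nodes -days 25 \
  -subj "/CN=www.example.com" \
  -keyout "$workdir/tls.key" -out "$workdir/tls.crt" 2>/dev/null

# 30 days corresponds to the 720h threshold used by the policy in this post.
if expires_within_days "$workdir/tls.crt" 30; then
  result="expiring"
else
  result="ok"
fi
echo "$result"   # prints "expiring" for the 25-day certificate
```

To check the certificate created earlier instead, the same helper could be pointed at that file, for example expires_within_days tls/tls.crt 30, assuming that is where the certificate was written.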
Setting Up Ansible Tower
To create a ServiceNow incident when the certificate is near expiration, you need Ansible automation that Ansible Tower can run. The open-cluster-management/grc-ansible-integration-blog GitHub repository contains an example Ansible playbook that is used for this blog. This playbook uses a local Ansible connection to create a temporary Python virtual environment, installs in it the Python dependencies that the snow_record Ansible module requires, and creates a ServiceNow incident using that module. See Working With Modules for more information.
Creating the Fork
In order to configure the Ansible playbook to use your ServiceNow instance, start by creating a fork of the repository on a Git forge (e.g. GitHub, internal GitLab, etc.) that your Ansible Tower instance has access to read. Once you have done so, proceed to complete the following steps:
Clone the fork locally.
From the CLI, create an Ansible vault (encrypted file) at
ansible/vaults/secret-vars.yml, in the directory of the cloned forked repository. This vault must include the variables
snow_username. After creating the vault, store the vault password securely. If a Linux or Mac system is being used to create the file, the command might resemble the following example:
cat <<EOT >> ansible/vaults/secret-vars.yml
ansible-vault encrypt ansible/vaults/secret-vars.yml
Commit the vaulted file using git and push it to the fork.
Configuring Ansible Tower
Now that the Ansible content is in place, configure Ansible Tower to run it. Start by creating a new Ansible Tower project. For this blog, the SCM URL field value is the URL of the forked repository. The SCM UPDATE OPTIONS section is optional, but is relied upon in this blog post. View the following image of the Ansible Tower project named GRC Ansible Integration Blog:
Once the project is created, create a new inventory and inventory source. This inventory contains a group called create_ticket and is set to use a local Ansible connection, so no external host is required to run the playbook.
Once the inventory is created, sync the project from the Projects page. Note that the UPDATE OPTIONS section is not required, but is relied upon in this blog post. If the inventory file is not shown in the drop-down menu, type it in manually and press the Enter key, as shown in the following images:
Next, create an Ansible Vault credential so that Ansible Tower can decrypt the Ansible vault that was previously created. The Vault credential should resemble the following image:
At this point, Ansible Tower is configured to know about the playbook and inventory in the forked repository. It is also configured to decrypt the Ansible vault you previously created. Next, Ansible Tower needs to be configured to run the Ansible playbook. To do this, create an Ansible Tower job template. Note that the credential in the CREDENTIALS section is the Ansible Tower vault credential previously created. Also, the PROMPT ON LAUNCH checkbox next to the EXTRA VARIABLES section is required. This is because RHACM provides extra variables by default and also supports custom extra variables. View the following image as a reference:
User Access Token
Lastly, RHACM requires the permission to launch the Ansible Tower job template that was previously created. To do so, create a token that grants you access to run the job template and securely store it.
Note: It is best practice to use a separate Ansible Tower service account with access to run that specific job template.
Setting Up RHACM
Configuring the Policy
Now that there is a demo web application running and Ansible Tower is configured for the RHACM governance integration, it’s time to create a policy! This example scenario requires a certificate policy that detects when an SSL/TLS certificate is within 30 days of expiring in the acm-grc-ansible-example RHOCP namespace.
To do so, create the appropriate PlacementRule objects with the following command:
oc apply -f \
The PlacementRule object matches every managed cluster, including the hub cluster. It’s recommended to make this more specific for a production use-case. This also utilizes the acm-grc-ansible-example RHOCP namespace previously created. If the demo web application is deployed on a managed cluster and not the hub cluster, an RHOCP namespace needs to be created on the hub cluster.
The interesting portion of this policy is how specific you can make the policy configuration. Notice how include is set specifically to the acm-grc-ansible-example RHOCP namespace. This is so that the Ansible automation that is configured later is only ever initiated for a certificate that is expiring in this RHOCP namespace. Additionally, the minimumDuration value is set to 720h (720 hours / 24 = 30 days), which means that there is only a policy violation if the certificate is expiring in less than 30 days. View a portion of the created certificate policy with the aforementioned configuration:
When you examine the policies from the Governance page, your view might resemble the policy violation in the following image:
Configuring the Ansible Integration
Ansible Tower Credential
In order for RHACM to be able to connect to Ansible Tower to run the Ansible Tower job template previously created, a credential must be created within RHACM. Complete the following steps:
From the navigation menu, click on Credentials and then click the Add credentials button. In the Automation & other credentials section, select Red Hat Ansible Automation Platform.
Fill in the form with the following values:
- Credentials name: ansible-tower
- Ansible Tower host:
- Replace this value with the actual URL to the Ansible Tower instance.
- Ansible Tower token:
- Replace this value with the actual token that has access to run the job template.
Alternatively, you can create the credential from the CLI. For an example where the token values are replaced with the base64 of the actual values, see the ansible-tower-credential.yml file.
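Kubernetes stores the data values of a Secret base64-encoded, which is why the referenced file expects encoded values. A minimal sketch of producing them from the shell follows. The host URL and token below are placeholders rather than real values, and the variable names are illustrative only.

```shell
set -eu

# Placeholder values (assumptions); substitute your own.
tower_host="https://ansible-tower.example.com"
tower_token="replace-with-your-token"

# printf avoids the trailing newline that echo would add to the encoded value;
# tr strips the line wrapping that some base64 implementations insert.
host_b64="$(printf '%s' "$tower_host" | base64 | tr -d '\n')"
token_b64="$(printf '%s' "$tower_token" | base64 | tr -d '\n')"

echo "Ansible Tower host (base64):  $host_b64"
echo "Ansible Tower token (base64): $token_b64"
```

Encoding with printf rather than echo matters here: a stray newline inside the encoded token would become part of the credential once it is decoded.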
Connecting the Policy to the Ansible Tower Job Template
Now it’s finally time to put all that hard work to use and configure the policy to run the configured Ansible Tower job template on policy violations. From the Governance page in RHACM, a policy that has a cluster violation should be displayed as shown in the following image:
You’ll notice that there is a column named Automation. Click on the Configure link to view the side-panel. For the Credential section, select the ansible-tower credential created earlier. For the Job template field, select GRC Ansible Integration Blog. For the extra variables section, add policy_name: ansible-example-certificatepolicy and target_namespace: acm-grc-ansible-example. The configuration from the Automation violation policy side-panel might resemble the following image:
In the Schedule automation section, select Run once mode. This mode runs the Ansible Tower job template upon the first policy violation. Afterwards, it is immediately set to disabled until a RHACM administrator re-enables it. Finally, click the Save button. Your configuration form might resemble the following image:
Behind the scenes, this creates a PolicyAutomation RHOCP object. Alternatively, you can use the CLI to create this with the following command:
oc apply -f \
When you examine the file being applied, notice that the PolicyAutomation RHOCP object creates an AnsibleJob RHOCP object when initiated. This object is what is picked up by the Ansible Automation Platform Resource Operator to initiate the Ansible job in Ansible Tower:
name: GRC Ansible Integration Blog
Examining the Automation Job
At this point, you can verify that the policy violation caused an AnsibleJob object to be created in the RHOCP namespace where the Policy object was previously created. Run the following commands to view the AnsibleJob, but replace ansible-example-certificatepolicy-policy-automation-once-cdw4g with the value from your environment:
❯ oc -n acm-grc-ansible-example get AnsibleJob
❯ oc describe -n acm-grc-ansible-example AnsibleJob ansible-example-certificatepolicy-policy-automation-once-cdw4g
API Version: tower.ansible.com/v1alpha1
job_template_name: GRC Ansible Integration Blog
Ansible Job Result:
Secret Namespaced Name: default/ansible-tower
Template Name: GRC Ansible Integration Blog
Verify SSL: false
Message: Monitor the job.batch status for more details with the following commands:
'kubectl -n default get job.batch/ansible-example-certificatepolicy-policy-automation-once-cdw4g'
'kubectl -n default describe job.batch/ansible-example-certificatepolicy-policy-automation-once-cdw4g'
'kubectl -n default logs -f job.batch/ansible-example-certificatepolicy-policy-automation-once-cdw4g'
When you examine the output, there are a few interesting things to note:
- The first is that the URL to the initiated Ansible Tower job is shown.
- The extra_vars section contains the extra variable target_namespace that was previously configured, along with the additional variable target_clusters. This variable is automatically supplied by RHACM, and it contains the names of the clusters that violate the configured policy.
- Lastly, there is a message that explains that an RHOCP Job object was created to actually run the Ansible Tower job template and wait for it. Run the following command to check the logs and view the progress, but replace ansible-example-certificatepolicy-policy-automation-once-cdw4g with the value retrieved from the previous commands:
❯ oc -n default logs -f job.batch/ansible-example-certificatepolicy-policy-automation-once-cdw4g
PLAY [localhost] ***************************************************************
TASK [job_runner : Read AnsibleJob Specs] **************************************
TASK [job_runner : awx.awx.tower_job_launch] ***********************************
TASK [job_runner : Update AnsibleJob definition with Tower job id] *************
TASK [job_runner : Update AnsibleJob status with Tower job status and url] *****
TASK [job_runner : tower_job_wait] *********************************************
TASK [job_runner : Update AnsibleJob status with Tower job result] *************
PLAY RECAP *********************************************************************
localhost : ok=6 changed=4 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
You can also verify that the job was initiated successfully from Ansible Tower:
If the Ansible Tower job failed due to a Python import error, you may need to use an Ansible Tower Ansible virtual environment that uses Python 3. Another alternative is to use an Ansible Tower container execution environment. If you do not have access to configure the Ansible Tower instance that you are using, you can set the extra variable python_path_override: /var/lib/awx/venv/awx/lib64/python3.6 on the job template, but it is not a recommended solution.
Note: Be sure to use the correct Python version for your environment.
The ServiceNow incident is now created. The Description field includes the text from the extra variables (target_clusters) supplied by RHACM to Ansible Tower.
At this point, imagine that the incident is assigned to one or more maintainers of the web application to renew the SSL/TLS certificate, and update the RHOCP secret. Note that after the Ansible Tower job template is initiated, the automation mode associated with the configured policy is set to disabled. Therefore, after the RHOCP secret is updated with the renewed SSL/TLS certificate, the automation mode must be reset to once by an RHACM administrator or using GitOps for the next time that the certificate nears expiration.
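One way to reset the mode from the CLI is a merge patch on the PolicyAutomation object. The sketch below only prints the command by default (DRY_RUN=1), because the object name used here is an assumption inferred from the AnsibleJob name shown earlier. Confirm the real name with oc -n acm-grc-ansible-example get policyautomation before running it for real.

```shell
set -eu

namespace="acm-grc-ansible-example"
# Assumed object name; verify it in your cluster before patching.
automation_name="ansible-example-certificatepolicy-policy-automation"
# Merge patch that sets the automation mode back to "once".
patch='{"spec":{"mode":"once"}}'

# DRY_RUN defaults to 1 so the sketch is safe to run without a cluster login.
if [ "${DRY_RUN:-1}" = "1" ]; then
  cmd_preview="oc -n $namespace patch policyautomation $automation_name --type merge -p '$patch'"
  echo "$cmd_preview"
else
  oc -n "$namespace" patch policyautomation "$automation_name" --type merge -p "$patch"
fi
```

Setting DRY_RUN=0 applies the patch. In a GitOps workflow, the equivalent change would be committing mode: once in the PolicyAutomation manifest instead.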
To clean up the demo web application, policy, and Ansible Tower credential in RHOCP, run the following command to delete the RHOCP namespace that was created as part of this blog:
oc delete ns acm-grc-ansible-example
Other Use Cases
Although this demo showcased Ansible automation being initiated from a CertificatePolicy, this can be done with any policy type. For example, you can create a ServiceNow incident for an IamPolicy violation when the number of cluster administrators exceeds the expected amount. The steps are similar to what is outlined in this blog, except that they use a different policy and require a different short_description value in the playbook. View the following example of what the difference may look like:
diff --git a/ansible/playbooks/create_ticket.yml b/ansible/playbooks/create_ticket.yml
index 9f4bd2f..4afeb1a 100644
@@ -54,8 +54,7 @@
short_description: "ACM violation"
- " violation: one or more certificates are expiring soon in a secret in
- the namespace on the clusters:
+ "The number of cluster admins exceeds the expected amount on the clusters:
Additionally, you can choose to perform any automation that Ansible is capable of instead of creating a ServiceNow incident. This flexibility enables many automation scenarios to fit your needs.