Insights recommendations as Prometheus alerts
Insights Advisor for OpenShift was introduced almost two years ago, and its user interface was tightly integrated with Red Hat Hybrid Cloud Console (formerly OpenShift Cluster Manager). Insights uses predictive analytics and deep domain expertise to reduce complex operational tasks from hours to minutes, including identifying security and performance risks. The analysis combines signals from all OpenShift components, known alerts, and additional component configurations to produce cluster-wide Insights recommendations: actionable, tailored steps that help you proactively avoid potential issues on your clusters or resolve existing ones.
Previously, Insights recommendations were visible only on the OpenShift WebConsole Dashboard and within Red Hat Hybrid Cloud Console. In the latest version of OpenShift Container Platform (4.12.0), Insights recommendations are also available as Prometheus alerts. These are info-level alerts that are part of the in-cluster monitoring stack, are visible in the OCP WebConsole, and can be processed like any other alert. See the “InsightsRecommendationActive” alert in the status section below.
With this feature, you no longer need to check the OCP WebConsole periodically to see the Insights recommendations status of your cluster. You can configure the OpenShift Alertmanager to notify you of active Insights recommendations through any of the supported integrations.
How does it work? First, the cluster must have remote health monitoring enabled (this feature is turned on by default for every new cluster). The Insights operator sends cluster metadata and obtains the corresponding Insights analysis. Each time the operator obtains the latest analysis, it reads the active Insights recommendations and registers a new Prometheus metric for each one, with basic information about the recommendation carried in the metric labels. The alert definition is based on those metrics and exposes that basic information: the description, the total risk of the recommendation, and, most importantly, the link to Insights Advisor, where you can find the complete description of the recommendation, including remediation steps. (How to navigate in Insights Advisor?)
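The flow described above (one metric per active recommendation, one alert per metric) can be modeled with a short Python sketch. This is an illustration only, not the Insights operator's code: the metric name `insights_recommendation_active`, the label names `description`, `total_risk`, and `info_link`, the message wording, and the example URLs are all assumptions chosen for the example.

```python
from dataclasses import dataclass

# Assumed metric name, used here for illustration only.
METRIC_NAME = "insights_recommendation_active"


@dataclass(frozen=True)
class Recommendation:
    """Basic information about one active recommendation (hypothetical shape)."""
    description: str
    total_risk: str
    info_link: str


def to_metric_samples(recommendations):
    """Return one (metric name, labels, value) sample per active recommendation."""
    return [
        (
            METRIC_NAME,
            {
                "description": rec.description,
                "total_risk": rec.total_risk,
                "info_link": rec.info_link,
            },
            1.0,
        )
        for rec in recommendations
    ]


def render_alert_messages(samples):
    """Mimic an alert rule that fires once for every sample whose value is 1."""
    return [
        f'Insights recommendation "{labels["description"]}" with total risk '
        f'"{labels["total_risk"]}" is active. More information: {labels["info_link"]}'
        for _name, labels, value in samples
        if value == 1.0
    ]


if __name__ == "__main__":
    # Hypothetical recommendations; the links are placeholders, not real Advisor URLs.
    active = [
        Recommendation("Example recommendation A", "Important", "https://example.invalid/a"),
        Recommendation("Example recommendation B", "Low", "https://example.invalid/b"),
    ]
    for message in render_alert_messages(to_metric_samples(active)):
        print(message)
```

Because each recommendation maps to its own labeled sample, a recommendation that is no longer active simply stops producing a sample, and the corresponding alert has nothing left to fire on.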
Keep in mind that a separate alert fires for each active Insights recommendation. This makes it possible to build additional integrations through Alertmanager, as mentioned above.
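As one example of such an integration, a webhook receiver can pick the Insights alerts out of an Alertmanager notification. The sketch below assumes the standard Alertmanager webhook payload shape (an `alerts` list whose entries carry `status`, `labels`, and `annotations`). The sample payload, the `total_risk` label, and the contents of the `description` annotation are hypothetical; only the alert name `InsightsRecommendationActive` comes from this article.

```python
ALERT_NAME = "InsightsRecommendationActive"

# Hypothetical payload in the shape of an Alertmanager webhook notification.
SAMPLE_PAYLOAD = {
    "version": "4",
    "status": "firing",
    "alerts": [
        {
            "status": "firing",
            "labels": {
                "alertname": ALERT_NAME,
                "severity": "info",
                "total_risk": "Important",  # assumed label name
            },
            "annotations": {"description": "Example recommendation A"},
        },
        {
            "status": "firing",
            "labels": {"alertname": "SomeOtherAlert", "severity": "warning"},
            "annotations": {"description": "Unrelated alert"},
        },
        {
            "status": "resolved",
            "labels": {"alertname": ALERT_NAME, "severity": "info"},
            "annotations": {"description": "Example recommendation B"},
        },
    ],
}


def active_insights_alerts(payload):
    """Return the firing Insights alerts found in a webhook payload."""
    return [
        alert
        for alert in payload.get("alerts", [])
        if alert.get("status") == "firing"
        and alert.get("labels", {}).get("alertname") == ALERT_NAME
    ]


def summarize(alert):
    """Build a one-line summary from the alert's labels and annotations."""
    risk = alert.get("labels", {}).get("total_risk", "unknown")
    description = alert.get("annotations", {}).get("description", "")
    return f"[{risk}] {description}"


if __name__ == "__main__":
    for alert in active_insights_alerts(SAMPLE_PAYLOAD):
        print(summarize(alert))
```

Since every recommendation arrives as its own alert, a receiver like this can route or deduplicate recommendations individually instead of parsing one combined message.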
You fixed the issue reported by the Insights service, but the alert is still firing. Now what? The alert clears only after the Insights service receives new metadata showing that the issue is fixed, and that upload happens every 2 hours by default. The Connected Customer Experience team is actively working on a new feature that will allow you to refresh the data almost immediately. Expect more on that in a future OCP release.
Insights for OpenShift offers many other features, some of which work behind the scenes to improve your experience with the product. You can continue reading more about how we improve OpenShift with Insights analytics. With this new feature, we are offering you an easy way to stay on top of Insights recommendations: you can monitor them without needing to periodically check the OpenShift WebConsole or Hybrid Cloud Console. This way, you’ll stay ahead of potential cluster issues that might have a significant impact on your applications and customers.