The Vertical Pod Autoscaler (VPA) currently recommends CPU and memory requests through a single default recommender, which bases future requests on statistics of the historical usage observed in a rolling time window. Because no single recommendation policy suits every workload and every customer, customers need a way to define their own recommendation policies.
For example, the default VPA recommendation policy fails to capture usage changes in periodic and trending behaviors, as shown in Figure 1 and Figure 2. Both behaviors are common in monitoring and caching workloads.
Figure 1 Default VPA recommendation policies on periodic CPU usage.
Figure 2 Default VPA recommendation policies on trending CPU usage.
The VPA therefore needs to support customized recommenders that customers develop for different types of workloads.
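To make the limitation concrete, the following minimal Python sketch imitates a recommender that takes a high percentile of the usage seen in a rolling window. It is an illustration only: the window length, the 90th percentile, the function names, and the synthetic periodic trace are all assumptions, and the real default recommender is more elaborate than this plain percentile.

```python
import math


def percentile(samples, q):
    """Return the q-th percentile (0 < q <= 100) using the nearest-rank method."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def rolling_recommendations(usage, window, q=90):
    """At each step, recommend the q-th percentile of the last `window` samples."""
    recs = []
    for end in range(window, len(usage) + 1):
        recs.append(percentile(usage[end - window:end], q))
    return recs


# Synthetic periodic CPU usage in cores: 3 busy samples, then 7 idle samples.
period = [1.0] * 3 + [0.1] * 7
usage = period * 6

# The window spans two full periods, so every window holds the same mix of samples.
recs = rolling_recommendations(usage, window=20)
print(min(recs), max(recs))
```

In this toy trace the recommendation never moves: it stays at the peak value even though the workload is idle for most of each period. A recommender that models the cycle could instead lower requests during the idle phase.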
OpenShift is Red Hat’s enterprise Kubernetes distribution, so it ships the same VPA as upstream, with the same default recommendation policy. At Red Hat, we have heard from customers and partners that they need to bring their own VPA recommendation policies, ones that work best for their workloads.
We recently contributed support to the upstream VPA for configuring an alternative recommender per workload. As shown in Figure 3, the new feature lets customers specify a customized recommender for a particular VPA object instead of using the default one. Users and developers can thus assign different recommenders to different VPA objects, which govern workloads with distinct resource usage behaviors.
Figure 3 Alternative Recommender Support in VPA
Specifying an alternative recommender is straightforward: users only need to give the name of the customized recommender. For example, when defining a VPA object, a customer can set the recommender name in the name field of an entry under spec.recommenders, shown below as ${customized_recommender_name}.
apiVersion: "autoscaling.k8s.io/v1"
kind: VerticalPodAutoscaler
metadata:
  name: hamster-vpa
spec:
  recommenders:
    - name: ${customized_recommender_name}
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: ${deployment_name}
  resourcePolicy:
    containerPolicies:
      - containerName: '*'
        minAllowed:
          cpu: 100m
          memory: 50Mi
        maxAllowed:
          cpu: 1
          memory: 500Mi
        controlledResources: ["cpu", "memory"]
This customized VPA recommender support is also available in OpenShift 4.11.
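The ${customized_recommender_name} and ${deployment_name} placeholders in the manifest have to be filled in before the object is created. One possible way to do that, sketched here in Python with the standard library's string.Template, is shown below. The manifest is abridged, and the render_vpa helper and the "hamster" deployment name are hypothetical values chosen for illustration.

```python
from string import Template

# Abridged VPA manifest with the same ${...} placeholders as the example above.
VPA_TEMPLATE = Template("""\
apiVersion: "autoscaling.k8s.io/v1"
kind: VerticalPodAutoscaler
metadata:
  name: hamster-vpa
spec:
  recommenders:
    - name: ${customized_recommender_name}
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: ${deployment_name}
""")


def render_vpa(recommender_name, deployment_name):
    """Fill in the placeholders; substitute() raises KeyError if one is missing."""
    return VPA_TEMPLATE.substitute(
        customized_recommender_name=recommender_name,
        deployment_name=deployment_name,
    )


# "hamster" is a hypothetical deployment name used only for this sketch.
manifest = render_vpa("pando", "hamster")
print(manifest)
```

The rendered text can then be saved to a file and created on the cluster like any other manifest.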
Example
The following walkthrough shows how to install an example predictive VPA recommender on a cluster that runs the default VPA operator.
STEP 1: Install the VPA operator via OpenShift Operator Hub.
Click install to install the default VPA operator.
Please choose the default configurations to install the VPA operator.
STEP 2: Deploy a customized VPA recommender.
In this step, we build a recommender from the predictive-vpa-recommenders repository and deploy it as a customized recommender that runs alongside the default VPA controllers.
- Build a predictive VPA recommender named “pando”.
- Update the necessary configurations in pando-recommender-deployment.yaml.
- Deploy the updated pando-recommender-deployment.yaml.
> docker login quay.io/${user_id}
> git clone https://github.com/openshift/predictive-vpa-recommenders.git
> cd predictive-vpa-recommenders
> docker build -t quay.io/${user_id}/predictive-vpa-recommender:latest .
> docker push quay.io/${user_id}/predictive-vpa-recommender:latest
- First, replace ${user_id} with your container image repository user ID.
- Then, follow the tutorial on Enabling monitoring for user-defined projects so that the pando recommender can fetch data from Prometheus.
- Set ${PROM_HOST} and ${PROM_TOKEN} using the commands below.
- You can also change RECOMMENDER_NAME in the recommender-config configmap. Here we choose RECOMMENDER_NAME: "pando", which is used to select the recommender in Step 3.
> export SECRET=`oc get secret -n openshift-user-workload-monitoring | grep prometheus-user-workload-token | head -n 1 | awk '{print $1 }'`
> export PROM_TOKEN=`echo $(oc get secret $SECRET -n openshift-user-workload-monitoring -o json | jq -r '.data.token') | base64 -d`
> export PROM_HOST=`oc get route thanos-querier -n openshift-monitoring -o json | jq -r '.spec.host'`
> oc create -f manifests/openshift/pando-recommender-deployment.yaml
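The host and token set above are what a recommender needs to read usage metrics. As a rough sketch of how such a request could be put together, the Python snippet below builds (but does not send) an instant query for the Prometheus HTTP API with a bearer token. The build_query_request helper, the host name, and the token are placeholders invented for this example; the query string follows the style that the pando recommender prints in its logs in Step 3, and the recommender's actual client code may differ.

```python
import urllib.parse
import urllib.request


def build_query_request(prom_host, prom_token, promql):
    """Build an instant-query request for /api/v1/query without sending it."""
    params = urllib.parse.urlencode({"query": promql})
    url = f"https://{prom_host}/api/v1/query?{params}"
    headers = {"Authorization": f"Bearer {prom_token}"}
    return urllib.request.Request(url, headers=headers)


# Placeholder values; in practice they would come from PROM_HOST and PROM_TOKEN.
cpu_query = (
    "rate(container_cpu_usage_seconds_total"
    "{namespace='default',container='test-periodic'}[1m])"
)
request = build_query_request("thanos-querier.example.com", "example-token", cpu_query)
print(request.full_url)
```

Passing the request to urllib.request.urlopen would send it; that step is left out here because it needs a reachable cluster and a valid token.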
Then, we can check the running pods in the openshift-vertical-pod-autoscaler namespace to see whether the pando recommender is running.
> oc get pods -n openshift-vertical-pod-autoscaler
NAME                                                READY   STATUS    RESTARTS   AGE
pando-7c8fbd6d47-75x4k                              1/1     Running   0          21s
vertical-pod-autoscaler-operator-547c78cd5b-k8p5h   1/1     Running   0          24m
vpa-admission-plugin-default-8448d994c9-8qd62       1/1     Running   0          24m
vpa-recommender-default-85c6c4d57d-tlch8            1/1     Running   0          24m
vpa-updater-default-7c85ddffb8-slvlh                1/1     Running   0          24m
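If this check needs to be scripted, for example in a test pipeline, the text output can be parsed directly. The sketch below is one simple way to do so in Python; the pod_statuses helper is hypothetical, and the sample text is a shortened copy of the listing above rather than live output.

```python
def pod_statuses(oc_output):
    """Map each pod name to its (READY, STATUS) columns from `oc get pods` text."""
    lines = oc_output.strip().splitlines()
    statuses = {}
    for line in lines[1:]:  # skip the header row
        name, ready, status = line.split()[:3]
        statuses[name] = (ready, status)
    return statuses


# Shortened copy of the listing above, used as sample input.
sample = """\
NAME READY STATUS RESTARTS AGE
pando-7c8fbd6d47-75x4k 1/1 Running 0 21s
vpa-recommender-default-85c6c4d57d-tlch8 1/1 Running 0 24m
"""

statuses = pod_statuses(sample)
pando_ok = any(
    name.startswith("pando-") and state == ("1/1", "Running")
    for name, state in statuses.items()
)
print(pando_ok)
```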
STEP 3: Deploy the workload and the VPA object that manages it.
In this step, we first create a testing workload.
> cat testing-periodic-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-periodic
spec:
  selector:
    matchLabels:
      app: test-periodic
  replicas: 2
  template:
    metadata:
      labels:
        app: test-periodic
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534 # nobody
      containers:
        - name: test-periodic
          image: quay.io/chenw615/periodic-load:latest
          imagePullPolicy: Always
          resources:
            requests:
              cpu: 100m
              memory: 50Mi
          command: ["/bin/sh"]
          args:
            - "/periodic.sh"
            - "1200"
            - "60"
> oc create -f testing-periodic-deployment.yaml
Then, we define and create a VPA object that controls this workload using the pando recommender. The recommender name is specified under spec.recommenders.
> cat testing-periodic-vpa.yaml
apiVersion: "autoscaling.k8s.io/v1"
kind: VerticalPodAutoscaler
metadata:
  name: test-periodic-vpa
spec:
  recommenders:
    - name: pando
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: test-periodic
  resourcePolicy:
    containerPolicies:
      - containerName: '*'
        minAllowed:
          cpu: 100m
          memory: 50Mi
        maxAllowed:
          cpu: 2
          memory: 1Gi
        controlledResources: ["cpu", "memory"]
> oc create -f testing-periodic-vpa.yaml
Then, we can see from the pando recommender’s logs that test-periodic-vpa is selected and the 'test-periodic' deployment is analyzed.
> oc logs pando-7c8fbd6d47-75x4k -n openshift-vertical-pod-autoscaler
{'recommenders': [{'name': 'pando'}], 'resourcePolicy': {'containerPolicies': [{'containerName': '*', 'controlledResources': ['cpu', 'memory'], 'maxAllowed': {'cpu': 2, 'memory': '1Gi'}, 'minAllowed': {'cpu': '100m', 'memory': '50Mi'}}]}, 'targetRef': {'apiVersion': 'apps/v1', 'kind': 'Deployment', 'name': 'test-periodic'}, 'updatePolicy': {'updateMode': 'Auto'}}
{'apiVersion': 'apps/v1', 'kind': 'Deployment', 'name': 'test-periodic'}
rate(container_cpu_usage_seconds_total{namespace='default',container='test-periodic'}[1m])
container_memory_usage_bytes{namespace='default',container='test-periodic'}
Forecast cpu resource for Container test-periodic at 03:40:00
Trace Behavior Label: 12
Trace Forecaster Selected: theta
Forecasts: [0.1467253 0.14668293 0.93619717 0.93589763 0.99966014 0.99931881
0.0789116 0.07888437 0.00251754 0.00251665 0.00256805 0.00256697
0.00331901 0.00331838 0.00250759 0.00250705 0.00275479 0.00275385
0.00261838 0.00261735]
Provision: 0.9993358793009247
Forecast memory resource for Container test-periodic at 03:40:00
Trace Behavior Label: 7
Trace Forecaster Selected: naive
Forecasts: [2478080. 2478080. 2453504. 2453504. 2707456. 2707456. 2936832. 2936832.
2445312. 2445312. 2437120. 2437120. 2650112. 2650112. 2457600. 2457600.
2547712. 2547712. 2457600. 2457600.]
Provision: 2936832.0
Successfully patched VPA object with the recommendation: [{'containerName': 'test-periodic', 'lowerBound': {'cpu': '942m', 'memory': '50Mi'}, 'target': {'cpu': '999m', 'memory': '50Mi'}, 'uncappedTarget': {'cpu': '999m', 'memory': '2Mi'}, 'upperBound': {'cpu': '999m', 'memory': '50Mi'}}]
…..
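In the patched recommendation above, the uncapped memory target is 2Mi while the final target is 50Mi, which is consistent with the value being raised to the minAllowed bound of the VPA object. The Python sketch below shows one way such capping could be computed. The helper names are hypothetical, only the quantity suffixes that appear in this example are handled, and this is not the recommender's actual code.

```python
MEMORY_UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def parse_cpu(quantity):
    """Convert a CPU quantity such as '999m' or 2 to millicores."""
    text = str(quantity)
    if text.endswith("m"):
        return int(text[:-1])
    return int(float(text) * 1000)


def parse_memory(quantity):
    """Convert a memory quantity such as '50Mi' or '1Gi' to bytes."""
    text = str(quantity)
    for suffix, factor in MEMORY_UNITS.items():
        if text.endswith(suffix):
            return int(text[:-2]) * factor
    return int(text)


def clamp(value, low, high):
    """Keep value inside the closed interval [low, high]."""
    return max(low, min(value, high))


# Bounds and uncapped target copied from the VPA object and log output above.
min_allowed = {"cpu": "100m", "memory": "50Mi"}
max_allowed = {"cpu": 2, "memory": "1Gi"}
uncapped = {"cpu": "999m", "memory": "2Mi"}

target_cpu = clamp(
    parse_cpu(uncapped["cpu"]),
    parse_cpu(min_allowed["cpu"]),
    parse_cpu(max_allowed["cpu"]),
)
target_mem = clamp(
    parse_memory(uncapped["memory"]),
    parse_memory(min_allowed["memory"]),
    parse_memory(max_allowed["memory"]),
)
print(f"{target_cpu}m", f"{target_mem // 1024**2}Mi")  # prints "999m 50Mi"
```

The CPU value already lies between the bounds and is left unchanged, while the memory value is lifted to the 50Mi minimum, matching the target in the log.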
At the same time, the logs of the default recommender show that 0 VPA objects are fetched and selected.
> oc logs -f vpa-recommender-default-85c6c4d57d-tlch8 -n openshift-vertical-pod-autoscaler
….
I0729 17:45:38.089839 1 recommender.go:188] Recommender Run
I0729 17:45:38.089923 1 cluster_feeder.go:349] Start selecting the vpaCRDs.
I0729 17:45:38.089953 1 cluster_feeder.go:374] Fetched 0 VPAs.
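The two sets of logs reflect a simple selection rule: each recommender only picks up the VPA objects that name it. The Python sketch below models that rule under two assumptions that are not stated in this post: the default recommender is registered under the name "default", and a VPA object without spec.recommenders falls to the default recommender. The "legacy-vpa" object and the function names are hypothetical.

```python
DEFAULT_RECOMMENDER = "default"  # assumed name of the default recommender


def is_selected(vpa_spec, recommender_name):
    """Return True if the recommender should handle a VPA object with this spec."""
    names = [entry["name"] for entry in vpa_spec.get("recommenders", [])]
    if not names:
        # No recommender listed: assume the default recommender handles it.
        return recommender_name == DEFAULT_RECOMMENDER
    return recommender_name in names


def handled_by(recommender_name):
    """List the names of the VPA objects a recommender would select."""
    return [name for name, spec in vpas.items() if is_selected(spec, recommender_name)]


# test-periodic-vpa comes from Step 3; legacy-vpa is a hypothetical object
# without spec.recommenders, added only to show the fallback case.
vpas = {
    "test-periodic-vpa": {"recommenders": [{"name": "pando"}]},
    "legacy-vpa": {},
}

print(handled_by("pando"))
print(handled_by("default"))
```

With only test-periodic-vpa on the cluster, the default recommender would select nothing under this rule, which matches the "Fetched 0 VPAs" line above.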