A Kubernetes DaemonSet ensures that an instance of a specific pod is running on all (or a selection of) nodes in a cluster. It creates pods on each node, and garbage collects pods when nodes are removed from the cluster.
The simplest use case is deploying a daemon on every node. However, you might want to split that up into multiple DaemonSets. For example, in a cluster whose nodes have varying hardware, each group of nodes might need different memory and/or CPU requests for the daemon.
As our approach fit with this use case, we decided to create a DaemonSet that would deploy pods running netperf’s netserver server-side binary in the background. We thought this might be useful for analyzing networking performance within the OpenShift Container Platform (OCP) cluster.
This post shows how we constructed a netperf DaemonSet from scratch.
Dockerfile
First of all, we need to create a custom docker image that will run the netserver binary.
FROM fedora:27
MAINTAINER josgonza@redhat.com
RUN \
  dnf clean all && \
  dnf install http://people.redhat.com/mcroce/packages/netperf-2.7.1-3.x86_64.rpm -y
USER 1001
ENTRYPOINT ["/usr/bin/netserver", "-D"]
EXPOSE 12865
NOTE: this container doesn’t need privileged rights, so you won’t have to grant them (see Enable Container Images that Require Root).
This Dockerfile is just for testing purposes and to keep this example as simple as possible, but we strongly recommend following best practices when you create your containers:
- Container Image Guidelines
- 10 things to avoid in docker containers
To avoid the complexity of generating the binaries from scratch, we used the RPM netperf-2.7.1-3.x86_64.rpm, courtesy of Matteo Croce (the RPM formerly came from the Fedora COPR repository teknoraver/netperf).
Once you have the Dockerfile, you only need to build the image, for example:
docker build -t netperf-fedora .
For testing purposes, you could run it and connect to the container:
docker run -d --name netperf netperf-fedora
docker exec -ti netperf /bin/bash
Finally, tag and push the image to your image registry.
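As a rough sketch of that last step, assuming a registry hostname such as registry.example.com (a placeholder, not part of the original setup), the helpers below build the full image reference and then tag and push it. The function names image_ref and push_image are made up for this illustration.

```shell
# Build "<registry>/<name>:<tag>"; the tag defaults to "latest".
image_ref() {
  local registry="${1%/}" name="$2" tag="${3:-latest}"   # drop a trailing slash on the registry
  printf '%s/%s:%s\n' "$registry" "$name" "$tag"
}

# Tag the locally built image and push it to the registry.
# Arguments: <registry> <local image name> [tag]
push_image() {
  local ref
  ref="$(image_ref "$1" "$2" "$3")"
  docker tag "$2" "$ref" && docker push "$ref"
}

# Example (registry.example.com is a placeholder):
#   push_image registry.example.com netperf-fedora latest
```

The resulting reference is what goes into the image field of the DaemonSet manifest.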
DaemonSet Manifest
Create a DaemonSet manifest with the following contents:
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: netperf
  namespace: <-your_project->
spec:
  selector:
    matchLabels:
      name: netperf
  template:
    metadata:
      labels:
        name: netperf
        app-name: netperf
    spec:
      nodeSelector:
        type: NODE
        stage: NON_PRODUCTION
      containers:
      - image: <-your_registry->/netperf-fedora:latest
        imagePullPolicy: Always
        name: netperf
        ports:
        - containerPort: 12865
          protocol: TCP
        resources:
          limits:
            memory: 256Mi
          requests:
            memory: 256Mi
        terminationMessagePath: /dev/termination-log
      terminationGracePeriodSeconds: 10
Note the nodeSelector labels (under .spec.template.spec in the manifest). We decided to use non-production computing nodes (not masters or infra nodes) to avoid any impact on production workloads, while still being deployed inside the OCP cluster. Check the DaemonSet docs for details about DaemonSet manifests.
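The nodeSelector only matches nodes that already carry both labels. Here is a minimal sketch of how they could be applied, assuming hypothetical node names and a made-up helper called label_commands, which prints the oc label commands for review rather than running them:

```shell
# Labels taken from the manifest's nodeSelector.
SELECTOR_LABELS="type=NODE stage=NON_PRODUCTION"

# Read node names on stdin (one per line) and print one "oc label" command per node.
label_commands() {
  local node
  while read -r node; do
    if [ -n "$node" ]; then
      printf 'oc label node %s %s --overwrite\n' "$node" "$SELECTOR_LABELS"
    fi
  done
}

# Example (node names are placeholders); pipe to "sh" once the output looks right:
#   printf '%s\n' node1.example.com node2.example.com | label_commands
# Check the result with:
#   oc get nodes -l type=NODE,stage=NON_PRODUCTION
```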
Deploy the DaemonSet in OCP
Once you have created the YAML for the DaemonSet manifest, log in with rights/permissions to modify the selected project (.metadata.namespace in the manifest). Then you can:
- Create/deploy the DaemonSet
oc create -f netperf-daemonset.yml
- Monitor it
oc get daemonset
oc get event --sort-by='.lastTimestamp'
- Delete/Undeploy it
oc delete daemonset netperf --cascade
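To go one step beyond eyeballing oc get daemonset, the check below is a small sketch that compares the desired and ready pod counts from the DaemonSet status. The helper name daemonset_ready is an assumption of this example. It expects the two numbers on stdin so that it can be tried without a cluster.

```shell
# Read "<desired> <ready>" on stdin; succeed only when every desired pod is ready.
daemonset_ready() {
  local desired ready
  read -r desired ready || true   # jsonpath output has no trailing newline
  case "$desired$ready" in
    ''|*[!0-9]*) return 2 ;;      # missing or non-numeric input
  esac
  [ -n "$ready" ] && [ "$desired" -gt 0 ] && [ "$desired" -eq "$ready" ]
}

# Example against a live cluster:
#   oc get daemonset netperf \
#     -o jsonpath='{.status.desiredNumberScheduled} {.status.numberReady}' \
#     | daemonset_ready && echo "all netserver pods are ready"
```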
Automation of the netperf tests
Now that you’ve deployed a netperf DaemonSet and its pods are running the netserver daemon, you can execute your netperf client tests from any point of the infrastructure within your OCP cluster.
This bash snippet loops through a list of nodes to collect statistics from the netserver daemon pod on each of them:
TSEC=30
ITERATIONS=5
for HOST in $(oc get nodes -o jsonpath='{range .items[?(.metadata.labels.stage=="NON_PRODUCTION")]}{.metadata.name}{"\n"}{end}');
do
  ...
  for iteration in $(seq ${ITERATIONS})
  do
    yes | ssh $HOST "./netperf -t TCP_STREAM -cC -l ${TSEC} -H ${POD_IP} " | tee -a logs/${iteration}_TCP_STREAM.log
    yes | ssh $HOST "./netperf -t TCP_MAERTS -cC -l ${TSEC} -H ${POD_IP} " | tee -a logs/${iteration}_TCP_MAERTS.log
    yes | ssh $HOST "./netperf -t TCP_RR -cC -l ${TSEC} -H ${POD_IP} " | tee -a logs/${iteration}_TCP_RR.log
    yes | ssh $HOST "./netperf -t TCP_CRR -cC -l ${TSEC} -H ${POD_IP} " | tee -a logs/${iteration}_TCP_CRR.log
  done
  ...
done
...
NOTE: about the outer loop, it’s recommended to filter the OCP nodes (at least to discard the nodes where the DaemonSet has not been deployed). As the jsonpath option has limited filtering functionality, you can use awk instead if you want a subset of nodes.
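The awk alternative mentioned in the note could look like the sketch below. It assumes the output of oc get nodes --show-labels, with the node name in the first column, the status in the second, and the comma-separated labels in the last one. The helper name filter_nodes is made up for this example.

```shell
# Print the names of Ready nodes that carry the label "$1" (given as key=value).
# Expects "oc get nodes --show-labels" output on stdin.
filter_nodes() {
  awk -v want="$1" '
    NR > 1 && $2 == "Ready" {            # skip the header and nodes that are not plainly Ready
      n = split($NF, labels, ",")        # labels are in the last column
      for (i = 1; i <= n; i++)
        if (labels[i] == want) { print $1; break }
    }'
}

# Example against a live cluster:
#   for HOST in $(oc get nodes --show-labels | filter_nodes stage=NON_PRODUCTION); do ...; done
```

Matching on the exact status "Ready" also drops nodes reported as "Ready,SchedulingDisabled", which is usually what you want for the outer loop.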
Variables
- TSEC (30): This option controls the length of any one iteration of the requested test.
- ITERATIONS (5): Number of iterations.
- HOST: IP/FQDN of the host from which you want to execute the tests (the netperf binary must exist there, or the script has to copy it beforehand, for example with an scp command).
- POD_IP: destination IP of the pod running the netserver binary listening for client requests.
See the Netperf documentation for more netperf options and features.
Here’s a quick way to parse the results files:
for i in *_TCP_MAERTS.log; do echo "$i"; awk '/Throughput/,/^[0-9]/{print $5}' "$i" | grep -Ev "[a-zA-Z]" | sed '/^$/d'; done
for i in *_TCP_STREAM.log; do echo "$i"; awk '/Throughput/,/^[0-9]/{print $5}' "$i" | grep -Ev "[a-zA-Z]" | sed '/^$/d'; done
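If you want a single figure per test type instead of a list of values, a small averaging step can be appended. This is only a sketch: the helper name mean is made up, and it assumes one throughput value per line on stdin, which is what the pipelines above print once the echo of the file name is left out.

```shell
# Average the numeric lines read from stdin; print "n/a" when there are none.
mean() {
  awk '$1 ~ /^[0-9]+([.][0-9]+)?$/ { sum += $1; n++ }
       END { if (n) printf "%.2f\n", sum / n; else print "n/a" }'
}

# Example:
#   awk '/Throughput/,/^[0-9]/{print $5}' 1_TCP_STREAM.log | mean
```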
Recommended Usage
I recommend having a bastion host with access to the entire OCP infrastructure, and using Ansible to automate the tests.
If you want a random selection of pods for each test rather than a static list, I suggest one of two approaches:
- Using OpenShift’s oc command line client and some classic UNIX CLI filter programs:
POD=$(oc get po -o wide | grep netperf | awk '{print $6}' | shuf -n1)
- Using endpoints, so you need to create the netperf service:
apiVersion: v1
kind: Service
metadata:
  labels:
    app-name: netperf
  name: netperf
  namespace: <-your_project->
spec:
  ports:
  - port: 12865
    protocol: TCP
    targetPort: 12865
  selector:
    app-name: netperf
  sessionAffinity: ClientIP
  type: ClusterIP
And then:
POD=$(oc export -n <-your_project-> ep/netperf | grep ip | awk '{print $3}' | shuf -n1)
NOTE: tested with oc v3.6.0
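An alternative that does not depend on grepping YAML is to ask for the endpoint IPs directly with jsonpath and pick one at random. This is a sketch I have not run against a cluster. The helper name pick_random_ip is an assumption of this example, and it expects whitespace-separated IPs on stdin.

```shell
# Pick one address at random from a whitespace-separated list on stdin.
pick_random_ip() {
  tr -s ' \t' '\n' | sed '/^$/d' | shuf -n1
}

# Example against a live cluster:
#   POD=$(oc get endpoints netperf -n <-your_project-> \
#     -o jsonpath='{.subsets[*].addresses[*].ip}' | pick_random_ip)
```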
I recommend the second approach, creating a SVC, because:
- You can use the endpoints to choose OCP nodes. This is quite helpful when you want to test from the same node where the netperf POD IP is deployed:
NODE=$(oc export -n <-your_project-> ep/netperf | grep -A1 "${POD_IP}" | grep 'nodeName:' | awk '{print $2}')
- I tried to launch the test through the SVC but could not make it work (probably because of TCP headers, NAT, or a combination of both):
./netperf -t TCP_STREAM -cC -l 30 -H ${ClusterIP} #Failed with timeout
Any thoughts about how to fix this would be very appreciated.
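For the first point above (finding the node that hosts a given pod IP), the lookup can also be sketched with jsonpath instead of grep -A1, which depends on the order of the fields in the exported YAML. The helper name node_for_pod_ip is made up, it expects "<pod-ip> <node-name>" lines on stdin, and the oc invocation in the comment is untested.

```shell
# Print the node hosting pod IP "$1"; exit non-zero when the IP is not listed.
# Expects lines of "<pod-ip> <node-name>" on stdin.
node_for_pod_ip() {
  awk -v ip="$1" '$1 == ip { print $2; found = 1; exit }
                  END { exit !found }'
}

# Example against a live cluster:
#   NODE=$(oc get endpoints netperf -n <-your_project-> \
#     -o jsonpath='{range .subsets[*].addresses[*]}{.ip}{" "}{.nodeName}{"\n"}{end}' \
#     | node_for_pod_ip "${POD_IP}")
```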
Other interesting tests would be:
- From one master (in a multi-master environment) to a netperf POD IP:
MASTER=$(oc get nodes -l type=MASTER --no-headers | awk '$2 == "Ready,SchedulingDisabled" {print $1}' | shuf -n1)
- From one node in the OCP cluster to a netperf POD IP.
- From any other host deployed in the OCP cluster (like bastion hosts, monitoring hosts, etc.).
Conclusion
With Kubernetes at its core, OpenShift is a powerful platform that lets you deploy complex or tedious systems/applications in an easy way.
You can use DaemonSets to create shared storage, to run a logging pod on every node in a cluster, or to deploy a monitoring agent on every node, such as Dynatrace.
DaemonSets on OpenShift are also great because they provide useful abstractions for:
- Monitoring and managing logs for daemons in the same way as applications.
- Configuring daemons with the same formats and tools as applications, e.g., Pod templates.
- Running daemons in containers with resource limits to increase isolation between daemons and app containers.