Introduction

Performing container builds in isolated environments is one step toward defending against threats to CI/CD pipelines, while at the same time giving developers flexibility.

This article will analyze and demonstrate different ways to use sandboxed containers with well-known container image build systems.

Why OpenShift sandboxed containers?

Complex CI/CD pipelines often include many jobs that perform builds. Sometimes one job's output affects jobs scheduled later in the pipeline, and some jobs may require software that needs privileged access.

With OpenShift sandboxed containers, you can safely install software that needs privileged access without affecting the container host or the other containers running on the same host.

"99% of our vulnerabilities are not in the code you write in your application. They're in a very deep tree of dependencies, some of which you may know about, some of which you may not know about." - Eric Brewer

Let's assume your build system is compromised; it happens. Attackers could then gain control over the host system and use it to continue their attack. With OpenShift sandboxed containers, an attacker who breaks out of the container ends up in a virtual machine instead of on the host system. Any damage is limited to that environment, because the additional layer prevents direct access to the host.

Let's look into some typical ways to run builds in OpenShift clusters using sandboxed containers.

Buildah builds

Let’s start with an example of using buildah.

cat >build.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: buildah
  namespace: sandboxed-builds
spec:
  runtimeClassName: kata
  containers:
  - name: buildah
    image: quay.io/buildah/stable:v1.30
    command: ["sleep", "infinity"]
    securityContext:
      privileged: true
EOF

or

cat >build.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: buildah
  namespace: sandboxed-builds
spec:
  runtimeClassName: kata
  containers:
  - name: buildah
    image: quay.io/buildah/stable:v1.30
    command: ["sleep", "infinity"]
    securityContext:
      capabilities:
        add:
        - "SYS_ADMIN"
        - "MKNOD"
        - "SYS_CHROOT"
        - "SETFCAP"
EOF

 

Notice that the only change to the Pod definition is the addition of 'runtimeClassName: kata'. RuntimeClasses are a Kubernetes feature that lets us switch the low-level container runtime from the default to the Kata runtime.
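
For reference, the RuntimeClass object behind this name is small. On OpenShift, the sandboxed containers Operator creates the kata RuntimeClass for you, so the following is only an illustrative sketch:

apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata
handler: kata  # the CRI runtime handler that launches Kata VMs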

As the next step, we create a namespace:

oc create ns sandboxed-builds

And then run the Pod that contains the buildah tool:

oc apply -f build.yaml

Now we enter the buildah container:

oc exec -it buildah -n sandboxed-builds -- bash
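
As a quick sanity check, you can compare the kernel version seen inside the Pod with that of a worker node; with the Kata runtime they typically differ, because the Pod runs inside its own virtual machine:

# inside the sandboxed container: the Kata guest VM kernel
uname -r

# from outside, on the cluster: a worker node's kernel, for comparison
oc get nodes -o jsonpath='{.items[0].status.nodeInfo.kernelVersion}'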

Create a build directory

mkdir /build && cd /build

Now, to run a build with an example Dockerfile, we can use buildah as usual.

Example Dockerfile:

cat >Dockerfile <<EOF
FROM quay.io/fedora/fedora:38
RUN date
EOF
buildah bud --storage-driver vfs -f Dockerfile .

Note the use of the "vfs" storage driver: it is slower than "overlay" but requires no additional setup inside the sandboxed VM. We will cover the faster "overlay" driver later in this article.
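
To verify that the build produced an image, you can list the local images using the same storage driver:

buildah images --storage-driver vfs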

This was the simplest use case; now let's look at another example using kaniko.

Kaniko builds

With kaniko, we can build container images from a Dockerfile inside a Kubernetes cluster.

Kaniko runs in an unprivileged container, but the container has to run as the 'root' user, because kaniko uses 'chroot' to build the image.

There are ways to harden the security of the kaniko build process by making it drop all capabilities and add back only the ones it needs.
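
As an illustrative sketch of that idea (the capability set below is an assumption; the exact list depends on what your Dockerfile does), the container's securityContext could look like this:

securityContext:
  runAsUser: 0        # kaniko has to run as root
  capabilities:
    drop:
    - "ALL"           # start from zero capabilities
    add:              # add back only what the build needs (illustrative set)
    - "CHOWN"
    - "DAC_OVERRIDE"
    - "FOWNER"
    - "SETGID"
    - "SETUID"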

On top of the already enhanced security of default kaniko builds, we can go one step further by running kaniko in a sandboxed container.

As in the previous example, to turn our example build using kaniko into a sandboxed build using OpenShift sandboxed containers, we add the `runtimeClassName: kata` field to the Pod's specification.

To run the example, you first create the Pod specification:

cat >kaniko.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: kaniko
  namespace: sandboxed-builds
spec:
  containers:
  - name: kaniko
    image: gcr.io/kaniko-project/executor:latest
    args:
    - "--context=git://github.com/bpradipt/perf-container.git"
    - "--dockerfile=Dockerfile"
    - "--destination=quay.io/bpradipt/kaniko-demo-image:1.0"
    volumeMounts:
    - name: kaniko-secret
      mountPath: /kaniko/.docker
  restartPolicy: Never
  runtimeClassName: kata
  volumes:
  - name: kaniko-secret
    secret:
      secretName: regcred
      items:
      - key: .dockerconfigjson
        path: config.json
EOF

As the build context, we provide a GitHub repository containing a Dockerfile.

To run the build, we first create a namespace:

oc create ns sandboxed-builds
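
The Pod also references a secret named regcred, which holds the registry credentials kaniko uses to push the image. If it doesn't exist yet, you can create it along these lines (replace the placeholder credentials with your own):

oc create secret docker-registry regcred \
  -n sandboxed-builds \
  --docker-server=quay.io \
  --docker-username=<username> \
  --docker-password=<password>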

And then create the Pod:

oc apply -f kaniko.yaml

Looking at the logs,

oc logs kaniko -n sandboxed-builds -f

We can see that it all looks ordinary. The kaniko container still runs as the root user, but now it is confined to a virtual machine.

Now let's look at a few advanced examples using the overlay storage driver with buildah. The overlay storage driver leverages fuse-overlayfs and provides better build-time performance.

Buildah builds using "overlay" storage driver

When using the "overlay" storage driver with sandboxed containers, you'll need to ensure that the directory used to store the container images is one of the following:

  1. A separate volume mount (e.g., memory backed, disk backed, and so on)
  2. A loop-mounted disk

Using a memory-backed directory

cat >build-emptydir.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: buildah-emptydir
  namespace: sandboxed-builds
spec:
  containers:
  - name: buildah
    image: quay.io/buildah/stable:v1.30
    command: ["sh", "-c"]
    args:
    - mknod /dev/fuse -m 0666 c 10 229 && sleep infinity
    securityContext:
      privileged: true
    volumeMounts:
    - mountPath: /var/lib/containers
      name: container-storage
  runtimeClassName: kata
  volumes:
  - name: container-storage
    emptyDir:
      medium: Memory
EOF
oc create ns sandboxed-builds
oc apply -f build-emptydir.yaml
oc exec -it buildah-emptydir -n sandboxed-builds -- bash
mkdir /build && cd /build
cat >Dockerfile <<EOF
FROM quay.io/fedora/fedora:38
RUN date
EOF
buildah bud -f Dockerfile .
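
Note that no --storage-driver flag is needed this time. To confirm that buildah picked the overlay driver, you can query its storage configuration (the Go template path below reflects buildah's JSON output and may differ between versions):

buildah info --format '{{.store.GraphDriverName}}'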

Adjusting the size of the memory-backed directory

The "/var/lib/containers" directory will be tmpfs mounted, and a tmpfs is typically sized at 50% of the available RAM. The default Kata VM size is 2G, so the tmpfs-mounted directory will be roughly 1G in size. You can either increase the default VM size as per your requirements, or take the following approach.

The following example shows how to provision ~3G of storage for containers using tmpfs, by raising the Pod's memory limit to 6G.

cat >build-emptydir.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: buildah-emptydir
  namespace: sandboxed-builds
spec:
  containers:
  - name: buildah
    image: quay.io/buildah/stable:v1.30
    command: ["sh", "-c"]
    args:
    - mkdir -p /var/lib/containers &&
      mount -t tmpfs tmpfs /var/lib/containers &&
      mknod /dev/fuse -m 0666 c 10 229 && sleep infinity
    resources:
      limits:
        memory: 6G
    securityContext:
      privileged: true
  runtimeClassName: kata
EOF
oc create ns sandboxed-builds
oc apply -f build-emptydir.yaml
oc exec -it buildah-emptydir -n sandboxed-builds -- bash
mkdir /build && cd /build
cat >Dockerfile <<EOF
FROM quay.io/fedora/fedora:38
RUN date
EOF
buildah bud -f Dockerfile .
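
From inside the container, you can verify the size of the tmpfs mount; with the 6G memory limit above, it should report roughly 3G:

df -h /var/lib/containers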

Using a loop-mounted disk

This approach provides good performance and does not depend on the available VM memory.

cat >buildah-loop.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: buildah-loop
  namespace: sandboxed-builds
spec:
  runtimeClassName: kata
  containers:
  - name: buildah-loop
    image: quay.io/buildah/stable:v1.30
    command: ["sh", "-c"]
    args:
    - mknod /dev/loop0 b 7 0 &&
      dnf install -y e2fsprogs &&
      truncate -s 20G /tmp/disk.img &&
      mkfs.ext4 /tmp/disk.img &&
      mkdir -p /var/lib/containers &&
      mount /tmp/disk.img /var/lib/containers &&
      mknod /dev/fuse -m 0666 c 10 229 &&
      sleep infinity
    securityContext:
      privileged: true
EOF
oc create ns sandboxed-builds
oc apply -f buildah-loop.yaml
oc exec -it buildah-loop -n sandboxed-builds -- bash
mkdir /build && cd /build
cat >Dockerfile <<EOF
FROM quay.io/fedora/fedora:38
RUN date
EOF
buildah bud -f Dockerfile .
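
When you are done experimenting, deleting the namespace cleans up all the example Pods in one go:

oc delete ns sandboxed-builds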

Summary

We looked at different ways of building an individual container using OpenShift sandboxed containers.

You can extend this approach to use OpenShift sandboxed containers in a typical CI/CD pipeline. Please read more about it in this blog.


About the authors

Pradipta works in the area of confidential containers, enhancing the privacy and security of container workloads running in the public cloud. He is one of the maintainers of the CNCF Confidential Containers project.


Jens Freimann is a Software Engineering Manager at Red Hat with a focus on OpenShift sandboxed containers and Confidential Containers. He has been with Red Hat for more than six years, during which he has made contributions to low-level virtualization features in QEMU, KVM and virtio(-net). Freimann is passionate about Confidential Computing and has a keen interest in helping organizations implement the technology. Freimann has over 15 years of experience in the tech industry and has held various technical roles throughout his career.
