The Source to Image (S2I) builders of OpenShift 3 provide a convenient way for quickly building and then deploying your application directly from source code.

If you have special requirements, you may want to customize the steps run during the build and deployment phases. You may for example want to add special post build or pre deployment steps.

In these cases you have a few choices for how you can override the default behaviour without needing to create your own S2I builder image from scratch, or fork the source code for the original S2I builder. This post will explain how the S2I process works and what those choices are.

Build and deployment of your application

To trigger an S2I build with OpenShift from the command line, you can run the oc new-app command as follows:

oc new-app python:2.7~https://github.com/demo/wsgi-hello-world.git

In this form of the command, python:2.7 is the name of the S2I builder to be used. The URL after the ~ is the location of the source code for the Python web application to deploy.

Many of the details of how an S2I builder works are hidden and everything just appears to magically work. Before we look at the options for overriding the S2I scripts, we need to first dig a bit into what happens when you run this command.

The first part of the puzzle happens in the build phase for your application. This is where a Docker image is produced, combining your application code with the runtime environment required for your particular application stack.

The required runtime environment in this case is provided by way of the Docker image for the S2I builder. That is, the S2I builder image acts as the base image for the final application image.

If you were using Docker directly, there are two ways you might construct an application image from a base image. The first approach is to create a Dockerfile which builds off the base image, and add to the Dockerfile the commands to copy in and set up your application code. You would then run docker build in the directory where the Dockerfile and your application code are located to create the application image.

A second less commonly used way is to run the base image using docker run, either running an interactive shell where you can manually run commands to set up the image, or run some script included in the image to assemble everything together. When done and the container exits, you would use docker commit to create a Docker image from the stopped container.

When using an S2I builder with OpenShift, a variation of the second approach is used.

Obviously in an automated system you can't be manually running the commands, so S2I uses an assemble script to do the work of copying the application source code into the correct location and then installing any dependencies which the application may require.

The process by which this works is effectively equivalent to running the following commands with Docker.

cat files.tar | \
docker run -i --name mybuild mybuilder bash -c \
"tar -C /tmp -xf - && /usr/libexec/s2i/assemble"

docker commit mybuild myapp:latest

The specific steps occurring here are:

  1. The application source code, along with any other required builder script files or assets, is packaged up into a tar archive.
  2. The S2I builder is run, with the archive file being injected into the executing container by sending it in as input to the command being run in the container.
  3. The command run within the container reads the contents of the archive sent in as input to the container and extracts it into a temporary directory.
  4. An assemble script is then run, which will copy into place any application code from the temporary directory and trigger any other build steps.
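
To see the streaming mechanism in isolation, the steps above can be tried without Docker at all. What follows is only a local simulation: a plain bash process stands in for the container, a scratch directory stands in for the container's /tmp, and the file names are made up for illustration.

```shell
# Scratch area; "container-tmp" plays the role of /tmp inside the container.
work=$(mktemp -d)
mkdir -p "$work/src" "$work/container-tmp"
echo 'print("hello")' > "$work/src/app.py"

# Step 1: package the application source into a tar archive.
tar -C "$work/src" -cf "$work/files.tar" .

# Steps 2 and 3: send the archive to the command on standard input, and
# have that command unpack it. With Docker this is "docker run -i ...".
cat "$work/files.tar" | bash -c "tar -C '$work/container-tmp' -xf -"

# Step 4 would run the assemble script; here we only list what arrived.
ls "$work/container-tmp"
```

The real S2I process adds the commit of the stopped container on top of this, which is what turns the assembled files into an image.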

The most important part of this, and what distinguishes one S2I builder from another, is what the assemble script does. By default this assemble script is a part of the S2I builder image. If we want to override what occurs, we need to replace this script with our own.

The second part of the puzzle is what happens when we deploy the application image which has been created. The assemble script has ensured the application image has been set up as required, but when we run the image we want the application itself to be started.

This is where the S2I process steps in and ensures the application is started up correctly. This is by way of a run script which is included along with the assemble script. The S2I process sets up the final application image such that the CMD for the image will execute the corresponding run script.

When overriding the assemble script, we may therefore also need to override this run script.

Including S2I scripts in the application code

The simplest way to override the default S2I scripts for a single application is to include your S2I scripts in the source code repository for your application.

If you include the S2I scripts along with your application code, they need to be located in the .s2i/bin directory of the source code repository. When the S2I process inspects the source code repository and finds them, it will extract a separate copy of them and include that copy in the tar archive injected into the S2I builder image when it is run. The copied assemble script will then be executed rather than the default assemble script included with the S2I builder.
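
As a rough sketch, the layout could be created as follows. The scratch directory stands in for the top of your own repository and the echo messages are placeholders. The scripts here do nothing more than delegate to the defaults under /usr/libexec/s2i, which is where the Python builder used in this post keeps them; check the location for your own builder.

```shell
# Stand-in for the top-level directory of your application repository.
repo_dir=$(mktemp -d)
mkdir -p "$repo_dir/.s2i/bin"

# Custom assemble script: here it simply delegates to the builder's default.
cat > "$repo_dir/.s2i/bin/assemble" <<'EOF'
#!/bin/bash
set -eo pipefail
echo "---> Running custom assemble script"
/usr/libexec/s2i/assemble
EOF

# Custom run script: hands over to the builder's default run script.
cat > "$repo_dir/.s2i/bin/run" <<'EOF'
#!/bin/bash
echo "---> Running custom run script"
exec /usr/libexec/s2i/run
EOF

# Mark the scripts executable before committing them.
chmod +x "$repo_dir/.s2i/bin/assemble" "$repo_dir/.s2i/bin/run"
ls -l "$repo_dir/.s2i/bin"
```

Once the directory is committed and pushed, the next build of the application should pick the scripts up without any change to the build configuration.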

Note that the required name for this subdirectory was originally .sti/bin, but was changed to .s2i/bin. If using an older version of OpenShift you may have to use the older name. Both locations should work for newer versions of OpenShift, but the preferred location is now .s2i/bin.

Pulling S2I scripts from a web server

If it isn't practical for you to have the custom S2I scripts as part of the source code repository for an application, or you need to use the same scripts with multiple applications, the next option you have is to host the scripts on a separate web server. The OpenShift build configuration for your application would then be set up to pull the S2I scripts from the separate web server.

This method relies on using the bc.spec.strategy.sourceStrategy.scripts field of the build configuration.

$ oc explain bc.spec.strategy.sourceStrategy.scripts
FIELD: scripts <string>

DESCRIPTION:
Scripts is the location of Source scripts

Currently there is no way to set this when running oc new-app or oc new-build. You would need to create a build configuration and load it with oc create, or modify an existing build configuration using oc edit or oc patch.

oc patch bc/myapp -p '{"spec":{"strategy":{"sourceStrategy":{"scripts": "https://raw.githubusercontent.com/demo/wsgi-s2i-scripts/master"}}}}'

Note that in this case, because we want to pull the scripts from a source code repository on GitHub, we need to use the special URL for GitHub which provides access to the raw files via an HTTP request. That URL will not be directly browsable, but is the directory in which your assemble and run scripts would be located. You could instead host the scripts via any other web server.
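
If you apply the same change to several build configurations, it can help to build the patch from variables. The following is a sketch along those lines, reusing the build configuration name and URL from the patch above. The cluster commands only run when the oc client is installed, and they assume you are already logged in to a project containing that build configuration.

```shell
# Build configuration name and scripts location from the example above.
bc_name="myapp"
scripts_url="https://raw.githubusercontent.com/demo/wsgi-s2i-scripts/master"

# Assemble the JSON patch from the variable.
patch=$(printf '{"spec":{"strategy":{"sourceStrategy":{"scripts":"%s"}}}}' "$scripts_url")
echo "$patch"

# Only talk to the cluster when the oc client is installed.
if command -v oc >/dev/null 2>&1; then
    oc patch "bc/$bc_name" -p "$patch"
    # Read the field back to confirm the change was recorded.
    oc get "bc/$bc_name" -o jsonpath='{.spec.strategy.sourceStrategy.scripts}'
    # Start a new build so that the scripts are actually used.
    oc start-build "$bc_name"
fi
```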

Creating a new derived builder image

Although pulling the S2I scripts from a separate web server allows the custom scripts to be applied to multiple applications, a change to the scripts can't be used to trigger a new build of all applications relying on them.

If you need to be able to trigger builds when the custom S2I scripts are changed, you will need to create a new derived Docker image. In this image you will supply your own versions of the scripts. A new build of the applications dependent on the scripts can then be triggered by updating the builder image.

FROM centos/python-27-centos7

USER root

COPY assemble /usr/libexec/s2i/
COPY run /usr/libexec/s2i/

USER 1001

If you override the existing scripts by copying over the originals, then you may need to switch to the root user to ensure you have the appropriate access to replace the originals. Ensure you set USER back to the integer UID of the default S2I user defined by the S2I builder image.

If you replace the original scripts, though, you will no longer be able to invoke them, which is a problem if your own scripts were intended only as wrappers around the originals.

A better way is to copy your versions of the S2I scripts to a new location and update the io.openshift.s2i.scripts-url label on the Docker image, which is used to indicate where the scripts are located.

FROM centos/python-27-centos7

COPY assemble /opt/app-root/s2i/bin/
COPY run /opt/app-root/s2i/bin/

LABEL io.openshift.s2i.scripts-url="image:///opt/app-root/s2i/bin"

This new Docker image could be built externally to OpenShift and pushed up to a registry so it can be used, or you could use oc new-build to build it in OpenShift. Subsequent builds would then use the resulting Docker image as the S2I builder instead of python:2.7.

Note that we create a new derived Docker image rather than fork the source code repository for the original builder. This is because if a fork of the source code for the original builder was made, you would then need to maintain it and keep it up to date with respect to the original source code.

By instead creating a derived image and only overriding the S2I scripts, we can rely on triggers to automatically rebuild the derived Docker image when the base image for the original builder is updated. This ensures we automatically get any security updates in any packages installed as part of the original builder image, as well as any of its base images.

Wrapping existing S2I builder scripts

As already mentioned, one reason for wanting to provide your own S2I builder scripts is to wrap the default scripts provided with an S2I builder, rather than replace them. You may want to do this to perform additional post build actions or pre deployment steps. A couple of warnings are worth mentioning if you want to do this.

The first is that if wanting to run actions prior to the original assemble script being run, be aware that your application source code will not have been copied into place at that point. This is because it is the assemble script that copies it to the final location. You will need to look at how the existing assemble script works and replicate the steps if the additional actions you need to run require the source code to be in its final location.
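
The ordering issue can be illustrated with a small local simulation. Nothing here touches a real builder: the staging and app directories and the stand-in "original" assemble script are invented for the example, playing the roles of the temporary directory, the final source location, and the builder's own assemble script.

```shell
work=$(mktemp -d)
mkdir -p "$work/staging" "$work/app"
echo "flask" > "$work/staging/requirements.txt"

# Stand-in for the builder's original assemble script: it moves the
# source from the staging area into its final location.
cat > "$work/assemble-original" <<EOF
#!/bin/bash
mv "$work/staging/"* "$work/app/"
EOF

# Wrapper: a pre-build step, the original script, then a post-build step.
cat > "$work/assemble" <<EOF
#!/bin/bash
set -eo pipefail
# Pre-build: the source is still in the staging area at this point.
[ -f "$work/staging/requirements.txt" ] && echo "pre: source still in staging"
"$work/assemble-original"
# Post-build: the source is now in its final location.
[ -f "$work/app/requirements.txt" ] && echo "post: source in final location"
EOF

chmod +x "$work/assemble-original" "$work/assemble"
output=$("$work/assemble")
echo "$output"
```

In a real wrapper the stand-in would be replaced by the builder's own script, for example /usr/libexec/s2i/assemble for the Python builder used in this post.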

The second is that when wrapping the run script, ensure that the original run script is invoked using exec so that the process for the original script replaces the current process. This is to ensure that the original run script still runs as process ID 1. If this is not done, signal propagation to your hosted application may not work correctly and you will see issues when the container is being shut down.
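
The effect of exec can be checked locally with a pair of throwaway scripts. This is only a demonstration under stated assumptions: the stand-in "original" run script just prints its process ID, where a real one would start your application server.

```shell
work=$(mktemp -d)

# Stand-in for the builder's original run script; it reports its process ID.
cat > "$work/run-original" <<'EOF'
#!/bin/bash
echo "original pid=$$"
EOF

# Wrapper run script: a pre-deployment step, then exec of the original.
cat > "$work/run" <<EOF
#!/bin/bash
echo "wrapper pid=\$\$"
exec "$work/run-original"
EOF

chmod +x "$work/run-original" "$work/run"
output=$("$work/run")
echo "$output"

# Both lines report the same process ID, because exec replaced the
# wrapper's process instead of starting a child process.
wrapper_pid=$(echo "$output" | sed -n 's/^wrapper pid=//p')
original_pid=$(echo "$output" | sed -n 's/^original pid=//p')
```

Inside a container the wrapper starts as process ID 1, so invoking the original with exec is what lets it inherit that process ID and receive the shutdown signal directly.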