Transferring Files In and Out of Containers in OpenShift

This is part one of a three-part series. In this post, we’ll cover manually copying files into and out of a container. Part two will be about live synchronization. Finally, in part three, we’ll cover copying files into a new persistent volume.

Part One: Manually Copying Files

One of the properties of container images is that they are immutable. That is, although you can make changes to the local container file system of a running container, the changes are not permanent. When a new container is started from the same container image, it reverts to what was originally built into the image.

Although any changes to the local container file system are discarded when the container is stopped, it can sometimes be convenient to upload files into a running container. One example of where this is useful is during development, when the application is written in a dynamic scripting language. By uploading modified code into the running container, you can test changes before rebuilding the image.

In addition to uploading files into a running container, you might also want to download files. During development, these may be data files or log files created by the application. In this post, we're going to cover how to transfer files between your local machine and a running container.

Before starting, make sure that you're logged into your OpenShift cluster through the terminal and have created a project. You'll be using just the oc command line tool. We're not going to be using the web console, but you can check the status of your project there if you wish.
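
For example, the login and project setup might look like this (the cluster URL, username, and project name below are placeholders; substitute your own):

oc login https://api.mycluster.example.com:6443 --username=developer
oc new-project blog-demo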

Downloading files from a container

To demonstrate transferring files to and from a running container, we first need to deploy an application. To deploy our example application, run:

oc new-app openshiftkatacoda/blog-django-py --name blog

To access it from a web browser, we also need to expose it by creating a Route:

oc expose svc/blog

We can also monitor the deployment of the application by running:

oc rollout status dc/blog

This command will exit once the deployment has completed and the web application is ready.
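
If you'd like to watch the pods being created while the rollout progresses, another option is to run the following command, pressing Ctrl-C to stop watching once the pod shows as Running:

oc get pods --selector app=blog --watch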

The result will be a running container. You can see the name of the pods corresponding to the running containers for this application by running:

oc get pods --selector app=blog

You only have one instance of the application, so only one pod will be listed, looking something like this:

NAME           READY     STATUS    RESTARTS   AGE
blog-1-9j3p3   1/1       Running   0          1m

For subsequent commands which need to interact with that pod, you'll need to use the name of the pod as an argument. As you saw above, in this case, the pod would be blog-1-9j3p3.
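
If you'd rather not type the pod name by hand each time, one option is to capture it in a shell variable (the variable name POD used here is just a convention):

POD=$(oc get pods --selector app=blog -o jsonpath='{.items[0].metadata.name}')
echo $POD

The examples below spell out the pod name in full, but you could substitute $POD wherever it appears.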

To create an interactive shell within the container running the application, you can use the oc rsh command, supplying it the name of the pod:

oc rsh blog-1-9j3p3

From within the interactive shell, see what files exist in the application directory.

ls -las

You should receive some output similar to this:

total 72
0 drwxrwxrwx. 10 default root 252 Jul 7 05:03 .
0 drwxrwxrwx. 7 default root 104 Jul 6 11:58 ..
4 -rwxrwxr-x. 1 default root 284 Jul 6 11:57 app.sh
0 drwxrwxr-x. 6 default root 233 Jul 6 12:01 blog
40 -rw-r--r--. 1 1000040000 root 39936 Jul 7 05:03 db.sqlite3
4 -rw-rw-r--. 1 default root 430 Jul 6 11:57 Dockerfile
0 drwxrwxr-x. 2 default root 25 Jul 6 11:57 htdocs
0 drwxrwxr-x. 3 default root 93 Jul 6 12:01 katacoda
4 -rwxrwxr-x. 1 default root 806 Jul 6 11:57 manage.py
0 drwxrwxr-x. 3 default root 20 Jul 6 12:01 media
0 drwxrwxrwx. 3 default root 19 Jun 29 11:54 .pki
4 -rw-rw-r--. 1 default root 832 Jul 6 11:57 posts.json
8 -rw-rw-r--. 1 default root 7861 Jul 6 11:57 README.md
4 -rw-rw-r--. 1 default root 65 Jul 6 11:57 requirements.txt
4 -rw-rwxrwx. 1 default root 1024 Jun 29 12:34 .rnd
0 drwxrwxr-x. 3 default root 45 Jul 6 11:58 .s2i
0 drwxrwxr-x. 2 default root 93 Jul 6 11:57 scripts
0 drwxrwxr-x. 4 default root 30 Jul 6 12:01 static

For the application being used, running it has created a SQLite database file:

40 -rw-r--r--. 1 1000040000 root 39936 Jul 7 05:03 db.sqlite3

Let's look at how this database file can be copied back to the local machine.

To confirm what directory the file is located in, inside of the container, run:

pwd

This should display:

/opt/app-root/src

To exit the interactive shell and return to the local machine, run:

exit
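
Incidentally, an interactive shell isn't always necessary: oc rsh can also run a single command in the container directly. For example, to print the application directory without opening a shell, you could run:

oc rsh blog-1-9j3p3 pwd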

To copy files from the container to the local machine, you can use the oc rsync command.

To copy a single file from the container to the local machine, the form of the command you need to run is:

oc rsync <pod-name>:/remote/dir/filename ./local/dir

To copy our single database file from our pod, we run:

oc rsync blog-1-9j3p3:/opt/app-root/src/db.sqlite3 .

You should receive output like this:

receiving incremental file list
db.sqlite3

sent 30 bytes received 40027 bytes 26704.67 bytes/sec
total size is 39936 speedup is 1.00

Check the contents of the current directory by running:

ls -las

You should see that the local machine now has a copy of the file.

40 -rw-rw-r-- 1 1000040000 root 39936 Jun 6 05:53 db.sqlite3

Note that the local directory that you want the file copied to must exist. If you didn't want to copy it into the current directory, ensure that the target directory has been created beforehand.
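
For example, to copy the database file into a separate directory instead of the current one (the directory name backup here is just an example), you would first create the directory and then run the copy:

mkdir backup
oc rsync blog-1-9j3p3:/opt/app-root/src/db.sqlite3 backup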

In addition to copying a single file, a directory can also be copied. To copy a directory to a local machine, the form of the command you need to run is:

oc rsync <pod-name>:/remote/dir ./local/dir

To copy the media directory from the container for our pod, we run:

oc rsync blog-1-9j3p3:/opt/app-root/src/media .

If you wanted to rename the directory at the time of copying it, you should first create the target directory with the name you want to use:

mkdir uploads

Then, to copy the files, use this command:

oc rsync blog-1-9j3p3:/opt/app-root/src/media/. uploads

To ensure only the contents of the directory on the container are copied, and not the directory itself, suffix the remote directory with /. (a slash followed by a period), as in the command above.

Note: If the target directory contains existing files with the same name as a file in the container, the local file will be overwritten. If there are additional files in the target directory which don't exist in the container, those files will be left as is. If you want an exact copy, and to have the target directory always updated to be exactly the same as what exists in the container, use the --delete option with oc rsync.

When copying a directory, you can be more selective about what is copied by using the --exclude and --include options to specify patterns to be matched against directories and files, with them being excluded or included as appropriate.
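
For instance, to pull down only certain file types from the media directory into the uploads directory created above, you might run something like the following (the *.png pattern is purely illustrative; adjust it to suit your content):

oc rsync blog-1-9j3p3:/opt/app-root/src/media/. uploads --exclude="*" --include="*.png"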

If there is more than one container running within a pod, you'll need to specify which container you want to work with by using the --container option.
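
For example, if this pod ran a second container alongside the application, a download command targeting the application container might look like the following (the container name blog is an assumption here; you can list the actual container names with oc describe pod blog-1-9j3p3):

oc rsync blog-1-9j3p3:/opt/app-root/src/media . --container=blog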

Uploading files to a container

To copy files from the local machine to the container, we'll again use the oc rsync command.

The command for copying files from the local machine to the container needs to be of the form:

oc rsync ./local/dir <pod-name>:/remote/dir

Unlike when copying from the container to the local machine, there's no form for copying a single file. To copy only selected files, you'll need to use the --exclude and --include options to filter what is and isn't copied from the specified directory.

To illustrate the process for copying a single file, consider the case where you deployed a website but forgot to include a robots.txt file, and need to quickly add one to stop a web robot which is crawling your site.

First, we create a robots.txt file in our local directory which contains:

User-agent: *
Disallow: /
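
If you want to create the file straight from the terminal, one way is with a heredoc (any text editor works just as well):

cat > robots.txt << 'EOF'
User-agent: *
Disallow: /
EOF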

The web application being used hosts static files out of the htdocs subdirectory of the application source code. To upload the robots.txt file, we run:

oc rsync . blog-1-9j3p3:/opt/app-root/src/htdocs --exclude=* --include=robots.txt --no-perms

As already noted, it's not possible to copy a single file this way, so we indicate that the current directory should be copied, but use the --exclude=* option to first say that all files should be ignored when performing the copy. That pattern is then overridden for just the robots.txt file by using the --include=robots.txt option, ensuring that robots.txt is copied.

When copying files to the container, the directory into which files are being copied must already exist, and it must be writable by the user or group that the container is run as. Permissions on directories and files should be set as part of the process of building the image.

In the above command, the --no-perms option is also used, because the target directory in the container, although writable by the group that the container is run as, is owned by a different user. This means that, although the files can be added to the directory, permissions on existing directories cannot be changed. The --no-perms option tells oc rsync to not attempt to update permissions; this avoids it failing and returning errors.
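
If you're curious what the ownership and permissions on the target directory actually look like, you can check from an interactive shell in the pod (oc rsh blog-1-9j3p3) by running:

id
ls -ld /opt/app-root/src/htdocs

The id command shows the user the container runs as, and the ls output shows the owner, group, and permissions of the htdocs directory.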

Now that the robots.txt file is uploaded, the request for it will succeed.
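
One way to verify this from your local machine is to request the file through the route created earlier (this assumes curl is available locally):

curl http://$(oc get route blog -o jsonpath='{.spec.host}')/robots.txt

This should print back the User-agent and Disallow lines you created above.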

This worked without needing to take any further actions as the Apache HTTPD server being used to host static files automatically detects the presence of a new file in the directory. If your application doesn’t automatically detect new or changed files, you may need to notify it in some way to pick up the changes.

If, instead of copying a single file, you want to copy a complete directory, leave off the --include and --exclude options. To copy the complete contents of a directory to the htdocs directory in the container, you could run:

oc rsync images blog-1-9j3p3:/opt/app-root/src/htdocs --no-perms

Just be aware that this will copy everything, including notionally hidden files and directories whose names start with a period. Therefore, be careful, and if necessary, be more specific by using --include or --exclude options to limit the set of files or directories copied, as shown below.
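
For example, if you wanted to copy the images directory wholesale but leave hidden files behind, you could exclude anything starting with a period (an illustrative pattern, not something this particular application requires):

oc rsync images blog-1-9j3p3:/opt/app-root/src/htdocs --exclude=".*" --no-perms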

Summary of Part One

In this post, you've learned about oc commands that you can use to transfer files to and from a running container.

You can find a summary of the key commands covered below. To see more information on each oc command, run it with the --help option.

  • oc rsync <pod-name>:/remote/dir/filename ./local/dir: Copy a single file from the pod to the local directory.

  • oc rsync <pod-name>:/remote/dir ./local/dir: Copy the directory from the pod to the local directory.

  • oc rsync <pod-name>:/remote/dir/. ./local/dir: Copy the contents of the directory from the pod to the local directory.

  • oc rsync <pod-name>:/remote/dir ./local/dir --delete: Copy the contents of the directory from the pod to the local directory. The --delete option ensures that the resulting directories will match exactly, with directories/files in the local directory which are not found in the pod being deleted.

  • oc rsync ./local/dir <pod-name>:/remote/dir --no-perms: Copy the directory to the remote directory in the pod. The --no-perms option ensures that no attempt is made to transfer permissions, which can fail if remote directories are not owned by the user that the container runs as.

  • oc rsync ./local/dir <pod-name>:/remote/dir --exclude=* --include=<filename> --no-perms: Copy the single file to the remote directory in the pod. The --no-perms option ensures that no attempt is made to transfer permissions, which can fail if remote directories are not owned by the user that the container runs as.

Would You Like to Learn More?

This post is based on one of OpenShift’s interactive learning scenarios. To try it and our other tutorials without needing to install OpenShift, visit the OpenShift Learning Portal.

Do you have an OpenShift Online account? There's no reason to wait. Get your applications running in minutes with no installation needed. Sign up for the free trial of OpenShift Online.

What other topics would you like to see in the future on this blog? We're happy to make tutorials about anything that helps you with your OpenShift experience. Comment and let us know!
