vsupalov

Access Private Repositories from Your Dockerfile Without Leaving Behind Your SSH Keys

How to clone from a private repository while building your Docker image without leaking your private SSH key.

[ docker ]

One great boon of using Docker is the ability to clone a repository, set the necessary variables and build your app with a single command. The resulting image should contain everything needed to run or develop the project.

What about building images for live environments? You probably want to make them ready-to-run by baking the binaries or code into them, without leaving unnecessary parts behind. But if you add a file in one build step and then delete it in a later one, the file still stays inside the image! Are you doing something like the following in your Dockerfile?

ARG SSH_PRIVATE_KEY
RUN mkdir /root/.ssh/
RUN echo "${SSH_PRIVATE_KEY}" > /root/.ssh/id_rsa
# [...]
RUN rm /root/.ssh/id_rsa

Sure, you’re deleting the file, but it’s still in one of the layers of the image you’ll push. If you built such an image, your secrets would be very easy to access. A more important issue: the ARG value will be visible as soon as somebody types docker history IMAGE_ID, even if it was passed in at build time rather than set as a default value in the Dockerfile. Using build-time variables to pass secrets is not recommended.
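To see why a later rm does not help, here is a minimal simulation that needs no Docker at all. It rests on one fact about the exported image format: each layer is a tar archive, and a deletion is recorded as a whiteout entry (.wh.<name>) in the later layer. The directory names and the fake key value are placeholders made up for this sketch.

```shell
#!/bin/sh
# Simulation only: mimics how image layers are stored, without needing Docker.
work=$(mktemp -d)
mkdir -p "$work/layer1/root/.ssh" "$work/layer2/root/.ssh"

# "Layer 1": the RUN echo step writes the key (a fake placeholder here).
echo "FAKE-PRIVATE-KEY" > "$work/layer1/root/.ssh/id_rsa"

# "Layer 2": the RUN rm step only records a whiteout marker for the file.
: > "$work/layer2/root/.ssh/.wh.id_rsa"

# Each layer is shipped as its own tar archive.
tar -cf "$work/layer1.tar" -C "$work/layer1" .
tar -cf "$work/layer2.tar" -C "$work/layer2" .

# The later layer hides the file in a running container ...
hidden_by=$(tar -tf "$work/layer2.tar" | grep -F '.wh.id_rsa')
# ... but the earlier layer still carries the full file contents.
leaked=$(tar -xOf "$work/layer1.tar" ./root/.ssh/id_rsa)

echo "whiteout entry: $hidden_by"
echo "still readable: $leaked"
rm -rf "$work"
```

Anyone who can pull the image can unpack its layers in the same way, so the key has to stay out of every layer that ends up in the pushed image.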

How can you build your Docker image, using an SSH key to clone a private repository, without leaving unnecessary information behind? There are several options.

Squashing

Docker 1.13 added a new --squash parameter. It squashes the newly built layers into a single new layer on top of the base image, which can reduce the size of an image by dropping files that are no longer present. You’ll need to run the daemon with experimental features enabled to use it.

This also has the convenient side effect of removing files which were used during the build but deleted afterwards, such as a file containing secret information which you don’t want to have around anymore. You use it when executing a docker build:

$ docker build --squash [...]

This approach works, but it has a few potential downsides. If you make a mistake and push an image without squashing it, you risk leaking the thing you wanted to keep private. You will still see ARG values when viewing the history of the image. Also, you’re not making use of Docker layer caching as much as you could. By now there is a more elegant way: multi-stage builds.
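If you do rely on squashing, one way to reduce the risk of building an unsquashed image by accident is to wrap the build in a small helper that refuses to run unless the daemon reports experimental features. This is only a sketch: the function names are made up for illustration, and it assumes the classic builder, where --squash is available.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Docker itself.

# Succeeds when the daemon reports that experimental features are enabled.
daemon_is_experimental() {
    [ "$(docker version --format '{{.Server.Experimental}}' 2>/dev/null)" = "true" ]
}

# $1: image tag, $2: build context directory
build_squashed() {
    if daemon_is_experimental; then
        docker build --squash -t "$1" "$2"
    else
        echo "daemon is not experimental, refusing to build without --squash" >&2
        return 1
    fi
}
```

You would call it as build_squashed my-image . in place of a plain docker build. It does nothing about the other downsides: ARG values still show up in the history.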

Multi-stage Builds

When working with multi-stage builds, you are building multiple Docker images in a single Dockerfile, but only the last one is the real result. The other ones are there to support it, and don’t leave any traces behind in the final image to be built.

This is really convenient for handling secrets! You simply provide your private SSH key to one of the intermediate images, use it to install dependencies, download data or clone a Git repository, and copy the directories containing that data into your final image. The secret credentials stay behind in the intermediate image, which is not part of the image you push.

You should create dedicated SSH deploy keys for such tasks, which are disposable, can be disabled easily and don’t need a passphrase to unlock.
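As a rough sketch, such a key pair could be generated with ssh-keygen as shown here. The helper name, the deploy_key file name and the comment string are placeholders chosen for this example. The key type is RSA only to match the id_rsa file name used in the Dockerfile.

```shell
#!/bin/sh
# Hypothetical helper: creates a passphrase-less key pair for build use only.
# $1: existing directory to write deploy_key and deploy_key.pub into
make_deploy_key() {
    ssh-keygen -q -t rsa -b 4096 -N "" -C "docker-build-deploy-key" -f "$1/deploy_key"
}
```

After running make_deploy_key ./keys, you would register keys/deploy_key.pub with your Git host as a read-only deploy key, and pass only keys/deploy_key to the build.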

Here is an example of a multi-stage Dockerfile:

# this is our first build stage, it will not persist in the final image
FROM ubuntu as intermediate

# install git (update and install in one step, so a cached package index never goes stale)
RUN apt-get update && apt-get install -y git

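# add credentials on build
ARG SSH_PRIVATE_KEY
RUN mkdir /root/.ssh/
# ssh refuses private keys that are readable by group or others, hence the chmod
RUN echo "${SSH_PRIVATE_KEY}" > /root/.ssh/id_rsa && chmod 600 /root/.ssh/id_rsa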

# make sure your domain is accepted
RUN touch /root/.ssh/known_hosts
RUN ssh-keyscan bitbucket.org >> /root/.ssh/known_hosts

RUN git clone git@bitbucket.org:your-user/your-repo.git

FROM ubuntu
# copy the repository from the previous image
COPY --from=intermediate /your-repo /srv/your-repo
# ... actually use the repo :)

The SSH_PRIVATE_KEY is passed when issuing the build command with --build-arg or in the build block of your docker-compose.yml file. Because the ARG is only declared and used in the intermediate stage, its value will not show up when running the history command on the final image. For a better overview of using variables when handling your Docker workflows, read this in-depth guide.
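Passing the key on the command line could look roughly like this. The two function names, the key path and the image tag are made up for illustration. Only docker build --build-arg and docker history --no-trunc are actual Docker commands.

```shell
#!/bin/sh
# Hypothetical helpers wrapping the build described above.

# $1: path to the private deploy key, $2: image tag, $3: build context
build_with_deploy_key() {
    [ -r "$1" ] || { echo "cannot read key file: $1" >&2; return 1; }
    # $(cat ...) strips the trailing newline; the echo in the Dockerfile adds it back.
    docker build --build-arg SSH_PRIVATE_KEY="$(cat "$1")" -t "$2" "$3"
}

# Succeeds when the history of image $1 does not mention the build argument.
final_history_is_clean() {
    ! docker history --no-trunc "$1" | grep -q 'SSH_PRIVATE_KEY'
}
```

For example, build_with_deploy_key ./keys/deploy_key my-image . followed by final_history_is_clean my-image gives a quick check before pushing. The same check fails for the single-stage Dockerfile from the beginning of this article.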

Using multi-stage builds also has the great side effect of significantly reducing the size of your final Docker images: used correctly, they don’t need to contain traces of Git and other build tools.
