Access Private Repositories from Your Dockerfile Without Leaving Behind Your SSH Keys
If you’re not careful, your secrets will leave traces inside of your Docker image.
Update: there’s a new, convenient way to give your building Docker image access to a private Git repository with BuildKit. Check it out!
If you copy your private SSH key into the image during the build to clone a private Git repository, it might stick around. If you add a file in one build step and then delete it in a later one, the file still sticks around in an earlier layer of the final image!
Are you doing something like the following in your Dockerfile?
ARG SSH_PRIVATE_KEY
RUN mkdir /root/.ssh/
RUN echo "${SSH_PRIVATE_KEY}" > /root/.ssh/id_rsa
# [...]
RUN rm /root/.ssh/id_rsa
Even though you’re deleting the file, it still can be viewed in one of the layers of the image you’ll push.
Also, the ARG variable value will be visible as soon as somebody types docker history IMAGE_ID, even if it was passed in at build time rather than set as a default value in the Dockerfile. It’s not recommended to use build-time variables to pass secrets.
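To check an image you have already built, something like the following sketch can help. The function name leaked_build_args is made up for this example, and it assumes the docker CLI is installed and the image is available locally.

```shell
# Hypothetical helper: list the build steps of an image that mention a given ARG name.
# Usage: leaked_build_args IMAGE_ID ARG_NAME
leaked_build_args() {
  # --no-trunc keeps the full command text,
  # --format '{{.CreatedBy}}' prints only the command that created each layer
  docker history --no-trunc --format '{{.CreatedBy}}' "$1" | grep -F "$2="
}
```

Calling leaked_build_args IMAGE_ID SSH_PRIVATE_KEY prints every recorded step that contains the variable together with its value. No output means the name was not found in the history of that image.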
Want to learn more about Docker ENV and ARG? Check out this in-depth guide.
How can you build your Docker image, using an SSH key to clone a private repository, without leaving unnecessary information behind? There are several options.
Squashing
In Docker 1.13, a new --squash parameter was added. It can be used to reduce the size of an image by removing files which are no longer present, and it collapses the layers created by the build into a single layer on top of the base image. You’ll need to run the daemon with experimental features enabled to use it.
This also has the convenient side effect of removing files which were created and then deleted. Handy to get rid of files containing secret information which you don’t want to have around anymore. You tell Docker to squash away layers when executing docker build:
$ docker build --squash [...]
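Since the flag depends on experimental mode, it can be worth checking the daemon before relying on it. This is a minimal sketch. The function name squash_supported is hypothetical, and it assumes your docker CLI can reach the daemon you build with.

```shell
# Hypothetical helper: succeeds only if the Docker daemon reports experimental mode.
squash_supported() {
  # The server section of "docker version" exposes an Experimental field (true/false)
  [ "$(docker version --format '{{.Server.Experimental}}')" = "true" ]
}
```

You could then guard the build with squash_supported && docker build --squash [...], so that an unsquashed image is not produced by accident. Experimental mode is typically switched on by setting "experimental": true in the daemon configuration file and restarting the daemon.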
This approach works, but it has a few potential downsides. If you make a mistake and push an image without squashing it, you risk leaking the thing you wanted to keep private. You will still see ARG values when viewing the history of the image. Also, you’re not making use of Docker layer caching as much as you could. By now, there is a more elegant way: multi-stage builds.
Multi-stage Builds
When working with multi-stage builds, you are building multiple Docker images in a single Dockerfile, but only the last one is the real result. The other ones are there to support it. Nothing from the earlier stages ends up in the final image unless you copy it over explicitly.
This is really convenient for handling secrets! You simply provide your private SSH key to one of the intermediate images, use it to install dependencies, download the data or clone a Git repository, and pass directories containing that data into your final image build process, while leaving the secret credentials safe and sound in the intermediate image.
As a sidenote: it’s a good idea to create dedicated SSH deploy keys for such tasks, which are disposable, can be disabled easily, only have read-access and don’t need a passphrase to unlock.
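Creating such a key could look like the sketch below. The file name deploy_key and the comment string are placeholders chosen for this example, and the RSA key type is only picked to match the id_rsa file name used in this article.

```shell
# Generate a dedicated key pair for image builds.
# -N "" sets an empty passphrase, -C adds a comment so the key is easy to identify later,
# -f sets the output file (hypothetical name), -q suppresses the progress output.
ssh-keygen -q -t rsa -b 4096 -N "" -C "docker-build-deploy-key" -f ./deploy_key

# Print the public half, which is the part you register with your Git host.
cat ./deploy_key.pub
```

The public key goes into the repository settings of your Git host as a read-only deploy key. The private file ./deploy_key is what you pass to the build.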
Here is an example of a multi-stage Dockerfile:
# this is our first build stage, it will not persist in the final image
FROM ubuntu AS intermediate
# install git and the SSH client it needs for git@... URLs
RUN apt-get update && apt-get install -y git openssh-client
# add credentials on build
ARG SSH_PRIVATE_KEY
RUN mkdir -p /root/.ssh/ && chmod 700 /root/.ssh/
# ssh refuses private keys that are readable by others, so restrict permissions
RUN echo "${SSH_PRIVATE_KEY}" > /root/.ssh/id_rsa && chmod 600 /root/.ssh/id_rsa
# make sure your domain is accepted
RUN touch /root/.ssh/known_hosts
RUN ssh-keyscan bitbucket.org >> /root/.ssh/known_hosts
RUN git clone git@bitbucket.org:your-user/your-repo.git
FROM ubuntu
# copy the repository from the previous image
COPY --from=intermediate /your-repo /srv/your-repo
# ... actually use the repo :)
There are two images defined here. One of them is named “intermediate”, the final one doesn’t have a name. The “intermediate” image is referenced, and we’re copying the repository data over from it into the final image.
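The SSH_PRIVATE_KEY is passed when issuing the build command with --build-arg or in the build block of your docker-compose.yml file. Because that ARG variable is not used in the final image, its value will not be available using the history command on that image. Keep in mind that the intermediate layers still exist in the local build cache of the machine that ran the build, so don’t tag and push the intermediate stage.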
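On the command line, passing the key could look like the following sketch. The function name build_with_deploy_key, the key path and the image tag are all placeholders for this example. It assumes the Dockerfile is in the current directory.

```shell
# Hypothetical helper: build the image in the current directory,
# handing the content of a private key file to the SSH_PRIVATE_KEY build argument.
# Usage: build_with_deploy_key PATH_TO_PRIVATE_KEY IMAGE_TAG
build_with_deploy_key() {
  key_file="$1"
  image_tag="$2"
  # "$(cat ...)" reads the key file; the quotes preserve its line breaks
  docker build --build-arg SSH_PRIVATE_KEY="$(cat "$key_file")" -t "$image_tag" .
}
```

For example, build_with_deploy_key ./deploy_key your-image builds the final stage and tags it your-image. Reading the key from a file at build time keeps it out of the Dockerfile and out of version control.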
Using multi-stage builds also has the great side effect of significantly reducing the size of your final Docker images, as they don’t need to contain traces of Git and other build tools if used correctly.
In Conclusion
I hope this helped you realize where you might be exposing confidential information without noticing it. Depending on your Dockerfiles, you can use one of the presented ways to deal with potentially leaking secrets while building your Docker images: squashing and multi-stage builds.
If you don’t mind using BuildKit, check out how you can make it even easier on yourself to deal with SSH credentials and build-time secrets!
If you are not yet confident regarding how to configure your Docker image builds or dockerized apps, check out this in-depth guide about ARG, ENV, env_files and docker-compose .env.