Docker Usage in 'The Flask Mega-Tutorial' by Miguel Grinberg - A Review

Remember when you first learned about virtualenv?

Let me guess - you read about it, thought something like “huh, that’s neat”, and started using it right away, creating a new ‘env’ directory right next to your code and eventually adding it to .gitignore.

In time, you transitioned towards using .venv files, virtualenvwrapper and storing your virtualenvs in a dedicated directory like ‘~/.envs’, far from your codebase.

You only started learning about best practices and improving your workflows later on, once you got comfortable and virtualenv became a part of your toolset.

During your first encounter with a new tool, the goal usually isn’t to:

  • Learn everything there is to know about it.
  • Use it in the most complicated way possible.
  • Master it just because, and as soon as possible.

The way you get to know a new tool will differ once you understand it better. There’s no point in trying to skip over the total-beginner phase.

Let’s Talk About Docker

Are you confident with Docker already, and happy with your current workflows?

Going from a first tutorial to dealing with best practices & hidden gotchas takes time and a lot of ‘doing’.

Looking back at those first tutorials, it’s hard to tell which parts were meant to help you ‘get it’, and which parts advanced users actually stick to.

Here are the questions I’d like to help you answer in this article:

  • What are the next steps with Docker once you’ve worked through the first few tutorials?
  • Which patterns are good, and worth keeping after the learning experience?
  • Which ones will hurt you going forward?

In my opinion, it’s best to learn about practical matters from examining real-world examples.

To provide some focused and useful answers, I’d like to review the Docker usage of a popular tutorial series, and provide in-depth commentary on Docker-related workflows and deployment topics.

Once you finish reading this article, you’ll know what usage patterns to look out for in your projects and you’ll have actionable next steps to start bridging the gap between a beginner and an advanced user.

Let’s Look At: The Flask Mega-Tutorial

When you’re just starting out with Flask and Python for web development, ‘The Flask Mega-Tutorial’ by Miguel Grinberg is the way to go.

It’s an incredibly in-depth and detailed step-by-step series of articles about Flask, best practices and useful tricks for everything you’ll want to do in the beginning.

The Tutorial Touches On Working With Docker

One of the (to date) 23 Chapters of the tutorial is titled “Deployment on Docker Containers”.

It’s a huge article, leading the reader from “understanding what containers are for” to using Docker for the first time with a Flask project.

Here is a list of the topics which are introduced in the chapter:

  • The basics around containers
  • Installing Docker
  • Understanding Dockerfiles
  • Writing a Flask-friendly Dockerfile
  • Custom entrypoint scripts
  • Using the Docker CLI
  • Building your first Docker image
  • Passing variables to Docker containers
  • Running a container, and making sure it’s removed afterwards
  • Running more containers for backing services
  • Service-specific configurations
  • Connecting to possibly unavailable services during container startup
  • A brief glimpse of running Docker on actual providers

Wow. That’s a lot, right? You could write whole articles about each of those points.

Going from installing Docker to building your first image and running your first container is an enormous amount of material and progress.

It’s a Great Practical Introduction to Docker

Miguel does a great job introducing concepts, and leading the reader step-by-step towards having a functional dockerized deployment prototype.

The tutorial chapter is a great crash course on using Docker to deploy a Flask application. Of course it’s not exhaustive, and does not go into every possible detail.

The balance between learning and doing it right leans somewhat towards the “learning” side, which is completely alright. After all, you can only introduce so many new things and concepts at once.

A Good Starting Point - What Now?

Let’s look at the Docker usage in this tutorial in-depth.

There are parts of the workflows which I really like. Small details which will have a big impact on making your work with Docker nicer.

After the good patterns, I’d like to help you spot the ones which won’t hold up in a non-tutorial context. Some will cause minor clutter, others will come back to bite you when you least expect it. All are fixable.

Stuff You Should Totally Do

The Dockerfile introduced in the tutorial is very solid. It does a lot of things right.

You can see it in the “Building a Container Image” section of the original article.

If you’re mindful of them all in your project, you’re doing most things right.

Project files aren’t copied into the filesystem root.

WORKDIR /home/microblog

This is a minor thing, but I find it to be in very good taste.

It’s pretty easy to litter your / folder with files or new directories. You wouldn’t want to do that to a Linux machine, so why do it to your images?

The Dockerfile introduces a non-root user.

RUN adduser -D microblog
# ...
RUN chown -R microblog:microblog ./
USER microblog

Those lines create a new user, make sure it owns the copied files and that the entrypoint command will run as that user.

You don’t want to run your containerized apps as root. Most people do anyway.

The Dockerfile is based on Alpine.

The image size of Docker containers matters. Well, sometimes at least. Alpine is a pretty small base image, and will work for most apps.

Having a smaller image should help your containers to be pulled quicker, and start up with slightly less delay. Also, it’ll be easier on the memory footprint.

Of course there are trade-offs - sometimes you might want to add one or two tools which could help with debugging issues eventually, but usually you won’t need them.

Using virtualenv within the Dockerfile.

You might think that having a Dockerfile already is enough to make sure that your app’s dependencies are safe from the usual bad influences.

This goes hand-in-hand with the notion that there’s no use for Docker if you are using virtualenv already.

I think Docker and virtualenv complement each other nicely, and are best used together when working with Python. You get additional control over your environment and reduce the chances for weird issues to surface.

Docker can be used to install system packages which will be required to build your virtualenv.

Virtualenv makes it very easy to ensure that a particular Python version is used, regardless of the OS you’re on, and to be certain that exactly the module versions you need are available - no more, no less. Depending on your host OS, you could be making wrong assumptions otherwise.
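To illustrate the pattern, here is a minimal sketch of how a virtualenv can be created and used inside a Dockerfile. It mirrors the fragments quoted elsewhere in this article; the base image tag is an assumption, not necessarily the tutorial’s exact choice:

```dockerfile
# a small Alpine-based Python image (the exact version is an assumption)
FROM python:3.6-alpine

RUN adduser -D microblog
WORKDIR /home/microblog

# create a virtualenv inside the image, then install dependencies into it
COPY requirements.txt requirements.txt
RUN python -m venv venv
RUN venv/bin/pip install -r requirements.txt
```

Everything that runs inside the container can then use the interpreter and packages from the venv directory, independent of the system Python.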

The requirements.txt file is copied first.

Before the whole code of the project is added to the new image, the following lines are executed:

COPY requirements.txt requirements.txt
# ...
RUN venv/bin/pip install -r requirements.txt

This is a very good way to reduce future image build times.

As long as your requirements file does not change, building a new image will not require you to download and rebuild your virtualenv dependencies.

Docker’s caching mechanism notices that the requirements.txt file is unchanged, and reuses the cached layers for that step and the steps which depend on it.

The entrypoint command is written in exec form

There’s a difference between writing your entrypoint command as a list or as a string. The former is the exec form, which is preferred; the latter is the shell form.

# shell form:
# ENTRYPOINT ./boot.sh

# exec form:
ENTRYPOINT ["./boot.sh"]

The shell form will spawn an additional shell in your container, which is usually not needed. It also means your application is not PID 1 - signals like SIGTERM are delivered to the wrapping shell instead of your process, which can make containers slow to stop cleanly.

These Will Cause Trouble

Once again, the tutorial article does a great job of introducing Docker, and leading the readers through their first steps of getting to know the tool. It’s pretty cool, and there’s nothing wrong with it.

However, if you want to go deeper, learn more and continue using Docker beyond the tutorial context, you might want to think about next steps and re-evaluate practices which can be improved on.

Here are a few things to watch out for, and to help you improve your workflows without having to stumble upon them one-by-one when they cause trouble.

The shell scripts are ignoring failing commands.

By default, shell scripts don’t care if one of the commands in them fails. They simply look the other way and execute the next one.

That’s fine as long as everything goes well, but can cause trouble when something doesn’t work out. Even worse: you won’t know about the error, which can be quite dangerous.

When working with bash scripts, mine usually start with these lines:

#! /bin/bash

set -euo pipefail

The set command tells bash to exit the script when one of the commands in it exits with a non-zero state (-e), to fail if there are uninitialized variables (-u), and to also exit if a non-last command in a piped sequence fails (-o pipefail).

As the scripts in the article rely on /bin/sh and don’t use pipes, I’d suggest starting them this way:

#! /bin/sh

set -eu

You can read more about those flags in this excellent article.
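If you want to see the difference for yourself, here’s a quick demonstration (not from the tutorial) comparing the default behavior with set -e:

```shell
# without set -e, the shell keeps going after a failed command
sh -c 'false; echo "still running"'
# prints: still running

# with set -e, the first failing command aborts the script
sh -c 'set -e; false; echo "still running"' || echo "script aborted"
# prints: script aborted
```

The second script never reaches its echo - the failing command stops it immediately, and the non-zero exit code is visible to the caller.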

There are lots of command-line variables.

You don’t want to type out an epic poem every time you run a container.

Maybe the first time, to try it out and get a quick win. As soon as you’re doing it multiple times, you’re going to be slightly annoyed at best.

Lots of command-line arguments means there is room for all kinds of mistakes and accidents to happen.

Sure, you could put the command in a script (or Makefile), and that’s fine. However, you can do better.

One way would be to use an env file. This lets you set multiple environment variables at once, and also helps you get your secrets out of the command line:

# this:
$ docker run --name microblog -d -p 8000:5000 --rm -e SECRET_KEY=my-secret-key \
    -e MAIL_SERVER=smtp.googlemail.com -e MAIL_PORT=587 -e MAIL_USE_TLS=true \
    -e MAIL_USERNAME=<your-gmail-username> -e MAIL_PASSWORD=<your-gmail-password> \
    microblog:latest

# becomes this:
$ docker run --name microblog -d -p 8000:5000 --rm \
    --env-file=env_vars \
    microblog:latest

For the above command to work, I created a new “env_vars” file in the working directory, containing lines like this:

# ...
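For example, reusing the values from the long command above (the Gmail credentials remain placeholders you’d fill in yourself):

```
SECRET_KEY=my-secret-key
MAIL_SERVER=smtp.googlemail.com
MAIL_PORT=587
MAIL_USE_TLS=true
MAIL_USERNAME=<your-gmail-username>
MAIL_PASSWORD=<your-gmail-password>
```

One variable per line, no quoting needed - and the file can stay out of version control.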

Another way would be to use an .env file with docker-compose, or to pass variables from the host directly to the container. You can read more about working with Docker environment variables in this giant guide.

Handling multiple containers.

You’ll want to bring up multiple containers, which are supposed to work together, quite often. It’s good, clean fun.

The original article walks you through building the images and then starting one container after the other, providing each with the right variables and naming them the right way.

Handling containers one-by-one is really tedious business. It takes up lots of attention and requires fiddly command-line action.

Docker Compose was made to make it easier to handle a bunch of interdependent containers at once.

The stack is defined in a docker-compose.yml file which, among other things, specifies:

  • Different “services” (aka containers).
  • Their environment variables.
  • What Docker image to use.
  • Where to find the Dockerfile in case the image needs to be built first.
  • Many other things - basically anything you can specify using the Docker CLI.

It’s really neat. You can read more here or look here for a complete reference of the file format.
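As a rough sketch of what this could look like for a setup like the tutorial’s - the service names, image tag and variables here are my assumptions, not the article’s exact configuration:

```yaml
version: "3"

services:
  microblog:
    build: .                 # build the image from the local Dockerfile
    ports:
      - "8000:5000"
    env_file: env_vars       # keep secrets out of the compose file
    depends_on:
      - mysql

  mysql:
    image: mysql/mysql-server:5.7
    environment:
      - MYSQL_RANDOM_ROOT_PASSWORD=yes
      - MYSQL_DATABASE=microblog
```

With this in place, a single `docker-compose up -d` replaces the whole series of individual docker run commands.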

A problem you’re not experiencing yet: hard-coded values in your docker-compose file.

Once you start using docker-compose, you might find yourself hard-coding lots of values in your docker-compose.yml file.

Especially if they’re secrets, that’s not a thing you’d want to commit to a Git repo.

With docker-compose, you can use a .env file; when you run docker-compose, variables referenced in your docker-compose.yml file are substituted with the values from that file.

Otherwise, you could just pass environment variables from the host, as you can with Docker.
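A minimal sketch of how the substitution works (the variable name is an assumption, taken from the earlier examples): docker-compose reads KEY=value pairs from a .env file in the same directory and fills in ${KEY} references:

```yaml
# .env (kept out of Git):
#   SECRET_KEY=my-secret-key

# docker-compose.yml:
services:
  microblog:
    environment:
      - SECRET_KEY=${SECRET_KEY}   # filled in from .env when compose runs
```

The compose file itself stays free of secrets and can be committed safely.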

Don’t use the :latest tag for deployments.

In the tutorial, images are built and containers run using the :latest Docker tag.

That’s okay for a development environment, but you’ll have a bad time relying on the :latest tag for anything resembling deployment.

:latest tends to cause confusion, issues and unnecessary work. You can read more about it here.

Instead, a better approach would be to tag your Docker images with the Git commit hash, to make sure that they are unique, reproducible and don’t get the chance to cause mischief in your deployment pipeline.
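As one possible way to do this (the image name comes from the tutorial, the tagging scheme is just a common approach):

```shell
# tag the image with the short hash of the current Git commit,
# producing a unique, reproducible tag like microblog:a1b2c3d
docker build -t "microblog:$(git rev-parse --short HEAD)" .
```

Every build gets its own tag, so you always know exactly which code a running container was built from.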

You might want to look into automating deployment workflows.

Manual tasks which you’re performing frequently should be automated.

Especially tasks around your deployment processes. It’s easy to forget a step, or make a mistake. It’s easy to write a script which won’t.

One of the best things about Docker is that it’s a great building block for automated deployment workflows.

If you’re starting out, take a look at Fabric. See if you’re doing some tasks manually (even on remote SSH machines) which could be automated with a simple script.

Functional automation is a blessing when you’re getting into deployment tasks, and a good first step towards building a deployment pipeline down the line. You can read more about building a deployment pipeline in this article.
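As an illustration, a recurring “build, tag and push” task could live in a small script like this sketch - the registry address is a hypothetical placeholder, and the tagging scheme is just one option:

```shell
#! /bin/sh
set -eu

# hypothetical registry address - replace with your own
REGISTRY="registry.example.com"
TAG=$(git rev-parse --short HEAD)

docker build -t "${REGISTRY}/microblog:${TAG}" .
docker push "${REGISTRY}/microblog:${TAG}"
```

It’s a handful of lines, but it never forgets a step and always tags consistently.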

Bonus: consider using Docker for your development environment

This one isn’t a hidden issue, but rather another way to use Docker, which wasn’t apparent from the tutorial.

Spinning up backing services in Docker containers for your development environment is pretty neat.

You can write a simple docker-compose.yml file, to make sure that your database of choice, Redis, and any other service you need will be available with a single command.

This way, you can work on multiple projects with different backing services on your machine, without them getting in the way of each other.
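Such a docker-compose.yml for development backing services might look like this sketch - Postgres and Redis are example choices here, and the versions and credentials are assumptions:

```yaml
services:
  db:
    image: postgres:11-alpine
    environment:
      - POSTGRES_PASSWORD=devpassword   # fine for local development only
    ports:
      - "5432:5432"    # reachable from the app running on the host

  redis:
    image: redis:5-alpine
    ports:
      - "6379:6379"
```

One `docker-compose up -d` and your app, running normally on the host, has everything it needs.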

One thing I’d advise against is trying to dockerize the actual app you’re developing. While great for deployments, developing inside of Docker containers is not a good fit for most people.

It’s more likely to make your workflows less fluent, take up time which is better spent on other tasks, introduce unnecessary delays if done incorrectly, and cause complications when using other tools, while not contributing much in return.

That’s it! On To Next Steps :)

Alright! This was a really long list, looking into the details of Docker usage in ‘The Flask Mega-Tutorial’ by Miguel Grinberg.

I hope you’ve seen one or two things which you were missing until now, and will be able to use the information to learn faster and improve the way you use Docker!

If you want to learn more about working with Docker, deployment pipelines and similar topics, here’s a list of cool resources you can check out next: