YAML for Docker Compose and Kubernetes Config Files
When you’re getting into the Docker ecosystem, or are starting to work with Kubernetes, you’ll get to look at lots of files in YAML format. The YAML format is meant to be human-readable and convenient to type. And it is. It’s the popular kid compared to JSON or XML formats for a reason.
It’s also pretty versatile, so you can express stuff in many different ways. When you read docs, you’ll see people refer to it as YML, and files can end in either “.yml” or “.yaml”. All through this format you’ll find lots of small variations, and that’s totally cool and accepted.
I tend to be confused by this kind of plenty. My curiosity be damned. It took me a while to adapt, learn about some of the quirks and stop going “huh” every time I had to deal with a file written by yet another author. I took some time to compile a non-exhaustive collection of stuff which is good to know, general observations and things I got used to, which might be a bit confusing to somebody starting out. Some of those can be considered pet-peeves.
I hope that seeing them in a controlled environment, all in one place, will help you and save you some time and Google lookups in the future.
As supporting evidence and for further investigation on your own, you can check out the YAML spec or the Docker Compose file reference. The example compose file in particular is an interesting specimen.
The File Ending
Let’s start simple and obvious. “.yml” and “.yaml” both work. The first one is more prevalent. Source: this Stack Overflow discussion.
Tabs And Spaces
Coming from Python, I always prefer spaces everywhere. Except for Makefiles, and other occasions where tabs are socially required and you can’t get around them. In YAML the choice is made for you: the spec does not allow tab characters for indentation, and mixing tabs and spaces can result in some VERY weird errors and issues when working with YAML config files. Stick to spaces for indents, and be zealous about it.
(Thanks to /u/zerotimestatechamp for pointing this one out!)
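If you want to check a file before a parser yells at you, something like the following can help. It's a minimal sketch using only the Python standard library; the function name `find_tab_indents` and the sample text are made up for illustration.

```python
def find_tab_indents(text):
    """Return the 1-based numbers of lines whose indentation contains a tab."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Everything before the first non-blank character is the indent.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            bad.append(number)
    return bad


# Hypothetical compose snippet: the third line is indented with a tab.
sample = "services:\n  web:\n\timage: nginx\n"
print(find_tab_indents(sample))  # [3]
```

Tabs inside a value are left alone here; only the leading whitespace is inspected.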
The Strings
version: "3"
version: '3'
# multiline string
# most new lines get replaced by spaces
annot: >
  a string written
  in folded style
# will become "a string written in folded style\n"
Both work, and for most practical purposes the two ways of writing strings are equivalent. The difference only shows once weirdness starts to occur. You see, double-quoted strings are capable of expressing arbitrary strings through escape sequences, while single-quoted strings take backslashes literally.
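Here is a rough sketch of that difference in Python, without a real YAML parser. It assumes that `json.loads` is a fair stand-in for double-quoted scalars (the common escapes such as `\t`, `\n` and `\"` look the same in JSON), and that single-quoted scalars only treat a doubled `''` as special. Both helper names are made up.

```python
import json


def read_double_quoted(scalar):
    """Decode a double-quoted scalar; json.loads stands in for a YAML parser."""
    return json.loads(scalar)


def read_single_quoted(scalar):
    """Decode a single-quoted scalar: backslashes stay literal, '' is one quote."""
    return scalar[1:-1].replace("''", "'")


# The same characters between the quotes, two different results.
print(repr(read_double_quoted(r'"tab\there"')))  # 'tab\there'  (a real tab)
print(repr(read_single_quoted(r"'tab\there'")))  # 'tab\\there' (backslash kept)
```

So if a value contains backslashes you want to keep, for example a regex or a Windows path, single quotes are the less surprising choice.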
Keys, Values and Blocks
YAML files are made up of keys which are used to access assigned values.
A key can have a single value like an integer (5), a string (“hi”), a list (“hi”, “there”), or a dictionary (set of key-value mappings).
Key names are followed by a colon. Indentation matters. Lines indented below a key make up the content of that key’s block. It’s a way of writing lists and dictionaries in a more readable fashion. Here are some examples of the values you’ll see in the wild:
# a single, boring value
immastring: "3"
immanint: 3
# a list
immalist:
  - "hi"
  - "there"
# also a list
immalist2: ["hi", "there"]
# a dictionary
immadict:
  akey: "a value"
  anotherkey: "another value"
# also a dictionary
immadict2: {akey: "a value", anotherkey: "another value"}
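To see that both spellings describe the same data, here is a small sketch in Python. It leans on the fact that the inline (flow) style, when every key and string is quoted, is also valid JSON, so the standard library can read it. The block style needs a real YAML parser, so it's written out by hand as the structure I'd expect.

```python
import json

# Flow style with quoted keys and strings doubles as JSON.
flow = '{"immalist": ["hi", "there"], "immadict": {"akey": "a value"}}'
data = json.loads(flow)

# What the indented block style describes, written as Python literals.
expected = {
    "immalist": ["hi", "there"],
    "immadict": {"akey": "a value"},
}

print(data == expected)          # True
print(data["immadict"]["akey"])  # a value
```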
Those can be combined, which I found really confusing in the beginning, because at first glance the whole spacing thing seems to be broken. It looks weird. Behold, a list of dictionaries:
weirdlist:
  - key1: "hi"
    key2: "there"
  - key1: "hi2"
    key2: "there2"
# You could write it like this
weirdlist: [{key1: "hi", key2: "there"}, {key1: "hi2", key2: "there2"}]
Here’s a list of lists, just for kicks:
listlist:
  - - oh
    - why
  - - please
    - stop
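If the dashes make your eyes cross, it can help to think about how you'd reach into the parsed result. A sketch, again using JSON-compatible flow style so that plain Python can read it (the variable names mirror the keys above):

```python
import json

# Flow-style spellings of the two nested examples, fully quoted so the
# text is also valid JSON.
weirdlist = json.loads(
    '[{"key1": "hi", "key2": "there"}, {"key1": "hi2", "key2": "there2"}]'
)
listlist = json.loads('[["oh", "why"], ["please", "stop"]]')

# Each dash opens one list item; a key after the dash starts a dictionary.
print(weirdlist[1]["key2"])  # there2
# Two dashes on one line: the first item of an inner list.
print(listlist[0][1])        # why
```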
Case Study: command in docker-compose.yml
Let’s get specific. The variations above are due to YAML itself. Docker Compose is even more flexible, and provides even more ways to do stuff. Not judging. All of those are handy sometimes.
command: bundle exec thin -p 3000
command: ["bundle", "exec", "thin", "-p", "3000"]
command:
  - bundle
  - exec
  - thin
  - -p
  - "3000"  # quoted, otherwise YAML reads it as an integer
An important detail here is that those are not necessarily equivalent. You can have weird bugs in one notation, while another will work just fine. It depends on the semantics, but you will encounter all of the above ways of writing a command.
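To get a feeling for where the notations can drift apart, here is a sketch in Python. It assumes the string form gets split into arguments with shell-like rules, which is what `shlex.split` does; I haven't confirmed that every Compose version does exactly this, so treat it as an approximation.

```python
import shlex

string_form = "bundle exec thin -p 3000"
list_form = ["bundle", "exec", "thin", "-p", "3000"]

# Shell-style splitting turns the string form into the same argument list.
print(shlex.split(string_form) == list_form)  # True

# Quotes inside the string form group words. In list form the grouped
# words have to be written as one single item instead.
print(shlex.split('sh -c "echo hi there"'))  # ['sh', '-c', 'echo hi there']

# A bare 3000 as a YAML list item is an integer, not the string "3000".
yaml_list_item = 3000
print(yaml_list_item == list_form[-1])  # False
```

The last comparison is the kind of thing that bites: the list notation hands every item through YAML's type guessing, while the string notation stays one string until it is split.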
Case Study: labels in docker-compose.yml
One more example is the different ways to set labels. It’s taken directly from this Docker Compose reference section.
labels:
  com.example.description: "Accounting webapp"
  com.example.department: "Finance"
  com.example.label-with-empty-value: ""
labels:
  - "com.example.description=Accounting webapp"
  - "com.example.department=Finance"
  - "com.example.label-with-empty-value"
You see the pattern here.
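The pattern, spelled out as a sketch: each list item is split at the first "=", and an item without "=" gets an empty value. The helper `labels_to_dict` is made up for illustration and is not how Compose is actually implemented.

```python
def labels_to_dict(items):
    """Turn ["key=value", "key"] into a dict; a bare key gets an empty value."""
    result = {}
    for item in items:
        # partition splits at the first "=" only, so values may contain "=".
        key, _, value = item.partition("=")
        result[key] = value
    return result


list_form = [
    "com.example.description=Accounting webapp",
    "com.example.department=Finance",
    "com.example.label-with-empty-value",
]
dict_form = {
    "com.example.description": "Accounting webapp",
    "com.example.department": "Finance",
    "com.example.label-with-empty-value": "",
}
print(labels_to_dict(list_form) == dict_form)  # True
```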
Case Study: environment in docker-compose.yml
These two blocks do the same thing. One uses a dict, the other a list:
environment:
  hi: development
  there: 'true'
  # wow is taken from the host env
  wow:
environment:
  - hi=development
  - there=true
  - wow
You may have noticed that the ’true’ is quoted in the dict. That’s important. You have to quote true, false, yes, no, on, off in this case, otherwise the YAML parser turns them into booleans instead of keeping them as strings.
If you had something like:
- BLA="hi"
it would be parsed as a single string, and the double quotes would not be stripped: they end up as part of the value.
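A sketch of how I picture the list form being read, to make the quoting behaviour visible. The helper `env_list_to_dict` and the `host_env` values are invented for the example; the real lookup would go against the environment of the machine running Compose.

```python
import os


def env_list_to_dict(items, host_env=None):
    """Split "KEY=value" items; a bare KEY is looked up in the host env."""
    host_env = os.environ if host_env is None else host_env
    result = {}
    for item in items:
        key, sep, value = item.partition("=")
        # No "=" at all means: take the value from the host environment.
        result[key] = value if sep else host_env.get(key)
    return result


parsed = env_list_to_dict(
    ["hi=development", "there=true", "wow", 'BLA="hi"'],
    host_env={"wow": "from-host"},  # hypothetical host environment
)
print(parsed["there"])     # true  (a plain string, no boolean involved)
print(parsed["wow"])       # from-host
print(parsed["BLA"])       # "hi"
print(len(parsed["BLA"]))  # 4, because both quote characters are kept
```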
What to Quote
A common opinion is that you don’t need quotes, except if you’re using special characters or for special terms. Source.
The answer given there, however, does not hold for docker-compose files, where colons work perfectly fine without quotes. You can also get away with not quoting “true” sometimes, despite the docs stating that you very much should (see here). Okay, so it’s a case-by-case thing I guess?
It’s not a major issue, despite the confusion. Most things work as intended, and if not you notice quickly. The multiple layers of parsing and interpretation make it a bit non-intuitive. The information here is not quite conclusive. From the look of it, you’re on the safe side quoting stuff you don’t feel comfortable with. But you can always give it a go and see if there’s a problem.
Here are the results of me looking for some patterns of what to quote, with limited success:
NOPE:
  referring to other labels
  numbers. Except I've seen floating points quoted - is there a precision reason?
    (cpus: '0.001')
  durations never are
    10s
  paths
    image: dockersamples/visualizer:stable
  # despite special characters "[]=", those below work without quotes
  constraints: [node.role == manager]
  constraints:
    - node.role == manager
    - engine.labels.operatingsystem == ubuntu 14.04
YEP:
  boolean values: true, false, yes, no, on, off (but not always)
  weird strings with an escape sequence (sure about this one, whoo!)
SOMETIMES:
  volumes
    - "/var/run/docker.sock:/var/run/docker.sock"
    - /var/lib/mysql
  ports and mappings
    "8080:8080"
    5001:80
    "6379"
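Part of the "sometimes" seems to come from YAML 1.1 type guessing: besides the boolean words, an unquoted value like xx:yy can be read as a base-60 number when the part after the colon is below 60, which is the reason the Compose docs recommend quoting port mappings. Here is a rough sketch of such a check. The function `needs_quotes` is made up, it only covers these two cases, and whether it applies depends on the parser following YAML 1.1 rules.

```python
import re

# The boolean words from the list above (compared case-insensitively).
BOOL_WORDS = {"true", "false", "yes", "no", "on", "off"}

# YAML 1.1 base-60 integers: digit groups separated by ":", where every
# group after the first is at most 59.
SEXAGESIMAL = re.compile(r"[-+]?[1-9][0-9_]*(:[0-5]?[0-9])+")


def needs_quotes(plain):
    """Guess whether an unquoted scalar would stop being a string."""
    if plain.lower() in BOOL_WORDS:
        return True
    return SEXAGESIMAL.fullmatch(plain) is not None


print(needs_quotes("true"))     # True
print(needs_quotes("8080:22"))  # True, reads as a base-60 number
print(needs_quotes("5001:80"))  # False, 80 is too big for base 60
print(needs_quotes("10s"))      # False
```

That would explain why 5001:80 gets away without quotes while a mapping to a low container port does not.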
It gets weirder when you start using escape sequences, and bash variables. Let’s not go there here.
In Conclusion
So, that was a quick glimpse of different ways to structure information in YAML files when dealing with Docker Compose or Kubernetes config files.
There are different ways of doing things, and it can get confusing if you’re not aware that there are several possible ways to provide the same information, depending on the notation and content.
The above examples are not exhaustive. Also, it’s a non-judging “this is what you’ll find” kind of overview. By no means is any particular option suggested over another.
Hope that helps you spend less time looking stuff up, and get on with building cool, pragmatic infrastructure. Happy Dockering!