Introduction to container orchestration

A single container host by itself is like a tree without a forest.

That lone host gives you a useful way to abstract your development and production environments from what’s underneath. But on its own, it gives you only a glimpse of what containers promise.

The big stories in the container world are now all about orchestration. Docker Swarm, Kubernetes, Mesos/Marathon and others offer frameworks for managing container hosts in concert.

But where to start?

In this guide we’ll look at what each of the major container orchestration tools has to offer and why you’d choose one or the other.

What is container orchestration?

For the longest time, deploying an application into production was as much ritual as it was science.

Deployment involved ugly bash scripts with as many if statements as there were corner cases, workarounds and “don’t ask why, it just has to be like that” situations. Coordinating it all was a gnarled sysadmin, and maybe a DBA, who’d jealously guard and devotedly follow the rites required to get code into production.

Then came Chef, Puppet, Ansible and continuous integration and deployment. They made it easy to standardise testing and deployment. Importantly, once in place, they let developers and devops people forget about the details of what needs to happen.

Similarly, containers allow us to standardise the environment and abstract away the specifics of the underlying operating system and hardware. You can think of container orchestration as doing the same job for the data centre: it allows us the freedom not to think about what server will host a particular container or how that container will be started, monitored and killed.

Container orchestration is the big fight of the moment. While the container format itself is largely settled, for now, the real differentiation is in how to deploy and manage those containers.

How to evaluate container orchestration solutions (Docker Swarm vs Kubernetes vs Mesos and Marathon)

Even though they all do “container orchestration”, each solution’s approach and features vary enough that comparing them is less a like-for-like checklist and more a matter of Venn-diagram-style overlap.

The three key differentiators to look out for are:

  • Level of abstraction: does it deal in containers or services that happen to be container-based?
  • Tooling: how do you manage the orchestration and how well does it integrate with other services?
  • Architecture: how does it cope at scale and in the face of failure?

It’s also worth considering how opinionated each framework is: are you willing to buy into an entire philosophy or do you want something that will unobtrusively support your existing approach?

Docker Swarm

Swarm is Docker’s own container orchestration tool. It uses the standard Docker API and networking, making it easy to drop into an environment where you’re already working with Docker containers.

Key concepts

A Swarm consists of manager and worker nodes that run services. Let’s look at what each of them does:

  • Managers: distribute tasks across the cluster, with one manager orchestrating the worker nodes that make up the swarm.
  • Workers: run Docker containers assigned to them by a manager.
  • Services: an interface to a particular set of Docker containers running across the swarm.
  • Tasks: the individual Docker containers, running the image and commands that a particular service needs.
  • Key-value store: etcd, Consul or ZooKeeper storing the swarm’s state and providing service discovery.
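
To see how those concepts fit together in practice, here’s a minimal sketch, run from a manager node of an existing swarm; the service name and image are purely illustrative:

    # list the nodes (managers and workers) that make up the swarm
    docker node ls

    # create a service; Swarm schedules its tasks (containers) onto the workers
    docker service create --name web --replicas 3 nginx:alpine

    # list the tasks backing the service and the node each one is running on
    docker service ps web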

Things to look out for

Swarm is remarkably straightforward to set up: use Docker Engine to initialise the manager node(s) and it provides you with a command you can run on each worker node to add it to the cluster.
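
In rough terms, that process looks like this (the address is a placeholder, and the exact join command, including its token, is printed for you by docker swarm init):

    # on the first manager node: initialise the swarm
    docker swarm init --advertise-addr 192.168.99.100

    # on each worker node: join the swarm using the command printed by init
    docker swarm join --token <worker-token> 192.168.99.100:2377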

Once the swarm is running, you specify services using Docker Compose. When you bring those services up, they are deployed across the hosts of the swarm rather than on a single host. Any networks you define will also work across the swarm.
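
Here’s a minimal sketch, assuming Docker 1.13 or later with swarm mode enabled; the stack, service and network names are illustrative:

    # docker-compose.yml describing a replicated service and an overlay network
    cat > docker-compose.yml <<'EOF'
    version: "3"
    services:
      web:
        image: nginx:alpine
        deploy:
          replicas: 3
        networks:
          - frontend
    networks:
      frontend:
        driver: overlay
    EOF

    # deploy the stack; its tasks are spread across the swarm rather than a single host
    docker stack deploy -c docker-compose.yml demo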

Swarm is somewhat modular: as mentioned above, you can swap out the key-value store and there are experiments in using Mesos as an alternative scheduler. However, it goes without saying that by using Swarm you are buying into the Docker way. For many that will be a plus, but it’s worth considering that Swarm is not only container-centric but Docker-centric.

When to use Docker Swarm

Docker Inc is investing a lot of effort into Swarm, so things are changing rapidly. Until now, at least, Swarm has been viewed as suitable for experiments and smaller scale deployments. That, perhaps, was due to the limitations of early Swarm releases, and it remains less proven than Kubernetes and Mesos.

Docker Swarm gives you an easy way to move into container orchestration without leaving behind the familiarity of existing Docker tools and thinking.

Kubernetes

Kubernetes is based on Google’s experience of running workloads at huge scale in production over the past fifteen years. It’s not an open sourcing of Borg, their internal container orchestration system, but draws on lessons Google have learned from running Borg.

Where Docker Swarm extends single-host Docker, Kubernetes’ starting point is the cluster itself.

Key concepts

If you’re considering Kubernetes, now’s the time to think about how willing you are to step away from the Docker way.

Here’s what makes up a Kubernetes cluster:

  • Master: by default, a single master handles API calls, assigns workloads and maintains configuration state.
  • Minions: the servers that run workloads and anything else that’s not on the master.
  • Pods: units of compute power, made up of one or a handful of containers deployed on the same host, that together perform a task, share a single IP address and have flat networking within the pod.
  • Services: the front end and load balancer for pods, providing a floating IP for access to the pods that power the service, meaning that changes can happen at the pod level while maintaining a stable interface.
  • Replication controllers: responsible for maintaining the desired number of replicas of the required pods.
  • Labels: key-value tags (e.g. “Name”) that you and the system use to identify pods, replication controllers and services.
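
To show how pods, replication controllers, services and labels fit together, here’s a minimal sketch, assuming kubectl is already configured to talk to your cluster; the names and image are illustrative:

    # a replication controller that keeps three nginx pods running,
    # labelled app=web so the service below can select them
    kubectl create -f - <<'EOF'
    apiVersion: v1
    kind: ReplicationController
    metadata:
      name: web
    spec:
      replicas: 3
      selector:
        app: web
      template:
        metadata:
          labels:
            app: web
        spec:
          containers:
          - name: nginx
            image: nginx:alpine
            ports:
            - containerPort: 80
    EOF

    # a service that gives those pods a single stable IP and load balances across them
    kubectl create -f - <<'EOF'
    apiVersion: v1
    kind: Service
    metadata:
      name: web
    spec:
      selector:
        app: web
      ports:
      - port: 80
    EOF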

Kubernetes starts big. Whereas Docker Swarm starts at the container and builds out, Kubernetes starts at the cluster and uses containers almost as an implementation detail.

Things to look out for

Already, you can see that Kubernetes diverges somewhat from what you might be used to with Docker alone. The CLI and API are different to Docker’s but the momentum behind Kubernetes means that there’s no shortage of language-specific libraries and other integrations.

Modularity is fundamental to the Kubernetes approach. For example, you can choose from Flannel, Weave, Calico and other networking options. You can also swap out the stock scheduler and use Mesos instead.

That modularity extends to the containers themselves. Kubernetes is not a Docker-only show. While that might be inconvenient if you’re already familiar with Docker tooling, it does leave open the possibility to use rkt and other container formats.

When to use Kubernetes

Kubernetes is well suited to medium to large clusters running complex applications. In particular, if you’re already thinking in terms of multiple sets of stateless microservices, then Kubernetes gives you a framework to establish the rules of interaction between them and then run the show.

For high availability, you’ll need to do some extra work to establish multiple masters.

If you’re running stateful apps such as a database then, until recently, you’d have needed to look elsewhere. Let’s put aside the debate over whether stateful apps in containers are a good idea: Kubernetes 1.3 introduces support for “pet”-like containers that have configuration dependencies and need stateful failover.

The learning curve and set-up will take more effort than Docker Swarm but the pay-off comes when you need the flexibility of its modularity and the suitability to larger scale deployments.

Mesos and Marathon

Apache Mesos pre-dates Docker and is described as a distributed systems kernel: in other words, it presents a single logical view of multiple computers. In more concrete terms, it’s a cluster manager that makes computing resources available to frameworks. It’s those frameworks that deal with the specifics of what runs on the Mesos cluster.

Marathon is one such framework and it specialises in running applications, including containers, on Mesos clusters.

Together, Mesos and Marathon offer an equivalent to Kubernetes, while also allowing you to run non-containerised workloads alongside containers.

Key concepts

Mesos operates two levels of scheduling: a Mesos slave reports to the master to say that it has free resources. Mesos then offers those resources to a framework, based on a previously set policy.

The lucky framework then offers those resources to whatever processes it looks after, again according to an appropriate policy.

Multiple frameworks can run alongside each other on top of Mesos, including Marathon for containers but also Storm or Spark. It’s even possible to use Marathon to deploy other frameworks, but let’s leave that to one side.

Here’s what makes up our Mesos cluster:

  • Masters: ZooKeeper manages a minimum of three master nodes and enables high availability by relying on a quorum amongst those nodes.
  • Slaves: these nodes run the tasks passed down by the framework.
  • Framework: Mesos itself knows nothing about the workloads, whereas specialist frameworks decide what to do with the resources offered to them by Mesos.

On top of Mesos, Marathon then offers a highly available framework delivering:

  • Service discovery: through a dedicated DNS service, as well as other options.
  • Load balancing: through HAProxy.
  • Constraint management: to control where in the cluster certain workloads run, maintain a set level of resources for those workloads, enable rack awareness and other constraints.
  • Applications: the long-running services you want to run on the cluster; these may be Docker containers but can also be other types of workload.
  • REST API: deploy, alter and destroy workloads.
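
As a minimal sketch of that REST API in use, here’s how you might deploy a Docker container through Marathon (the hostname, application id and image are illustrative):

    # marathon-web.json -- a minimal Marathon application definition
    cat > marathon-web.json <<'EOF'
    {
      "id": "/web",
      "cpus": 0.25,
      "mem": 128,
      "instances": 3,
      "container": {
        "type": "DOCKER",
        "docker": {
          "image": "nginx:alpine",
          "network": "BRIDGE",
          "portMappings": [{ "containerPort": 80, "hostPort": 0 }]
        }
      }
    }
    EOF

    # submit the application; Marathon schedules the instances across the cluster
    curl -X POST http://marathon.example.com:8080/v2/apps \
         -H "Content-Type: application/json" \
         -d @marathon-web.json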

Things to look out for

Both Mesos and Marathon are designed to scale to hundreds of thousands of nodes. If you have a relatively small cluster, then the resource and admin overhead of running such a large system may not be worth it. It also makes set-up more complex, especially when compared to Docker Swarm.

Working with Mesos and Marathon will also be somewhat different to either Swarm or Kubernetes. It’s another set of tooling and APIs to learn.

When to use Mesos and Marathon

Mesos is proven at tens of thousands of nodes in production. Project sponsor Mesosphere says that there are Marathon clusters with thousands of nodes in production. This is where both Mesos and Marathon gain from being older than either Kubernetes or Swarm: they’ve had time to knock off the rough edges and to be proven in long-running real-world situations at massive scale.

Mesos and Marathon are, perhaps, a good choice if you need to run non-containerised workloads alongside containers and you want the reassurance of something that is proven at very large scale. The trade-off versus Swarm is that it’s harder to get started and there’s a steeper learning curve. The trade-off against Kubernetes is partly a matter of personal appetite for risk, but bear in mind that Kubernetes was designed from the start as a container orchestration tool.

Conclusion

The three main options right now vary considerably in implementation and in how you interact with them. If you take away just one thing to help you pick between Docker Swarm, Kubernetes and Mesos with Marathon, it should be that each solution comes with its own trade-offs. Docker’s Swarm gives you the easiest route into orchestrating a cluster of Docker hosts. Kubernetes is container-centric but focuses less on the containers themselves and more on deploying and managing services. Mesos with Marathon promises huge scale but introduces additional complexity.

There are other options out there too. For example, Rancher and Nomad both offer different takes on service orchestration.

Ultimately, which option you take will depend on the scale you want to achieve and which ecosystem you feel most comfortable in.