Scale Testing Docker Swarm to 30,000 Containers

Andrea Luzzardi

Nov 16 2015

1,000 nodes, 30,000 containers, 1 Swarm manager

Swarm is the easiest way to run Docker apps in production. It lets you take an app that you’ve built in development and deploy it across a cluster of servers. Recently we took Swarm out of beta and released version 1.0. It’s being used by O’Reilly for building authoring tools, by the Distributed Systems Group at Eurecom for scientific research, and by Rackspace, which built its new container service, Carina, on top of it.

But there’s an important thing that Swarm needs to be able to do to take your apps to production: it needs to scale. We believed Swarm could scale up tremendously, so we looked around for a benchmark and found one here. We decided to recreate the Kubernetes test with Swarm. Like the team at Google, we wanted to make sure that as we launched more containers, Swarm would keep scheduling them quickly.

What did we measure?

We wanted to stress test a single Swarm manager to see how capable it would be, so we used one Swarm manager to manage all our nodes. We placed thirty containers on each node, for a total of 30,000. Each command was run 1,000 times against Swarm, and we generated percentiles for 1) API response time and 2) scheduling delay. We found that we were able to scale up to 1,000 nodes running 30,000 containers. 99% of the time, each container took less than half a second to launch, and there was no noticeable difference in the launch time of the 1st and the 30,000th container.

We used docker info to measure API response time, and then used docker run -dit ubuntu bash to measure scheduling delay.
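The measurement loop itself is simple to sketch. Below is an illustrative shell version (not the actual swarm-bench code): it times each run, records elapsed milliseconds, and pulls percentiles out with sort and awk. The `true` command stands in for the `docker run -dit ubuntu bash` call so the sketch runs without a cluster.

```shell
#!/bin/sh
# Illustrative benchmark loop (not the real swarm-bench harness).
# In the actual test the timed command was `docker run -dit ubuntu bash`
# against the Swarm manager; `true` is a stand-in so this runs anywhere.
: > delays.txt
i=0
while [ "$i" -lt 100 ]; do                  # the real run used 1,000 iterations
    start=$(date +%s%N)                     # nanoseconds (GNU date)
    true                                    # stand-in for: docker run -dit ubuntu bash
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 )) >> delays.txt   # elapsed ms
    i=$((i + 1))
done

# percentile <p> <file>: the value at the p-th percentile of a file of latencies.
percentile() { sort -n "$2" | awk -v q="$1" '{v[NR]=$1} END{i=int(NR*q/100); if(i<1)i=1; print v[i]}'; }

echo "p50=$(percentile 50 delays.txt)ms p90=$(percentile 90 delays.txt)ms p99=$(percentile 99 delays.txt)ms"
```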


• Discovery backend: Consul
• 1,000 nodes
• 30 containers per node
• Manager: AWS m4.xlarge (4 CPUs, 16GB RAM)
• Nodes: AWS t2.micro (1 CPU, 1 GB RAM)
• Container image: Ubuntu 14.04
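For context, a standalone Swarm cluster with Consul discovery was typically brought up along these lines (a hedged sketch based on Swarm v1.0-era usage, not our exact commands; the <consul_ip>, <node_ip>, and <manager_ip> placeholders are illustrative):

```shell
# Sketch of the cluster bring-up; adapt addresses and ports to your setup.
# Start the Swarm manager, pointing it at the Consul discovery backend:
docker run -d -p 4000:4000 swarm manage -H :4000 consul://<consul_ip>:8500

# On each node, join the cluster by advertising the node's
# Docker daemon address to the same Consul backend:
docker run -d swarm join --advertise=<node_ip>:2375 consul://<consul_ip>:8500

# Point the Docker client at the manager to run the benchmark commands:
docker -H <manager_ip>:4000 info
docker -H <manager_ip>:4000 run -dit ubuntu bash
```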


Percentile   API Response Time   Scheduling Delay
50th         150ms               230ms
90th         200ms               250ms
99th         360ms               400ms

We’ll continue to test Swarm, pushing its limits and using the results to harden it. If you want to test it out yourself, we’ve released our swarm-bench code on GitHub. If you want to learn more about Swarm, check out the documentation, and go to our forums if you need any help.





15 thoughts on “Scale Testing Docker Swarm to 30,000 Containers”

  1. Great!
    So, Docker scales really well, close to linear scaling 😉

  2. Maybe someone could explain how I can share this Swarm manager between multiple users on my team? I mean, all of them want to push some containers out there. docker-machine seems unable to use a custom ssh key, or ssh keys from a DigitalOcean profile, and sharing the ssh key generated under ~/.docker is insane. Any solution for that?

  3. I was at DockerCon and saw this demonstrated with a great visual representation of all the nodes and containers in the Swarm cluster in a circle. I was wondering what this visualisation was and if we could use it in some of our demos. It may have been Mesosphere but I’m looking for something for my own on premise Swarm.

    • Martin,

      I agree having a nice visual to demonstrate Swarm to a less technical audience would be great.

      Any news, Andrea?

  4. I don’t see any information on how many requests were handled in parallel, or how long the whole run took overall. Would it be possible to provide this information as well?

  5. Creating a new EC2 node on Amazon takes a long time compared to running a new container, so how did you manage creating the EC2 instances?

  6. Hi Andrea,

    I was watching the video on Docker Swarm Website and I noticed a simple Web UI to display docker active nodes and containers. It was called “swarm-master-demo:3000”.

    Is this page available some where in GitHub?
    Many thanks,


  7. I’m using the swarm-bench tool. How can I get containers scheduled onto other nodes? It seems to only start containers on the node I run it from.

  8. Allen McPherson

    Nice work, but those t2.micros are not real nodes. My understanding is that they are "nodes" (virtual) mapped onto real nodes (hardware). So, given a not-unreasonable 64-core node, 1,000 "nodes" would consume only ~16 hardware nodes, with the attendant savings in network traffic, etc.

    It would be interesting to see how things scale on 1000 real nodes. Of course, you'd need millions of containers for that test.

    Anyway, this is really cool stuff!
