Swarm mode and the docker service command, introduced in Docker 1.12.0, aim to be a good tool for scaling your application. One of the nice features promised is zero-downtime deployments, which I’m going to try out in this post.
UPD: Unfortunately, recent Docker versions keep forwarding new connections to removed service tasks
after their containers have already received SIGTERM and started a graceful shutdown,
which makes zero-downtime rolling updates impossible, at least without an external load balancer.
There are a few related issues (still open) whose resolution should fix this:
You can subscribe to the issue notifications to get updates; meanwhile, get familiar with the deployment process …
Application
First, our mission-critical application needs to be packaged in a Docker image, and it should respond to the SIGTERM signal by shutting down gracefully in a timely manner.
Bootstrap
I’ll use a basic Express.js app generated with express-generator, which just returns a plain-text response on its root path (the source code can be found on GitHub).
Build the image and push it to Docker Hub (or your private registry):
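As a minimal sketch of this step, the two commands can be wrapped in a small helper so a deploy script can reuse them. The image name `example/zero-downtime-app` and the tag `1.0.0` used below are hypothetical placeholders, and the helper assumes a Dockerfile in the current directory.

```shell
# Build an image from the Dockerfile in the current directory and push it.
# $1: image repository (hypothetical, e.g. example/zero-downtime-app)
# $2: tag (hypothetical, e.g. 1.0.0)
build_and_push() {
  # Push only if the build succeeded.
  docker build -t "${1}:${2}" . && docker push "${1}:${2}"
}
```

It would be called as `build_and_push example/zero-downtime-app 1.0.0` after a `docker login` to the target registry.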
To make sure it works, run the app with Docker (don’t forget to stop the previously running Node application first; Ctrl+C may help with this), and then stop the container with docker stop:
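A rough sketch of that check, assuming the app listens on port 3000 inside the container and using a hypothetical container name passed in by the caller. `docker stop` sends SIGTERM first and falls back to SIGKILL only after the timeout, so a container that exits well before the 10 seconds used here is handling SIGTERM as intended.

```shell
# Start the app in the background and publish port 3000 on the host.
# $1: image reference, $2: container name (both placeholders)
run_app() {
  docker run -d --name "$2" -p 3000:3000 "$1"
}

# Stop the container: SIGTERM first, SIGKILL after 10 seconds,
# then remove the stopped container so the name can be reused.
# $1: container name
stop_app() {
  docker stop -t 10 "$1" && docker rm "$1"
}
```

For example: `run_app example/zero-downtime-app:1.0.0 app-test`, then `curl --fail http://localhost:3000`, then `stop_app app-test`.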
Our benchmark looks pretty similar at this step too:
Docker Service
The next thing to do is prepare our cluster. If you haven’t done so yet, initialize a swarm cluster:
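One way to make this step safe to repeat, sketched under the assumption of a single-node setup: ask `docker info` for the local swarm state and initialize only when the node is not already part of a swarm. On hosts with several network interfaces, `docker swarm init` may additionally need `--advertise-addr`.

```shell
# Initialize a swarm on this node unless it is already part of one.
init_swarm() {
  # LocalNodeState is "active" once the node has joined a swarm.
  state="$(docker info --format '{{.Swarm.LocalNodeState}}')"
  if [ "$state" != "active" ]; then
    docker swarm init
  fi
}
```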
Create a network for the service:
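A minimal sketch, assuming an overlay network is what is wanted here (that is the driver swarm services normally use to talk across nodes). The network name is a hypothetical placeholder supplied by the caller.

```shell
# Create an overlay network that swarm services can attach to.
# $1: network name (placeholder, e.g. app-net)
create_network() {
  docker network create --driver overlay "$1"
}
```

For example: `create_network app-net`.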
Create the service and scale it to 6 instances:
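Sketched as two small helpers, assuming the app listens on port 3000 and that the service name, network name and image reference are placeholders chosen by you:

```shell
# Create a service attached to a network and publish port 3000.
# $1: service name, $2: network name, $3: image reference
create_service() {
  docker service create --name "$1" --network "$2" --publish 3000:3000 "$3"
}

# Set the number of running tasks (replicas) of a service.
# $1: service name, $2: desired replica count
scale_service() {
  docker service scale "$1=$2"
}
```

For example: `create_service app app-net example/zero-downtime-app:1.0.0` followed by `scale_service app 6`. `docker service ls` then shows how many replicas are up.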
Now let’s try to scale the service down to 2 instances while putting it under load (run both commands at the same time):
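If you don’t have a benchmarking tool at hand, a crude stand-in is a loop of sequential curl requests that counts failures. This is only a sketch and not the tool behind the results in this post: it sends one request at a time, so it generates far less load than a real benchmark, but it is enough to notice refused or failed connections during scaling.

```shell
# Send sequential requests and print how many of them failed.
# $1: URL to request, $2: number of requests
count_failures() {
  failed=0
  i=0
  while [ "$i" -lt "$2" ]; do
    # --fail makes curl exit non-zero on HTTP errors as well as on
    # connection errors; the response body is discarded.
    curl --silent --fail --output /dev/null "$1" || failed=$((failed + 1))
    i=$((i + 1))
  done
  echo "$failed"
}
```

Start `count_failures http://localhost:3000/ 1000` in one terminal and run `docker service scale app=2` (with `app` being your service name) in another while the loop is still going. A result of 0 means no request failed.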
Pretty good, all requests succeeded. Let’s scale the service back up:
Oops, it looks like containers are added to the service before the Node socket binding is ready. This can probably be fixed with another great feature released with Docker 1.12, HEALTHCHECK (it can also be added at image build time).
To check that the service is up, we’ll use a simple curl command: curl --fail http://localhost:3000 (curl is available by default in the Node Docker images, at least in the one used for our app; I didn’t check the slim versions).
Re-create our service with health check instructions:
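A sketch of what this can look like on Docker versions where `docker service create` accepts the `--health-*` flags; on versions without them, a HEALTHCHECK instruction in the Dockerfile serves the same purpose. The interval, timeout and retry values below are illustrative assumptions, not tuned recommendations, and the names are placeholders.

```shell
# Create a 6-replica service whose tasks are probed with curl.
# $1: service name, $2: network name, $3: image reference
create_service_with_healthcheck() {
  # The health command runs inside the container; a non-zero exit
  # status marks the probe as failed.
  docker service create \
    --name "$1" --network "$2" --publish 3000:3000 --replicas 6 \
    --health-cmd "curl --fail http://localhost:3000 || exit 1" \
    --health-interval 5s --health-timeout 3s --health-retries 3 \
    "$3"
}
```

Remove the old service first with `docker service rm app`, then call `create_service_with_healthcheck app app-net example/zero-downtime-app:1.0.0`.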
And benchmark again:
While scaling it up and down:
Tadaaa! All requests succeed.
For the deploy process we will use docker service update, which technically does the same thing: it scales the service down and back up with new options, e.g. --image.
Create a new version of our app:
And finally deploy:
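In its simplest form the deploy is a single `docker service update` call that points the service at the new image. The sketch below assumes the new version has already been built and pushed under a hypothetical tag such as `1.0.1`.

```shell
# Roll a service over to a new image.
# $1: service name, $2: new image reference
deploy() {
  docker service update --image "$2" "$1"
}
```

For example: `deploy app example/zero-downtime-app:1.0.1`. While the update runs, `docker service ps app` lists old tasks shutting down and new ones starting.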
While benchmarking:
Conclusion
Health checks are a very important part of the deploy process and let us control exactly when a container is ready to be added to the swarm. Of course, you should tune the --health-* parameters for your requirements and ideally create a separate endpoint that encapsulates all your health check logic. During a deploy you need at least one container running at all times, or more if one cannot handle the load, so please also look into the --update-parallelism, --update-delay and --stop-grace-period (for example, 5s) parameters in more depth.
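To make those rollout parameters concrete, here is a sketch that sets them on an existing service. The values are assumptions for a 6-replica service: 2 tasks replaced per batch, a 10 second pause between batches, and 5 seconds for a task to finish in-flight requests after SIGTERM before it is killed.

```shell
# Adjust how rolling updates are performed for a service.
# $1: service name
tune_rollout() {
  # parallelism: tasks updated at once; delay: pause between batches;
  # stop-grace-period: time between SIGTERM and SIGKILL for a task.
  docker service update \
    --update-parallelism 2 \
    --update-delay 10s \
    --stop-grace-period 5s \
    "$1"
}
```

With 6 replicas and a parallelism of 2, at least 4 tasks keep serving traffic during each batch, provided the new tasks become healthy before the next batch starts.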
Useful links
Here are some useful links which I discovered while solving this problem.