Docker is an open platform for developers and sysadmins to build, ship, and run distributed applications. For our purposes, Docker containers give us access to a jailed operating system that any of our services can run in with almost none of the overhead of a virtual machine.
The many uses of Docker are still being realized, and the broader community is still working out best practices. It is not used only by operations and development teams. For example, Melissa Gymrek wrote about using Docker to support reproducible research, using a PLOS Genetics article as a test case.
Some months back, I began using Docker on a few projects that required cross-team development. We liked Docker, but we were not yet ready to take it to production deployments. Around the same time, our team was without a QA engineer, so we had to set up our own testing plan. As we discussed it, Docker seemed like a perfect fit for what we needed. We created test environments that mimic production using Docker containers and found that Docker makes it quick and easy to narrow the gap between development and production.
Dockerfiles are used to define an exact template for what each application needs, from operating system to libraries to open ports. In the git repository for each of our projects we include a Dockerfile right along with the source code. This way anyone who pulls the project can get it up and running with a few simple commands. The user no longer needs to install all the dependencies manually and then hope it works on their own computer, since this will all be taken care of in the Docker container. This means getting up and running is fast since the host machine requires nothing installed but the Docker service itself.
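To make this concrete, a minimal Dockerfile for a JVM-based web service might look like the sketch below; the base image, artifact name, and port are assumptions for illustration, not details from any real project:

```dockerfile
# Illustrative Dockerfile -- base image, jar name, and port
# are assumptions, not a real project's values.
FROM java:8                                # operating system + Java runtime
COPY target/service.jar /opt/service.jar   # the application and its libraries
EXPOSE 8080                                # the port the API listens on
CMD ["java", "-jar", "/opt/service.jar"]   # how to start the service
```

With a file like this checked in next to the source, anyone can build and start the service with nothing more than docker build and docker run.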
Docker containers are very lightweight, so it is beneficial to run only one application per container. Since your application likely depends on other services (a database, a caching server, a message queue, etc.), you will need to run multiple containers to bring up a working stack. This is where Docker Compose comes in.
Docker Compose enables us to easily manage a collection of containers by defining them in a short YAML file. With a single ‘docker-compose up’, our entire stack is up and running in a few seconds. Here is a sample docker-compose.yml file:
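(The service names, images, and credentials in this sketch are placeholders rather than values from a real project; it uses the original Compose v1 format.)

```yaml
# Illustrative docker-compose.yml -- service names, images,
# and credentials are placeholders.
web:
  build: .             # built from the project's own Dockerfile
  ports:
    - "8080:8080"      # publish the API port to the host
  links:
    - database         # reachable from the web container as "database"
database:
  image: mysql:5.6
  environment:
    MYSQL_ROOT_PASSWORD: secret
    MYSQL_DATABASE: app
```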
Black Box Testing
Our web service container exposes port 8080 so it can be reached from the host. To run our API tests all we have to do is point them at that port, and we are testing the Dockerized API instead of a service running directly on the host. I wrote a few small scripts that orchestrate bringing those containers up and down as needed between various tests and now we have a very fast and comprehensive integration testing system.
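A minimal sketch of such an orchestration script, written for a POSIX shell; the test-runner path, wait time, and URL here are illustrative assumptions rather than the actual scripts:

```shell
#!/bin/sh
# Hypothetical orchestration helper -- the runner path, wait
# time, and URL are assumptions, not the real scripts.
run_api_tests() {
    compose="${DOCKER_COMPOSE:-docker-compose}"   # overridable for dry runs

    $compose up -d                # bring the whole stack up in the background
    sleep "${STARTUP_WAIT:-5}"    # crude wait for the web service to accept connections

    # Point the suite at the container's published port instead of a host service
    API_URL="http://localhost:8080" ${TEST_RUNNER:-./run_tests.sh}
    status=$?

    $compose stop                 # tear the stack down so the next
    $compose rm -f                # run starts from a clean slate
    return $status
}
```

A real version would poll the published port until the service answers instead of sleeping for a fixed time.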
You can use compose to bring up any number of instances of the services in your stack with something like:
> docker-compose scale web=10 database=3
This is great for testing things like database clusters or load balancers. I have seen hundreds of database containers run on a single laptop with no problem. Try doing that with a virtual machine.
It is easy to add very invasive automated tests here as well. Let’s say we want to automate the testing of different system configurations. We usually use MySQL, but this service can also work with a Postgres backend. This test can be added in a few minutes: just create a Postgres container (or find an existing one on Docker Hub) and modify our test script to run as follows:
- Bring up a new stack (with the MySQL backend), seed it with data, run the API tests, tear the stack down.
- Bring up a new stack (with the Postgres backend), seed it with data, run the API tests, tear the stack down.
- Report any differences in test results between the MySQL and Postgres backends.
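The steps above might be sketched like this, assuming one Compose file per backend; the compose file names, seed script, and test command are hypothetical:

```shell
#!/bin/sh
# Hypothetical comparison run -- compose file names, the seed
# script, and the test command are assumptions, not real paths.
run_suite() {   # $1 = compose file, $2 = results file
    compose="${DOCKER_COMPOSE:-docker-compose} -f $1"
    $compose up -d                          # bring up the stack for this backend
    ${SEED_CMD:-./seed_data.sh}             # seed it with data
    ${TEST_CMD:-./run_api_tests.sh} > "$2"  # run the API tests, capture results
    $compose stop                           # tear the stack down
    $compose rm -f
}

run_suite docker-compose.mysql.yml    results-mysql.txt
run_suite docker-compose.postgres.yml results-postgres.txt

# Report any differences between the two backends
diff results-mysql.txt results-postgres.txt
```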
In the past, some of our system tests used a different database technology than the production deployment. We used an in-memory H2 database since it is faster to bring up than a full database server. This caused problems: we had to manage additional test-only drivers, and H2’s compatibility mode for MySQL implements only part of the MySQL feature set. Since Docker allows us to quickly create production-like systems, we have removed H2 entirely and test directly against a MySQL container. With fewer dependencies, we have less code and fewer points of failure.
Now that our stack is Dockerized, anyone can easily run the tests. New developers can run tests against their new code right away, leading to fewer checked-in bugs. And QA can focus on tasks beyond regression testing.
Speaking of QA, we don’t need special servers that only QA has access to. Anyone with Docker installed can run not only the whole application stack but also the full test suite. The same minimal requirements apply to our continuous integration build server, and we don’t need to open any public-facing ports. When the build finishes, it brings up a whole stack, tests it inside the local Docker service, and tears it down. Did I mention the overhead of running these tests against a brand new full stack is only seconds?! I probably did. But it’s worth saying again.