Deploying containerized Docker instances in production?
Hello! After spending many development hours in my past years running on Virtualbox/Vagrant-style setups, I've decided to take the plunge into learning Docker, and after getting a few containers working, I'm now looking to figure out how to deploy this to production. I'm not a DevOps or infrastructure guy, my bread and butter is software, and although I've become significantly better at deploying & provisioning Linux VPS's, I'm still not entirely confident in my ability to deploy & manage such systems at scale and in production. But, I am now close to running my own business, so these requirements are suddenly going from "nice to have" to "critical".
As I mentioned, in the past when I've previously developed applications that have been pushed onto the web, I've tended to develop on my local machine, often with no specific configuration environment. If I did use an environment, it'd often be a Vagrant VM instance. From here, I'd push to GitHub, then from my VPS, pull down the changes, run any deployment scripts (recompile, restart nginx, etc), and I'm done.
I guess what I'm after with Docker is something that's more consistent between dev, testing, & prod, and is also more hands off in the deployment process. Yet, what I'm currently developing still does have differing configuration needs between dev and prod. For example, I'd like to use a hosted DB solution such as DigitalOcean Managed Databases in production, yet I'm totally fine using a Docker container for MySQL for local development. Is something like this possible? Does anyone have any recommendations around how to accomplish this, any dos and don'ts, or any catches that are worth mentioning?
How about automating deployment from GitHub to production? I've never touched any CI/CD tools in my life, yet I know it's a hugely important part of the process when dealing with software in production, especially software that has clients dependent on it to function. Does anything specifically work well with Docker? Or GitHub? Ideally I want to be avoiding manual processes where I have to ssh in, and pull down the latest changes, half-remembering the commands I need to write to recompile and run the application again.
It sounds like this is a one-person show, you've only got one or maybe two apps, and those apps have pretty static deployments? If I'm off the mark on any of those, my advice may be poor.
First and foremost: you don't need a container orchestration solution (Kubernetes, Openshift, Nomad, ECS, etc) at that scale. Container orchestration solutions come with learning curves, operation and maintenance overhead, and/or lock-in problems, and they solve issues you really don't have yet. You may not need an orchestration platform until you can also hire somebody else to manage it.
Here's a pretty simple, sustainable, vendor-agnostic way to run containers on a production server. Write a system service (systemd or whatever you have) that runs an arbitrary container; you can find some examples of this sort of thing easily and modify to purpose. Map whatever port your container uses to some high port on localhost. Then run a webserver with a reverse proxy to terminate TLS and forward to that container's port. If you put a docker pull in the start behavior of the service (like with an ExecStartPre for systemd), you can "deploy" your app by just restarting the service!
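A minimal sketch of the kind of unit described above. The image name registry.example.com/myapp:latest, the service name myapp, and the 8080-to-80 port mapping are all hypothetical placeholders. The script only writes the unit into a scratch directory; copying it to /etc/systemd/system and enabling it is left as a manual step.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical names; adjust to your own image, service name, and ports.
IMAGE="${IMAGE:-registry.example.com/myapp:latest}"
SERVICE="${SERVICE:-myapp}"
OUT_DIR="${OUT_DIR:-./deploy-sketch}"

mkdir -p "$OUT_DIR"

# Write the unit file. A leading "-" on a command tells systemd to
# ignore a non-zero exit status (e.g. no old container to remove).
cat > "$OUT_DIR/$SERVICE.service" <<EOF
[Unit]
Description=$SERVICE container
After=docker.service
Requires=docker.service

[Service]
Restart=always
# Fetch the newest image, then clear out any leftover container.
ExecStartPre=/usr/bin/docker pull $IMAGE
ExecStartPre=-/usr/bin/docker rm -f $SERVICE
# Publish container port 80 on localhost:8080 only, for the reverse proxy.
ExecStart=/usr/bin/docker run --rm --name $SERVICE -p 127.0.0.1:8080:80 $IMAGE
ExecStop=/usr/bin/docker stop $SERVICE

[Install]
WantedBy=multi-user.target
EOF

echo "wrote $OUT_DIR/$SERVICE.service"
```

With something like this installed, systemctl restart myapp would pull the current image and start a fresh container from it. Note that as written, a failed pull (say, the registry is unreachable) makes the service fail to start; whether that is the behavior you want is a judgment call.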
It's inevitable that your production and staging environments will be different from each other; you minimize the differences you can, and then just try to be aware of the rest. Docker is especially good at running dependency services for local development, so it's pretty normal to use a Docker-hosted Postgres for local development, and then a managed high-performance Postgres in production. Again, it's about minimizing differences – if you use MySQL locally but Postgres in production you will be sad, and you can even pin your development Postgres version to whatever you have in production, but you're not going to get a production performance profile unless you're duplicating your production environment.
I don't know that much about CI, but I would advise against Jenkins; although it's very easy to get set up, my experience is that it's a real time bomb (unless you really know how to manage it, which mostly comes from having previously shot yourself in the foot repeatedly). I believe there are several Docker-optimized CI systems out there, but I can't personally vouch for any of them.
Interesting, thanks! You've definitely got my use case correct; right now it's just myself and my business partner, who is not a developer. Orchestration seems like too big of a step right now, so I don't want to go down that road yet, but the systemd solution sounds very nice. I've done a bit of reading, and found this article and this one which appear to be the most helpful, however I've got a few questions about your suggestions which I need to wrap my head around...
By this, you mean having an entry like this in a docker-compose.yml file?
Except with this systemd solution, you're not using docker-compose, right? Management is at the container level with the docker CLI? So effectively you'd be mapping, say, 80 inside your container to 8080 on the VPS?
Say that webserver is nginx, is that reverse proxy also in a container, or is it deployed directly to the host?
docker pull, to my knowledge, only updates the image associated with the container, right? It doesn't update your application inside the container, so I'd still need a step to git pull the repository into the container?
Reasonable questions! I was pretty loose on some details, partly for lack of clarity on what you already knew, partly for plain laziness.
Pretty much; I was imagining docker run and assuming you'd only run single containers in production. I usually see docker-compose files used to orchestrate environments (your backing Postgres, backing Elasticsearch, and so on) for local development, and then all the equivalent production services are managed with something else, because they're not running in Docker outside of development. (docker-compose can be used in production environments; I've seen it done, but I don't know if it's actually a good idea.)
Eh. I would (and do) deploy it directly to the host, but I'm an ops guy and I have tooling for managing an nginx install. It might be fine to run nginx in a container, although I don't suggest baking your SSL certs into an image, and that means you have to do local disk mounts and at that point I don't know how much complexity Docker is saving you.
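For the host-installed option, a rough sketch of what the reverse proxy config might look like. The domain app.example.com, the Let's Encrypt style certificate paths, and the upstream port 8080 are assumptions for illustration only. The script writes the config into a scratch directory rather than touching /etc/nginx.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical values: domain, certificate paths, and upstream port.
DOMAIN="${DOMAIN:-app.example.com}"
UPSTREAM_PORT="${UPSTREAM_PORT:-8080}"
OUT_DIR="${OUT_DIR:-./deploy-sketch}"

mkdir -p "$OUT_DIR"

# \$ keeps nginx variables literal; an unescaped $ is expanded by the shell.
cat > "$OUT_DIR/$DOMAIN.conf" <<EOF
server {
    listen 443 ssl;
    server_name $DOMAIN;

    # Certificates live on the host, not inside any image.
    ssl_certificate     /etc/letsencrypt/live/$DOMAIN/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/$DOMAIN/privkey.pem;

    location / {
        # Forward to the container port published on localhost.
        proxy_pass http://127.0.0.1:$UPSTREAM_PORT;
        proxy_set_header Host \$host;
        proxy_set_header X-Real-IP \$remote_addr;
        proxy_set_header X-Forwarded-For \$proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto \$scheme;
    }
}
EOF

echo "wrote $OUT_DIR/$DOMAIN.conf"
```

Because TLS terminates here, the application container itself never needs to see the certificates.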
This question confuses me and makes me think we might have different ideas of what a Docker build workflow looks like. Yes, pull updates the image associated with the container (and you'd still need to recreate the container to get it to use the newer image; a plain docker restart keeps running the old one), but the updated image should have your updated application; that's the reason there's an update.
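Done by hand, picking up a new image means pulling it and then replacing the container, since an existing container stays tied to the image it was created from. A sketch of that sequence, with a hypothetical image and container name; it only prints the commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical image and container names.
IMAGE="${IMAGE:-registry.example.com/myapp:latest}"
NAME="${NAME:-myapp}"

# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Fetch the newer image.
run docker pull "$IMAGE"
# Remove the container created from the old image (fine if none exists).
run docker rm -f "$NAME" || true
# Start a fresh container from the image that was just pulled.
run docker run -d --name "$NAME" -p 127.0.0.1:8080:80 "$IMAGE"
```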
I guess I'm assuming that you're building images for your application (probably by using a Dockerfile that defines an upstream base image for whatever language your application needs, mounts your local git repo with the release version, and builds it, and then you'd push that image to a private registry). But maybe that's not what you're doing, in which case... what are you doing? What image would you pull to production, if not the image of your app?
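To make that workflow concrete, here is one possible shape for it. Everything specific is an assumption: the registry name, the Python base image, and the requirements.txt and app.py files stand in for whatever your stack actually uses. The script writes a Dockerfile into a scratch directory and prints the build and push commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical registry/image name; the tag is the current git commit if available.
IMAGE="${IMAGE:-registry.example.com/myapp}"
TAG="${TAG:-$(git rev-parse --short HEAD 2>/dev/null || echo dev)}"
OUT_DIR="${OUT_DIR:-./deploy-sketch}"

# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

mkdir -p "$OUT_DIR"

# Quoted 'EOF' so nothing in the Dockerfile is expanded by the shell.
cat > "$OUT_DIR/Dockerfile" <<'EOF'
# Hypothetical Python web app; swap the base image for your own stack.
FROM python:3.12-slim
WORKDIR /app
# Install dependencies first so this layer is cached between builds.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy the application code into the image instead of bind mounting it.
COPY . .
EXPOSE 80
CMD ["python", "app.py"]
EOF

# Build from the repository root, then publish to the private registry.
run docker build -t "$IMAGE:$TAG" -f "$OUT_DIR/Dockerfile" .
run docker push "$IMAGE:$TAG"
```

Tagging with the commit hash is a design choice, not a requirement; it makes it easy to tell which version of the code a running container came from.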
Great response, thank you. I did a bit of reading on the fundamentals of docker & CI/CD over the weekend so I'm probably in a bit of a better place to address this comment now.
From what I gather, docker compose in production isn't very common for the reasons you outline: often your databases exist in a managed environment, and you'll want to manage and deploy containers individually, so I'm not going to go down this route.
I'm still undecided on running nginx within a container; you raise a good point about having to bind mount the SSL cert directory adding additional complexity, but I'm tempted to agree with answers like this that it gives you a bit more flexibility in creating and destroying your containers & servers as necessary.
You're right. At the moment all I'm doing is just installing a base image and making some minor modifications to it; the actual contents of the application are being bind mounted into the container. But I think I understand now it'd be better to have a CI/CD server pull & create a custom base image from a private Docker repository, then have that custom image with the completed build be pulled down by the host as needed, which aligns with your comment of "a pull becomes an update of your app".
Does that sound about right?
You're also the second person to mention not to use Jenkins; so I think I'll heed this advice and use something like TeamCity.
I run a few different websites on one $10/mo Vultr server, which is basically the same as Digital Ocean. It's set up like this:
This server is also my CI server because I don't deploy often and high CPU usage for a few minutes to build a docker image is acceptable. I get reliable automatic deploys on commits to master for all of my projects, which has been nice for accepting pull requests from my phone and having them automatically deploy when I'm not near a computer. To achieve this, I:
Thanks! I haven't seen anyone else using docker-compose specifically; usually others seem to map a single systemd instance to a single docker container and disregard docker-compose. A few questions:
I don't think I need a separate CI server either, so I like your approach here.
:z part is important.
Definitely possible. You probably use either a connection string (mysql://...) or equivalent environment variables (host/username/password) to connect to the prod database, correct? You'll connect to the dockerized MySQL the same way. If you add -p 3306:3306 to your docker run command then the dockerized MySQL is available at localhost:3306 just as if it were installed outside Docker.
One of the greatest strengths of Docker is also one of the biggest caveats you should be aware of: by default everything in a container is thrown away when you docker rm it. This means you can trivially have a "fresh install" of MySQL for your app to use, but it does mean you'll need an extra step (volume-mounting the MySQL data directory to somewhere on your host) to have the data preserved. If you read the Docker Hub page for MySQL it has example commands you can use.
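Putting the port mapping and the data directory mount together, a local development MySQL might be started roughly like this. The container name, the throwaway root password, and the host data directory are placeholders; the environment variables and /var/lib/mysql path are the ones the official mysql image documents. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical names and a throwaway dev-only password.
NAME="${NAME:-myapp-mysql}"
ROOT_PASSWORD="${ROOT_PASSWORD:-devrootpassword}"
DATA_DIR="${DATA_DIR:-$PWD/mysql-data}"

# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

start_mysql() {
    # -p exposes MySQL on localhost:3306; -v keeps the data directory on
    # the host so it survives removing the container.
    run docker run -d --name "$NAME" \
        -e MYSQL_ROOT_PASSWORD="$ROOT_PASSWORD" \
        -e MYSQL_DATABASE=app \
        -p 3306:3306 \
        -v "$DATA_DIR:/var/lib/mysql" \
        mysql:8.0
}

start_mysql
```

Leaving off the -v line gives the disposable "fresh install" behavior instead, which can be handy for test runs.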
You'll read a lot about docker-compose as you research Docker. My suggestion would be to ignore it for now, learn the "raw" docker commands first, then if you want to use Docker Compose you'll understand what it's doing behind the scenes for you.
In your shoes, where your focus is running your business and you want the tech stack to "just work", I'd recommend Amazon RDS (their managed DB) plus ECS (their managed Docker hosting). Digital Ocean has managed DBs, but their managed Docker hosting is done with Kubernetes, which is probably more complex than you want to deal with.
The alternative to ECS would be to run your own Linux servers, install the Docker daemon on them, then run an orchestration layer on top of that (such as Docker Swarm, Nomad, or Kubernetes). If you want to do this I'd recommend CoreOS or NixOS as the base distro and Nomad for the orchestration layer, but I think it's overkill for the scale you've described.
For deployments I'd suggest Terraform from Hashicorp. It has plugins for all the various AWS services you'll use (as well as others) and can simplify deployment down to a terraform apply command in a Git repo with declarations of how your production app should be deployed.
If you're not tightly tied to GitHub, one thing you might consider is switching to GitLab. As far as core Git functionality goes, they're more or less identical, but GitLab has built-in CI which would be one fewer thing you need to set up & maintain.
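As a rough idea of how little setup the built-in CI needs, this sketch writes a minimal .gitlab-ci.yml that builds an image and pushes it to the project's GitLab container registry. It assumes a runner that supports Docker-in-Docker and relies on GitLab's predefined CI_REGISTRY* and CI_COMMIT_SHORT_SHA variables; depending on the runner, extra settings (for example around TLS for the dind service) may be needed.

```shell
#!/usr/bin/env bash
set -euo pipefail

OUT_DIR="${OUT_DIR:-./deploy-sketch}"
mkdir -p "$OUT_DIR"

# Quoted 'EOF' so the $CI_* variables are left for GitLab to expand.
cat > "$OUT_DIR/.gitlab-ci.yml" <<'EOF'
stages:
  - build

build-image:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  script:
    # Log in to the project's registry, then build and push the image.
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
    - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA" .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
  rules:
    # Only build images for commits on master.
    - if: '$CI_COMMIT_BRANCH == "master"'
EOF

echo "wrote $OUT_DIR/.gitlab-ci.yml"
```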
I'd also recommend reading the 12 factor app which is a set of design guidelines meant to deal exactly with this sort of "dev, test and prod should be exactly identical, except for where they need to be different" problem.
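The core idea from those guidelines that applies here is keeping config in the environment. A small sketch, assuming a hypothetical DATABASE_URL variable: local development falls back to a Dockerized MySQL on localhost, while production supplies the managed database's URL from outside the code.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical local default pointing at a Dockerized MySQL on localhost:3306.
# In production, DATABASE_URL would be set by the service definition or CI secrets.
DEFAULT_DEV_URL="mysql://app:devpassword@127.0.0.1:3306/app"
DATABASE_URL="${DATABASE_URL:-$DEFAULT_DEV_URL}"
export DATABASE_URL

# Pull the host out of the URL, just to show which database we'd talk to.
db_host() {
    local rest="${1#*@}"   # drop "scheme://user:pass@"
    rest="${rest%%/*}"     # drop "/dbname"
    echo "${rest%%:*}"     # drop ":port"
}

echo "database host: $(db_host "$DATABASE_URL")"
```

The application code stays identical in both environments; only the value of the variable differs.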
There are quite a few options for managing docker containers. As always, there are a bunch of tradeoffs to be considered.
If you only have a single container running a webapp, then starting out similar to your current approach might be a good way of learning the tool. I.e., develop locally running the code in a container, push to GitHub, pull down changes in your prod env and build the docker image there, then run it.
If you have multiple containers that need to run, e.g. a webapp, a Redis cache, and an nginx container, then use docker compose to ensure everything gets started properly and with the right config. In this scenario all the containers would still be running on the same server. You could keep the same deployment setup as before in this case, e.g. building images on the prod server. In addition, it would be fairly straightforward to have a base docker compose YAML file and then two more with the config for prod and for dev (e.g. one where the webapp is pointed at the managed prod db and one where it talks to a local db container).
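The base-plus-override layout could look something like the following sketch. The service names, image name, and DATABASE_URL variable are hypothetical; the mechanism it leans on is Compose merging multiple files passed with -f. The script writes the three files into a scratch directory and prints the commands that would combine them.

```shell
#!/usr/bin/env bash
set -euo pipefail

OUT_DIR="${OUT_DIR:-./deploy-sketch}"
mkdir -p "$OUT_DIR"

# Shared definition of the web app (hypothetical image name).
cat > "$OUT_DIR/docker-compose.yml" <<'EOF'
services:
  web:
    image: registry.example.com/myapp:latest
    ports:
      - "127.0.0.1:8080:80"
EOF

# Dev: add a local MySQL container and point the app at it.
cat > "$OUT_DIR/docker-compose.dev.yml" <<'EOF'
services:
  web:
    environment:
      DATABASE_URL: mysql://app:devpassword@db:3306/app
    depends_on:
      - db
  db:
    image: mysql:8.0
    environment:
      MYSQL_ROOT_PASSWORD: devrootpassword
      MYSQL_DATABASE: app
      MYSQL_USER: app
      MYSQL_PASSWORD: devpassword
    volumes:
      - db-data:/var/lib/mysql
volumes:
  db-data:
EOF

# Prod: no db service; the managed database URL must come from the environment.
cat > "$OUT_DIR/docker-compose.prod.yml" <<'EOF'
services:
  web:
    restart: always
    environment:
      DATABASE_URL: ${DATABASE_URL:?set DATABASE_URL to the managed database}
EOF

# Later files override or extend earlier ones.
echo "dev:  docker compose -f docker-compose.yml -f docker-compose.dev.yml up -d"
echo "prod: docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d"
```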
If you need to run multiple containers of the same image, e.g. you have some queue and a bunch of processing containers possibly running on multiple servers, then using something like kubernetes/rancher/openstack makes sense. This is where you stop treating your servers as pets (each server has a unique memorable name) and start treating them as cattle (they are identified by some number and tagged with their purpose somehow). Note that this setup has quite a bit more complexity to it, and learning to configure and maintain a kubernetes cluster could basically be a full-time job/entire career. So, that said, I'd hold off on this until it is really needed.
There are quite a few other approaches you could take, so follow the KISS rule and grow a process that works for you (and your team). The above options are basically not tied to any particular CI/CD setup. I would probably start by just having a CI environment build and run your test suite. Once that works, have it bring up a test environment to run some integration tests. Next, write some basic end-to-end tests that ensure the entire stack works for some basic workflows.
Only after that is done would I consider having fully automated deploys. You could of course have a manually triggered deployment job on your CI server to make deploys more consistent before all the test suites are in place.
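A manually triggered deploy job can be as small as the sketch below: run the tests, and only if they pass, tell the server to restart the app. The test script ./run-tests.sh, the SSH target deploy@app.example.com, and the service name myapp are all hypothetical, and the restart step assumes the server runs the container under a systemd service as suggested earlier in the thread. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical deploy target and service name.
DEPLOY_HOST="${DEPLOY_HOST:-deploy@app.example.com}"
SERVICE="${SERVICE:-myapp}"

# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# With "set -e", a failing test run stops the script before the deploy step.
run ./run-tests.sh
run ssh "$DEPLOY_HOST" "sudo systemctl restart $SERVICE"
```

Having the same script run by hand and by CI means that switching to fully automated deploys later is mostly a matter of changing what triggers it.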