Using Docker for a reproducible deployment environment! Now that I have your attention with some buzzwords, welcome to the second of three parts in my series on how I deploy this site using Deb Constrictor. The last post was about building and deploying the application code; this one is about building the Python virtual environment consistently on any machine. I’ll go into more detail on the variables and pre-build steps that accomplish this with less repeated code, and how to build and release the package with one command.
Why separate the virtualenv?
I like to fully utilise a server by running multiple projects on it, but some of these projects depend on Python 2 and some on Python 3, as well as on different versions of libraries. Using virtualenvs allows a single server to cater for this. The virtualenv is separate from the application package because the application code changes much more often than the virtualenv, so it is useful to be able to deploy the small application package quickly without waiting for the full virtualenv to be built, uploaded and installed.
Variables
To begin, let’s look at the build-config.json
file for the virtualenv project:
{
    "parent": "../parent-config.json",
    "package": "${VENV_NAME}",
    "architecture": "amd64",
    "version": "1.9",
    "description": "The virtual environment for BBIT Web Frontend",
    "deb_constrictor": {
        "environment_variables": [
            ["PYTHON_VERSION", "2.7"],
            ["VENV_NAME", "${PROJECT_NAME}-virtualenv"],
            ["VENV_DIR", "build/${VENV_NAME}"],
            ["VENV_BIN_DIR", "${VENV_DIR}/bin"]
        ],
        "commands": {
            "prebuild": ["~/deb-constrictor/build-venv.sh"]
        },
        "remove_after_build": true
    },
    "extra_control_fields": {
        "Depends": [
            "python${PYTHON_VERSION}",
            "libpython${PYTHON_VERSION}"
        ]
    },
    "directories": [
        {
            "source": "build/virtualenvs/${VENV_NAME}",
            "destination": "/var/virtualenvs/${VENV_NAME}"
        }
    ],
    "links": [
        {
            "path": "/var/virtualenvs/${VENV_NAME}/lib/python${PYTHON_VERSION}/encodings",
            "target": "/usr/lib/python${PYTHON_VERSION}/encodings"
        },
        {
            "path": "/var/virtualenvs/${VENV_NAME}/lib/python${PYTHON_VERSION}/lib-dynload",
            "target": "/usr/lib/python${PYTHON_VERSION}/lib-dynload"
        }
    ]
}
The file starts with the parent
attribute, which is the same as the application’s build-config.json
(detailed in the previous post). The same PROJECT_NAME
variable is used in all child projects (application, virtualenv and config). Since this variable is defined in the parent configuration, it is initialised before any of the child's own variables.
The order in which variables are read and applied is important, which is why variables are defined as a list of list-pairs instead of a dictionary. PROJECT_NAME
is needed to interpolate the VENV_NAME
, which is in turn needed to interpolate the VENV_DIR
, and so on. Keeping all these in a dictionary would mean the ordering would be indeterminate so this dependency would be lost.
Notice also that the variables are defined under the environment_variables
key, which means they will be exported into the environment and available to command scripts. Variables can also be set under the variables
key, if you don’t need/want them in the environment.
Using variables in this way means that essentially the same build configuration JSON file can be re-used across multiple projects, with only values such as PYTHON_VERSION
changing between Python 2 and Python 3 projects.
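For example, a Python 3 project's configuration might differ only in a couple of values (a hypothetical sketch, not one of my actual projects; the rest of the file would match the one above):

```json
{
    "parent": "../parent-config.json",
    "package": "${VENV_NAME}",
    "architecture": "amd64",
    "version": "1.0",
    "description": "The virtual environment for a hypothetical Python 3 project",
    "deb_constrictor": {
        "environment_variables": [
            ["PYTHON_VERSION", "3.6"],
            ["VENV_NAME", "${PROJECT_NAME}-virtualenv"],
            ["VENV_DIR", "build/${VENV_NAME}"],
            ["VENV_BIN_DIR", "${VENV_DIR}/bin"]
        ]
    }
}
```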
Building the Virtual Environment
My daily development machine is a MacBook Pro running macOS, but I usually deploy to Ubuntu hosts. Because many Python libraries use C (or other low-level languages) for bindings and link against specific library paths, and because the compiled binary formats differ between the platforms (ELF on Linux, Mach-O on macOS), you cannot build a virtualenv on macOS and deploy it to Linux. Enter Docker.
Previously I had started a VM running whatever target OS I was deploying to, and ran the virtualenv build script there. This meant making sure the latest source was deployed to that machine, and it was an extra manual step. Using a lightweight Docker instance lets me automate this process without much resource overhead.
Some previous clients I have worked for built the virtualenv on the server itself at deployment time, but an untested build can fail and cause long downtime. I definitely didn’t want to go this route.
The virtualenv is built by running the build-venv.sh
script defined in the prebuild
attribute of the build configuration. Its contents are:
#!/bin/bash
set -e

DOCKER_IMAGE_NAME=`echo ${VENV_NAME}-builder | cut -c1-128`

cd ${DEB_CONSTRICTOR_WORKING_DIR}

rm -r build/virtualenvs/${VENV_NAME} || true
mkdir -p build/virtualenvs

docker build -t ${DOCKER_IMAGE_NAME} .

docker run -v `pwd`/build/virtualenvs:/virtualenvs ${DOCKER_IMAGE_NAME} virtualenv \
    -p python${PYTHON_VERSION} --no-site-packages /virtualenvs/${VENV_NAME}

docker run -v `pwd`/build/virtualenvs:/virtualenvs ${DOCKER_IMAGE_NAME} \
    /virtualenvs/${VENV_NAME}/bin/pip install -U pip

docker run -v `pwd`/build/virtualenvs:/virtualenvs -v `pwd`/src:/src -t \
    ${DOCKER_IMAGE_NAME} /virtualenvs/${VENV_NAME}/bin/pip install -r /src/requirements.txt

docker run -v `pwd`/build/virtualenvs:/virtualenvs ${DOCKER_IMAGE_NAME} virtualenv \
    --relocatable /virtualenvs/${VENV_NAME}
The environment variables that were defined in the build-config.json
file are available in this script so it can be run in the prebuild step by multiple projects without modification. To quickly explain the script, it runs docker build
in the current directory (the directory containing build-config.json
and Dockerfile
must be the cwd). The virtualenv is then built in a Docker volume which is accessible from the host machine. The requirements.txt
file in the code repository is read from the mounted src
directory. These requirements are installed, and then the virtualenv is made relocatable (this step fixes up paths that would otherwise differ between the build machine and the server).
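To see the sort of path problem the relocatable step addresses, here is a contrived sketch (run outside Docker, with hypothetical paths) of a console script as pip would generate it inside the container:

```shell
# Simulate a console script as pip might create it inside the container.
mkdir -p /tmp/demo-venv/bin
cat > /tmp/demo-venv/bin/example <<'EOF'
#!/virtualenvs/demo-venv/bin/python
print("hello")
EOF

# The shebang hard-codes /virtualenvs/... (the path inside the Docker
# container), but the package installs the virtualenv under
# /var/virtualenvs/... on the server. `virtualenv --relocatable`
# rewrites such hard-coded paths.
head -1 /tmp/demo-venv/bin/example   # #!/virtualenvs/demo-venv/bin/python
```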
Dockerfile
The Dockerfile is fairly simple: it uses ubuntu:18.04 as a base, since that’s what I deploy to, and makes sure any required packages are installed (typically these will also need to be installed on the server when deployed). Here it is:
FROM ubuntu:18.04

VOLUME ["/virtualenvs", "/src"]

RUN apt-get update && \
    apt-get dist-upgrade -y && \
    apt-get install -y python2.7 python-pip postgresql-client && \
    pip install virtualenv
Since the packages that each project needs might differ (e.g. python3.6 instead of python2.7), this Dockerfile is part of the project and stored in the project directory. Because of the way Docker images are built, identical build steps are served from Docker's layer cache, so new intermediate layers are not necessarily created if the steps are the same as another project's.
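As an illustration, a hypothetical Python 3 project's Dockerfile might look like this; the FROM and VOLUME steps are identical to the one above, so Docker's layer cache can reuse those layers across projects:

```dockerfile
FROM ubuntu:18.04

VOLUME ["/virtualenvs", "/src"]

# Only this step differs from the Python 2 Dockerfile above.
RUN apt-get update && \
    apt-get dist-upgrade -y && \
    apt-get install -y python3.6 python3-pip && \
    pip3 install virtualenv
```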
Putting It All Together
On executing constrictor-build
, the virtualenv is first constructed with Docker; as mentioned, it is readable by the host system because it is created in a volume passed through from the host. Continuing through the build configuration file, the package is made to depend on the python
/libpython
packages of the specified version. The host then continues with the normal build process, packaging the virtualenv and installing it into /var/virtualenvs/${VENV_NAME}
(again interpolated from a variable). Two symlinks (to the system Python's encodings and lib-dynload directories) are also created, since they don't exist once the virtualenv moves to another system. As with the previous post (regarding packaging the application code), a post-build step to upload the built DPKG to the apt repository is included in the base constrictor-build-config.json
in my home directory.
The variable and pre-build features added in version 0.3 of Deb Constrictor mean less boilerplate and more automation, allowing me to build and release the virtualenv package with just one command.
In the third and final post in this series, I will document how to deploy a configuration package, using configuration-specific features new in Deb Constrictor 0.4.