Deb Constrictor for Virtualenv Deployment (Part 2)

Using Docker for a reproducible deployment environment! Now that I have your attention with some buzzwords, welcome to the second of three parts in my series on how I deploy this site using Deb Constrictor. The last post was about building and deploying the application code, this one is about building the Python virtual environment consistently on any machine. I’ll go into more detail on the variables and pre-build steps to accomplish this with less repeated code, and how to build and release the package with one command.

Why separate the virtualenv?

I like to fully utilise a server by running multiple projects on it, but some of these projects depend on Python 2 and some on Python 3, as well as different versions of libraries. Using virtualenvs allows the server to cater for this. The virtualenv is separate from the application package because the application code changes much more often than the virtualenv, so it is useful to be able to deploy the small application package quickly without having to wait for the full virtualenv to be built, uploaded and installed.

Variables

To begin, let’s look at the build-config.json file for the virtualenv project:

{
  "parent": "../parent-config.json",
  "package": "${VENV_NAME}",
  "architecture": "amd64",
  "version": "1.9",
  "description": "The virtual environment for BBIT Web Frontend",
  "deb_constrictor": {
      "environment_variables": [
        ["PYTHON_VERSION", "2.7"],
        ["VENV_NAME", "${PROJECT_NAME}-virtualenv"],
        ["VENV_DIR", "build/${VENV_NAME}"],
        ["VENV_BIN_DIR", "${VENV_DIR}/bin"]
      ],
      "commands": {
        "prebuild": ["~/deb-constrictor/build-venv.sh"]
      },
      "remove_after_build": true
  },
  "extra_control_fields": {
    "Depends": [
      "python${PYTHON_VERSION}",
      "libpython${PYTHON_VERSION}"
    ]
  },
  "directories": [
    {
      "source": "build/virtualenvs/${VENV_NAME}",
      "destination": "/var/virtualenvs/${VENV_NAME}"
    }
  ],
  "links": [
    {
      "path": "/var/virtualenvs/${VENV_NAME}/lib/python${PYTHON_VERSION}/encodings",
      "target": "/usr/lib/python${PYTHON_VERSION}/encodings"
    },
    {
      "path": "/var/virtualenvs/${VENV_NAME}/lib/python${PYTHON_VERSION}/lib-dynload",
      "target": "/usr/lib/python${PYTHON_VERSION}/lib-dynload"
    }
  ]
}

The file starts with the parent attribute, which is the same as the application’s build-config.json (detailed in the previous post). The same PROJECT_NAME variable is used in all child projects (application, virtualenv and config). Since this variable is defined in the parent configuration it will be initialised and ordered before any others.

The order in which variables are read and applied is important, which is why variables are defined as a list of list-pairs instead of a dictionary. PROJECT_NAME is needed to interpolate the VENV_NAME, which is in turn needed to interpolate the VENV_DIR, and so on. Keeping all these in a dictionary would mean the ordering would be indeterminate so this dependency would be lost.
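The ordered interpolation can be sketched in plain shell. This is only an illustration of the behaviour, not Deb Constrictor's actual implementation, and the PROJECT_NAME value is a made-up example; the point is that each assignment may reference the variables defined before it, which is exactly what list ordering guarantees and a dictionary would not.

```shell
#!/bin/bash
# Illustrative only: apply each variable pair in list order, so later
# values can reference earlier ones (as deb-constrictor's config does).
PROJECT_NAME="bbit-web-frontend"          # hypothetical value from the parent config
VENV_NAME="${PROJECT_NAME}-virtualenv"    # depends on PROJECT_NAME
VENV_DIR="build/${VENV_NAME}"             # depends on VENV_NAME
VENV_BIN_DIR="${VENV_DIR}/bin"            # depends on VENV_DIR

echo "${VENV_BIN_DIR}"
```

Reversing the order of any two dependent pairs would leave a literal, uninterpolated `${...}` in the result, which is why the ordering matters.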

Notice also that the variables are defined under the environment_variables key, which means they will be exported into the environment and available to command scripts. Variables can also be set under the variables key, if you don’t need/want them in the environment.
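As a hypothetical example of splitting the two, a value that is only needed for interpolation inside the config file could live under variables, while values the prebuild script reads stay under environment_variables (the values shown here are illustrative):

```json
"deb_constrictor": {
    "variables": [
        ["PROJECT_NAME", "bbit-web-frontend"]
    ],
    "environment_variables": [
        ["PYTHON_VERSION", "2.7"]
    ]
}
```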

Using variables in this way means that essentially the same build configuration JSON file can be re-used across multiple projects, with only PYTHON_VERSION needing to change between Python 2 and Python 3 projects.

Building the Virtual Environment

My daily development machine is a MacBook Pro running macOS, but I usually deploy to Ubuntu hosts. Because many Python libraries use C (or other low-level languages) for bindings and link to specific library paths, not to mention the different output binary formats when compiled (ELF/Mach-O), it is impossible to build your virtualenv on macOS and deploy to Linux. Enter Docker.

Previously I had started a VM running whatever target OS I was deploying to, and ran the virtualenv build script there. This meant making sure the latest source was deployed to that machine, and it was an extra manual step. Using a lightweight Docker instance lets me automate this process without much resource overhead.

Some previous clients I have worked for built the virtualenv on the server itself at deployment time, but this could mean long downtime when an untested build failed. I definitely didn’t want to go this route.

The virtualenv is built by running the build-venv.sh script defined in the prebuild attribute of the build configuration. Its contents are:

#!/bin/bash
set -e

DOCKER_IMAGE_NAME=`echo ${VENV_NAME}-builder | cut -c1-128`

cd ${DEB_CONSTRICTOR_WORKING_DIR}
rm -r build/virtualenvs/${VENV_NAME} || true
mkdir -p build/virtualenvs
docker build -t ${DOCKER_IMAGE_NAME} .
docker run -v `pwd`/build/virtualenvs:/virtualenvs ${DOCKER_IMAGE_NAME} virtualenv \
    -p python${PYTHON_VERSION} --no-site-packages /virtualenvs/${VENV_NAME}

docker run -v `pwd`/build/virtualenvs:/virtualenvs ${DOCKER_IMAGE_NAME} \
    /virtualenvs/${VENV_NAME}/bin/pip install -U pip

docker run -v `pwd`/build/virtualenvs:/virtualenvs -v `pwd`/src:/src -t \
    ${DOCKER_IMAGE_NAME} /virtualenvs/${VENV_NAME}/bin/pip install -r /src/requirements.txt

docker run -v `pwd`/build/virtualenvs:/virtualenvs ${DOCKER_IMAGE_NAME} virtualenv \
    --relocatable /virtualenvs/${VENV_NAME}

The environment variables defined in the build-config.json file are available in this script, so multiple projects can run it in their prebuild step without modification. To quickly explain the script: it runs docker build in the current directory (the directory containing build-config.json and the Dockerfile must be the working directory). The virtualenv is then built in a Docker volume which is accessible from the host machine. The requirements.txt file in the code repository is read from the mounted src directory, its requirements are installed, and finally the virtualenv is made relocatable (this step fixes up paths that won’t match between the build machine and the server).

Dockerfile

The Dockerfile is fairly simple, it uses ubuntu:18.04 as a base, since that’s what I deploy to, and then makes sure any required packages are installed (typically these will also need to be installed on the server when deployed). Here it is:

FROM ubuntu:18.04
VOLUME ["/virtualenvs", "/src"]
RUN apt-get update && apt-get dist-upgrade -y && apt-get install -y python2.7 python-pip postgresql-client && pip install virtualenv

Since the packages that each project needs might differ (e.g. python2.7 instead of python3.6) this Dockerfile is part of the project and stored in the project directory. Because of the way Docker images are built, new intermediate layers are not necessarily created if the build steps are the same as another project’s.
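For example, a hypothetical Python 3 variant of the project's Dockerfile would only swap the Python package names; everything else stays identical, so the shared layers can be reused:

```dockerfile
FROM ubuntu:18.04
VOLUME ["/virtualenvs", "/src"]
RUN apt-get update && apt-get dist-upgrade -y && apt-get install -y python3.6 python3-pip postgresql-client && pip3 install virtualenv
```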

Putting It All Together

On executing constrictor-build, the virtualenv is first constructed with Docker; as mentioned, it is readable by the host system because it is created in a volume passed through from the host. Continuing through the build configuration file, the package is made to depend on the python/libpython packages of the specified version. The build then continues as normal, packaging the virtualenv and installing it into /var/virtualenvs/${VENV_NAME} (again interpolated from a variable). Two links are also created, since they don’t exist once the virtualenv moves between systems. As in the previous post (on packaging the application code), a post-build step that uploads the built DPKG to the apt repository is included in the base constrictor-build-config.json in my home directory.

The variable and pre-build features added in version 0.3 of Deb Constrictor mean less boilerplate and more automation, allowing me to build and release the virtualenv package with just one command.

In the third and final part of this series I will document how to deploy a package for configuration, using configuration-specific features that are new in Deb Constrictor 0.4.
