It has recently come to my attention that many people don’t use virtual machines for development, instead polluting their system with various dependencies and making life harder for themselves. Unfortunately, even the people who do use VMs for development often perform provisioning and updates by hand, thus ending up with slightly different configurations for development, testing and production, which often leads to uncaught bugs on production.

In this post, I will not only attempt to detail some best practices I’ve learned, but I will also list provisioning and deployment configurations that will make this a one-command process.

The easiest way to do repeatable deployments is to create scripts which will handle everything for you. By the end of this post, you will be able to get from a new installation to a complete Django/postgres/gunicorn/redis stack running (and accepting users) with one command.

Starting off, the most important thing to remember is that you should never make any changes to any of the machines by hand. Any change you perform must be automatic and repeatable. If you make a change on the development VM, you’d better be damn sure it’s documented somewhere and will run on every other environment, including other people’s VMs, staging, and production.

The way to do this is with a deployment framework and scripts. You can use Puppet/Chef/SaltStack/CFEngine/whatever you like (but don’t use Fabric! Fabric is great for some cases, but it’s not for deployment). My tool of choice is Ansible; it’s simple to learn and extend, and it does the job quickly and without hassle.

Step 0: Directory structure.

Here’s the basic directory/file structure I use for my projects:

.
├── deployment/
│   ├── ansible
│   ├── deploy.yml
│   ├── files/
│   │   ├── conf/
│   │   │   └── nginx.conf
│   │   ├── init/
│   │   │   └── gunicorn.conf
│   │   └── ssl/
│   │       ├── myproject.csr
│   │       ├── myproject.key.encrypted
│   │       └── myproject.pem
│   ├── handlers.yml
│   ├── hosts
│   ├── key
│   ├── known_hosts
│   ├── provision.yml
│   ├── vars.yml
│   └── webapp_settings/
│       ├── local_settings.local.py
│       ├── local_settings.production.py
│       └── local_settings.staging.py
├── djangoproj/
└── requirements.txt

You will notice the deployment directory, containing various Ansible scripts, configuration files, SSL certificates, init scripts, etc. These will all be explained later.

Step 1: The hosts.

Before doing anything else, we need to define the hosts file, which specifies what will run where. The hosts file in the deployment directory has the following contents:

[remote:children]
production
staging

[servers:children]
production
staging
local

[production]
www.myproject.com nickname=production vm=0 branch=master

[staging]
staging.myproject.com nickname=staging vm=0 branch=develop

[local]
local.myproject.com nickname=local vm=1 branch=develop

As you can see, I define three hosts: local, staging and production. local is the VM I use for development (that’s what the vm=1 means), and it tracks the develop git branch. staging is the staging server, it’s not a VM (vm=0) and it also tracks develop, and production tracks master.

The various sections are just so I can refer to the various machines more easily. I can deploy everything to remote, which includes production and staging, or I can deploy to local, which is just the local VM, or I can deploy to servers, which is everything.

Step 2: Setting up a VM.

For my VMs, I use VirtualBox. It’s free and it’s great; if you aren’t using it already, what are you waiting for? Go get it and start setting up VMs.

To develop, we will need to set up our VM to share a code directory with the host computer, so we can easily see the changes we make without needing to commit or do anything else. I add a shared directory (read/write, because that’s sometimes necessary for migrations) to the VM, and I give it the project’s name, as we will need it later on in the scripts.
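For example, with the VM powered off, the share can be added from the host with VBoxManage. This is a minimal sketch: the VM name and host path are placeholders, but the share name must match the project name, since the mount task in provision.yml looks the share up by that name.

# Add a read/write shared folder called "myproject" to the VM.
# The Guest Additions must be installed inside the VM for vboxsf mounts to work.
VBoxManage sharedfolder add "myproject-dev" --name myproject --hostpath ~/code/myproject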

Step 3: The variables.

This is the vars.yml file. It’s pretty straightforward: it includes some names, along with the system packages, Python packages and init files you would like to install.

---
project_name: myproject
project_root: /var/projects/myproject
project_repo: git@bitbucket.org:myuser/myproject.git
system_packages:
  - build-essential
  - git
  - libevent-dev
  - nginx
  - postgresql
  - postgresql-server-dev-all
  - python-dev
  - python-setuptools
  - redis-server
  - postfix
python_packages:
  - pip
  - virtualenv
initfiles:
  - gunicorn

Step 4: Set everything up.

This is where the provisioning script comes in. Using my script will force you to use my directory/env structure, which might not be what you’d prefer, but it’s easy enough to change. These scripts create a user called {{ project_name }} and a directory called /var/projects/{{ project_name }}, and install everything in that directory as that user.

The provisioning script, provision.yml, contains:

---
- hosts: servers
  vars_files:
    - vars.yml
  gather_facts: false
  sudo: true

  tasks:
  - name: Create the project directory.
    file: state=directory path={{ project_root }}

  - name: Create user.
    user: home={{ project_root }}/home/ name={{ project_name }} state=present

  - name: Update the project directory.
    file: group={{ project_name }} owner={{ project_name }} mode=755 state=directory path={{ project_root }}

  - name: Create the code directory.
    file: group={{ project_name }} owner={{ project_name }} mode=755 state=directory path={{ project_root }}/code/

  - name: Install required system packages.
    apt: pkg={{ item }} state=installed update-cache=yes
    with_items: "{{ system_packages }}"

  - name: Install required Python packages.
    easy_install: name={{ item }}
    with_items: "{{ python_packages }}"

  - name: Mount code folder.
    mount: fstype=vboxsf opts=uid={{ project_name }},gid={{ project_name }} name={{ project_root }}/code/ src={{ project_name }} state=mounted
    only_if: "$vm == 1"

  - name: Create the SSH directory.
    file: state=directory path={{ project_root }}/home/.ssh/
    only_if: "$vm == 0"

  - name: Upload SSH known hosts.
    copy: src=known_hosts dest={{ project_root }}/home/.ssh/known_hosts mode=0600
    only_if: "$vm == 0"

  - name: Upload SSH key.
    copy: src=key dest={{ project_root }}/home/.ssh/id_rsa mode=0600
    only_if: "$vm == 0"

  - name: Create the SSL directory.
    file: state=directory path={{ project_root }}/home/ssl/

  - name: Upload SSL certificate.
    copy: src=files/ssl/{{ project_name }}.pem dest={{ project_root }}/home/ssl/{{ project_name }}.pem

  - name: Upload SSL private key.
    copy: src=files/ssl/{{ project_name }}.key.encrypted dest={{ project_root }}/home/ssl/{{ project_name }}.key

  - name: Change permissions.
    shell: chown -R {{ project_name }}:{{ project_name }} {{ project_root }}

  - name: Install nginx configuration file.
    copy: src=files/conf/nginx.conf dest=/etc/nginx/sites-enabled/{{ project_name }}
    notify: restart nginx

  - name: Install init scripts.
    copy: src=files/init/{{ item }}.conf dest=/etc/init/{{ project_name }}_{{ item }}.conf
    with_items: "{{ initfiles }}"

  - name: Create database.
    shell: "{{ project_root }}/env/bin/python {{ project_root }}/code/webapp/manage.py sqlcreate --router=default | sudo -u postgres psql"

  handlers:
    - include: handlers.yml

- include: deploy.yml

- hosts: servers
  vars_files:
    - vars.yml
  gather_facts: false
  sudo: true

  tasks:
  - name: Restart services.
    service: name={{ project_name }}_{{ item }} state=restarted
    with_items: "{{ initfiles }}"

This should be pretty easy to follow. What it does is:

  • Sets up the directory and user for the code.
  • Installs system packages using apt (I use Ubuntu) and bootstraps pip and virtualenv with easy_install (the packages in requirements.txt are installed later, during deployment).
  • Adds an fstab entry to mount the code folder from the host on boot, if the server is a local VM.
  • Creates the .ssh directory, uploads keys/etc so we can pull from BitBucket (or GitHub), but only if this server isn’t a local VM.
  • Uploads the SSL certificates. These live in the repo, but the sensitive ones are stored encrypted with git-crypt; that’s why some files have the .encrypted suffix: I don’t want everyone with read access to the repo seeing what they contain.
  • Installs the nginx config file and the init scripts (both are sketched right after this list).
  • Creates the database.
  • Performs all the actions in deploy.yml, which we will see next, to perform the first deployment.
  • After everything is done, it restarts all services.
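I haven’t shown files/conf/nginx.conf and files/init/gunicorn.conf, since they vary per project, but a minimal pair might look something like this. Treat it as a sketch only: the port, server name and WSGI module are assumptions you’ll need to adjust, and the hardcoded paths must match what the playbooks set up.

# files/conf/nginx.conf -- installed as /etc/nginx/sites-enabled/myproject
server {
    listen 443 ssl;
    server_name www.myproject.com;

    ssl_certificate /var/projects/myproject/home/ssl/myproject.pem;
    ssl_certificate_key /var/projects/myproject/home/ssl/myproject.key;

    location / {
        # Proxy everything to gunicorn.
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

# files/init/gunicorn.conf -- installed as /etc/init/myproject_gunicorn.conf (upstart)
description "myproject gunicorn"
start on runlevel [2345]
stop on runlevel [016]
respawn
setuid myproject
chdir /var/projects/myproject/code/webapp
# Assumes gunicorn is listed in requirements.txt, so it lives in the virtualenv.
exec /var/projects/myproject/env/bin/gunicorn webapp.wsgi:application --bind 127.0.0.1:8000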

Step 5: Deployment.

The difference between provisioning and deployment is that provisioning is done once, while deployment is done every time you need to push something to production or staging. deploy.yml contains:

---
- hosts: servers
  vars_files:
    - vars.yml
  gather_facts: false
  sudo: true
  sudo_user: myproject

  tasks:
  - name: Pull sources from the repository.
    git: repo={{ project_repo }} dest={{ project_root }}/code/ version={{ branch }}
    only_if: "$vm == 0"
    notify:
      - restart web frontend

  - name: Upload configuration.
    copy: src=webapp_settings/local_settings.{{ nickname }}.py dest={{ project_root }}/code/webapp/local_settings.py
    only_if: "$vm == 0"

  - name: Upgrade the virtualenv.
    pip: requirements={{ project_root }}/code/requirements.txt virtualenv={{ project_root }}/env/

  - name: Sync Django database.
    shell: "{{ project_root }}/env/bin/python {{ project_root }}/code/webapp/manage.py syncdb --migrate --noinput"

  - name: Generate Django media.
    shell: "{{ project_root }}/env/bin/python {{ project_root }}/code/webapp/manage.py generatemedia"

  handlers:
    - include: handlers.yml

What the deployment script does is simple: it pulls the latest sources, uploads the machine-specific local_settings.py file, installs the currently specified requirements into the virtualenv, syncs/migrates the database and generates the static media. If the sources haven’t changed, Ansible will make sure not to restart gunicorn.
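The machine-specific settings files themselves are just plain Python. As an illustration (every value below is a placeholder), local_settings.production.py could look like:

# local_settings.production.py -- uploaded as webapp/local_settings.py on production
DEBUG = False
ALLOWED_HOSTS = ["www.myproject.com"]

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "myproject",
        "USER": "myproject",
        "PASSWORD": "change-me",
    }
}

SOME_SERVICE_API_KEY = "live-key-goes-here"  # the kind of secret git-crypt protects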

Step 6: Handlers.

Handlers are how you perform various post-deployment tasks, such as restarting services. A handler only gets triggered if the task that notifies it actually changed something, so if, for example, you update the nginx config file and want to restart it, Ansible will ensure that nginx gets restarted only if the configuration actually changed.

Here is the handlers.yml file:

---
- name: restart nginx
  service: name=nginx state=restarted
  sudo_user: root

- name: reload nginx
  service: name=nginx state=reloaded
  sudo_user: root

- name: restart web frontend
  service: name={{ project_name }}_gunicorn state=restarted
  sudo_user: root

This is pretty straightforward, and you can add whatever command you like there.

Running the thing.

We’ve finally reached the most important part: actually running the playbooks. The ansible wrapper script contains:

#!/bin/bash
# Run with:
# ./ansible -K --limit production provision.yml
/usr/bin/env ansible-playbook -i hosts "$@"

As the file says, you run it with ./ansible -K --limit production provision.yml, and this prompts you for the sudo password of the user you’re connecting as (a superuser) and provisions the entire production server. This will go from a server with a bare Ubuntu server installation to having your app up and running, completely ready for production use.

Similarly, you can provision the other machines, with ./ansible -K --limit staging provision.yml, ./ansible -K --limit local provision.yml, etc. When you make changes and want to deploy, the command is similar, i.e. ./ansible -K --limit production deploy.yml. This uses the hosts file we have defined, so everything is contained in the deployment directory.

Addendum: Encrypting sensitive files.

As I mentioned above, I use git-crypt to encrypt the sensitive local_settings.py files, because they usually contain API keys, passwords and other things that I don’t want anyone with read access to the repository to see. git-crypt does a great job of protecting these: the files are stored encrypted in the repository, but stay decrypted in my working tree.
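With a current git-crypt, this mostly amounts to listing the sensitive paths in .gitattributes and setting up a key. Roughly (a sketch; the exact commands depend on your git-crypt version, and the patterns are just the ones from the tree above):

# .gitattributes -- files matching these patterns are transparently encrypted
deployment/files/ssl/myproject.key.encrypted filter=git-crypt diff=git-crypt
deployment/key filter=git-crypt diff=git-crypt
deployment/webapp_settings/local_settings.*.py filter=git-crypt diff=git-crypt

# One-time setup, then share the exported key out of band:
git-crypt init
git-crypt export-key /safe/place/myproject.key
# Collaborators run: git-crypt unlock /safe/place/myproject.key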

This means that reviewers and unprivileged users can’t see what the files contain, while I can still sync them across computers, or with collaborators who need the latest testing API keys on their local machines (or staging) for development.

Epilogue.

This is roughly what I use to develop and deploy my projects. If anything is unclear or if you’d like more information, just leave a comment below and I’ll try to clarify. It might be a bit much to digest in one sitting, but hopefully the writeup of the whole process is complete and reasonably understandable.