Channel: PenguinDreams

Bee2: Creating a Small Infrastructure for Docker Apps

Bees in a Beehive

In a previous post, I showed how I wrote a provisioning system for servers on Vultr. In this post, I’m going to expand upon that framework, adding support for Firewalls, Docker, a VPN system and everything needed to create a small and secure infrastructure for personal projects. Two servers will be provisioned, one as a web server running a docker daemon with only ports 80 and 443 exposed, and a second that establishes a VPN to connect securely to the docker daemon on the web server.

Vultr API Recap

In my first iteration of Bee2, I was using the vultr.rb ruby gem for API calls. I discovered and fixed a small bug, but then ran into issues with Vultr’s API rate limitations. Instead of limiting requests per minute, Vultr apparently has a rate limit of two requests per second1. This normally isn’t a problem, except when creating DNS records on a low-latency Internet connection. I couldn’t really recover from rate limits with the current set of v() and vv() vultr.rb function wrappers I used previously, so I decided to remove the vultr.rb dependency and implement the actual API requests myself using the following.

  # Requires net/http, uri, cgi and json from the standard library.
  def request(method, path, args = {}, ok_lambda = nil, error_code = nil, err_lambda = nil)
    uri = URI.parse("https://api.vultr.com/v1/#{path}")
    https = Net::HTTP.new(uri.host, uri.port)
    https.use_ssl = true

    req = case method
    when 'POST'
      r = Net::HTTP::Post.new(uri.path, {'API-Key' => @api_key })
      r.set_form_data(args)
      r
    when 'GET'
      # build the query string in a local variable so the path argument
      # stays untouched for the rate-limit retry below
      query = args.collect { |k,v| "#{k}=#{CGI::escape(v.to_s)}" }.join('&')
      Net::HTTP::Get.new("#{uri.path}?#{query}", {'API-Key' => @api_key })
    end

    res = https.request(req)

    case res.code.to_i
      when 503
        @log.warn('Rate Limit Reached. Waiting...')
        sleep(2)
        request(method, path, args, ok_lambda, error_code, err_lambda)
      when 200
        if not ok_lambda.nil?
          ok_lambda.()
        elsif res.body == ''
          ''
        else
          JSON.parse(res.body)
        end
      else
        if not error_code.nil? and res.code.to_i == error_code
          err_lambda.()
        else
          @log.fatal('Error Executing Vultr Command. Aborting...')
          @log.fatal("#{res.code} :: #{res.body}")
          exit(2)
        end
    end
  end

The new request() takes an HTTP method, a path, request arguments, and optional lambdas to call on success or on failure with a specific error code. Within the function, it checks for 503 errors (rate limits) and pauses for two seconds before retrying. The previous calls to vultr.rb looked like the following, with the vv() and v() wrappers:

vv(Vultr::ReservedIP.attach({'ip_address' => ip, 'attach_SUBID' => subid}), 412, -> {
  @log.info('IP Attached')
}, -> {
  @log.warn('Unable to attach IP. Rebooting VM')
  v(Vultr::Server.reboot({'SUBID' => subid}))
})

They can now be replaced with request() like so:

request('POST', 'reservedip/attach', {'ip_address' => ip, 'attach_SUBID' => subid}, -> {
  @log.info('IP Attached')
}, 412, ->{
  @log.warn('Unable to attach IP. Rebooting VM')
  request('POST', 'server/reboot', {'SUBID' => subid})
})

Infrastructure Overview

Simple Personal Application Infrastructure

In my setup, I plan to have one Ubuntu web application server which only has ports 80 and 443 exposed. It will be running a Docker daemon that I can connect to securely over the private/VPN network. OpenVPN will be running on a FreeBSD node which will only have SSH and OpenVPN ports exposed. Most of this configuration is achieved using Ansible playbooks. I’ve moved the configuration file into the conf directory, which is included in the .gitignore file. This directory will also contain both the OpenVPN keys/certs and Docker keys/certs generated via the Ansible playbooks. An example configuration can be found in examples/settings.yml.

Provisioning, rebuilding and configuring the servers can be run using the following:


# [-r] deletes servers if they exist
./bee2 -c conf/settings.yml -p -r

# The first time DNS is configured, you may need to wait
# before running this as Ansible uses the public DNS names
./bee2 -c conf/settings.yml -a public

The first time server configuration is run using -a, it must be run against the public inventory, which uses the servers’ internet-facing IP addresses. The Ansible roles create a VPN server, but also establish all the firewalls. Therefore, subsequent Ansible provisioning requires a running OpenVPN client before running bee2 with the -a private argument.

OpenVPN

OpenVPN is established via an Ansible role that generates the server Certificate Authority (CA), the server key and cert pair, and the client cert and key. The role has only been tested on FreeBSD and can be found in ansible/roles/vpn. It can be configured in the bee2 settings file using the following:

openvpn:
    hosts:
      gateway: 192.168.150.20
    server:
      subnet: 10.10.12.0
      routes:
        - 192.168.150.0 255.255.255.0
      netmask: 255.255.255.0
      cipher: AES-256-CBC
    clients:
      laptop:
        type: host

The gateway should be the private address of the machine in the server section of the configuration file which has the freebsd-vpn.yaml playbook. Bee2 will push a route for the VPN subnet to all the other servers listed under the servers section of the configuration. Client keys will be copied locally to conf/openvpn-clients.

There is an example OpenVPN client configuration located at examples/openvpn.conf. The setup for OpenVPN can vary per Linux distribution. I recommend installing OpenVPN using your package manager (apt-get, yum, zypper, emerge, etc.). Many distributions support multiple OpenVPN clients, although you may have to create new systemd targets or symbolic links within the init system. Typically, configuration files go in /etc/openvpn along with the keys and certificates found in conf/openvpn-clients after the bee2 Ansible provisioners have run.


# Establish the OpenVPN server and Firewalls

./bee2 -c conf/settings.yml -a public

# Copy keys, certificates and config

sudo cp conf/openvpn-clients/* /etc/openvpn/
sudo cp examples/openvpn.conf /etc/openvpn/openvpn.conf

# edit configuration for your setup
$EDITOR /etc/openvpn/openvpn.conf

# Start the service (systemd)

systemctl start openvpn.service

# or, start the server on sysv/openrc

/etc/init.d/openvpn start

Once OpenVPN has started, attempt to ping the private addresses or DNS names of your services. If that fails, check the OpenVPN logs to diagnose any potential issues.

Firewall

Firewalls are one of the last pieces to be configured in the Ansible scripts. The Ubuntu firewall uses the ufw Ansible module and is fairly straightforward.


- name: Enable ufw
  ufw: state=enabled policy=allow

- name: Allow ssh internally
  ufw: rule=allow port=22 direction=in proto=tcp interface={{ private_eth }}

- name: Allow Docker internally
  ufw: rule=allow port=2376 direction=in proto=tcp interface={{ private_eth }}

- name: 80 is open
  ufw: rule=allow port=80 proto=tcp

- name: 443 is open
  ufw: rule=allow port=443 proto=tcp

- name: Disable default in
  ufw: direction=incoming policy=deny
  async: 0
  poll: 10
  ignore_errors: true

The final task disables incoming connections; it’s run in async mode and ignores errors, essentially turning it into a fire-and-forget task. Ansible performs each task in its own SSH connection, so the firewall role needs to be the very last role run, as any subsequent tasks will fail.

FreeBSD has several firewall options. For the Ansible role in Bee2, I decided to go with pf and configure the firewall via the /etc/pf.conf configuration file. Special thanks to Tom Trebick on ServerFault for debugging my firewall configuration. The following is the pf.conf.j2 Ansible template.


# {{ ansible_managed }}

block all

# allow all from host itself
pass out inet all keep state
pass out inet6 all keep state

# allow all from private
pass in quick on {{ private_eth }} inet from any to any keep state

# openvpn
pass in quick proto udp to vtnet0 port openvpn keep state
pass in quick on tun0 inet from any to any keep state

# ssh
pass in quick proto tcp to vtnet0 port ssh flags S/SA keep state

This firewall configuration allows all traffic on the private network, all OpenVPN traffic (via the tun0 adapter), all outgoing traffic over IPv4/IPv6 and incoming traffic for only SSH and OpenVPN.

I attempted to use the same Ansible options to start the firewall on FreeBSD which I had on my Ubuntu/Docker VM.

- name: Enable Firewall Service
  service: name=pf state=started enabled=yes
  # perform this without waiting for the response because PF will drop the
  # SSH connection if its service is not running
  async: 0
  poll: 10
  ignore_errors: true

Unfortunately, even with ignore_errors and async set, I’d still get hangs and timeouts on this particular task the first time it’s run. I eventually added the following to an ansible.cfg located in the base of the Bee2 project:

[ssh_connection]
ssh_args = -o ServerAliveInterval=10

The first time the Firewall tasks are run, Ansible will still report an error, but the timeout occurs within a reasonable amount of time and the rest of the Ansible tasks continue. Subsequent runs will run fine without any errors, but will require running against the private Ansible inventory with an OpenVPN connection established.

TASK [firewall : Enable Firewall Service] *********************************************************************
fatal: [bastion.sumdami.net]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: Timeout, server bastion.sumdami.net not responding.\r\n", "unreachable": true}

SSH Keys

When using -r to rebuild a machine, new VMs will have different SSH host keys. To avoid warning messages on rebuilds, delete_server() calls the following function to remove both the IP and hostname keys from the ~/.ssh/known_hosts file:

private def remove_ssh_key(host_or_ip)
  @log.info("Removing SSH Key for #{host_or_ip}")
  Process.fork do
    exec('ssh-keygen', '-R', host_or_ip)
  end
  Process.wait
end

When configuring a server for the first time, Bee2 runs the ssh-hostkey-check.yml Ansible playbook. Based on a post by mazac on ServerFault, it automatically adds new SSH keys without prompting, but will return an error if a key exists and is incorrect.


---
- name: accept ssh fingerprint automatically for the first time
  hosts: all
  connection: local
  gather_facts: False

  tasks:
    - name: Check if known_hosts contains server's fingerprint
      command: ssh-keygen -F {{ inventory_hostname }}
      register: keygen
      failed_when: keygen.stderr != ''
      changed_when: False

    - name: Fetch remote SSH key
      command: ssh-keyscan -T5 {{ inventory_hostname }}
      register: keyscan
      failed_when: keyscan.rc != 0 or keyscan.stdout == ''
      changed_when: False
      when: keygen.rc == 1

    - name: Add ssh-key to local known_hosts
      lineinfile:
        name: ~/.ssh/known_hosts
        create: yes
        line: "{{ item }}"
      when: keygen.rc == 1
      with_items: '{{ keyscan.stdout_lines|default([]) }}'

This implementation avoids having to disable StrictHostKeyChecking in SSH, preserving host key verification. In theory, a man in the middle attack could still occur between the provisioning and the configuration phases, although it’s unlikely. For truly paranoid or security conscious individuals, you can connect to the VMs via the Vultr HTTP console and verify the SSH host key fingerprints are correct.

Passwords

By default, Vultr auto-generates a password for each new server, which is retrievable via its API. For security purposes, we should replace those passwords with generated ones that we save and encrypt via a PGP key. If a security section with a valid PGP id is added to the configuration YAML file, the root-password Ansible role will use pwgen to create a new password, set it as the root password on the VM, and encrypt and store it in ~/.password-store/bee2 so it can be accessed using the pass command.

security:
  pgp_id: ADFGTE59
$ pass
Password Store
└── bee2
    ├── web1
    └── vpn
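What pwgen produces here can be approximated in plain Ruby. The following is only an illustrative sketch of generating a random alphanumeric password, not Bee2’s actual root-password role:

```ruby
require 'securerandom'

# Generate a random alphanumeric password, similar in spirit to `pwgen -s 24 1`.
def generate_password(length = 24)
  chars = [*'a'..'z', *'A'..'Z', *'0'..'9']
  Array.new(length) { chars[SecureRandom.random_number(chars.size)] }.join
end
```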

Remote Access to Docker

In most default installations, the Docker client talks to the Docker daemon via a local UNIX socket. Connecting to Docker remotely requires creating a local CA, keys and signing certificates, similar to configuring OpenVPN clients. The official Docker documentation has an excellent guide for protecting the Docker daemon socket2, which I converted into an Ansible role. The generated client keys and certs are placed in conf/docker.

Docker can listen to both remote and local sockets, but all the configuration must be in one place. You cannot mix command line arguments and configuration from the JSON file. The -H fd:// switch needs to be removed from the systemd target file using the following set of Ansible tasks:

- name: Disable Local Socket Access for Docker
  lineinfile:
    dest: /lib/systemd/system/docker.service
    regexp: '^ExecStart=.*\$DOCKER_OPTS'
    line: "ExecStart=/usr/bin/dockerd $DOCKER_OPTS"
  register: systemd_target_update

- name: Reload Systemd targets
  command: systemctl daemon-reload
  when: systemd_target_update.changed

Note that you cannot simply restart the Docker service after modifying the systemd target. systemctl daemon-reload must be called whenever the target files are modified or else the changes will not be picked up. This is another systemd gotcha that doesn’t appear in sane initialization systems.

The Ansible role also configures DOCKER_OPTS in /etc/default/docker to use a JSON configuration file like so:


DOCKER_OPTS="--config-file {{ docker_daemon_conf }}"

The JSON configuration file specifies the TLS keys and certificates, and allows access both locally and from the private network accessible via OpenVPN.


{
	"tls": true,
	"tlsverify": true,
	"tlscacert": "{{ docker_ca }}",
	"tlscert": "{{ server_crt }}",
	"tlskey": "{{ server_key }}",
	"hosts": ["127.0.0.1:2376", "{{ private_ip }}:2376"]
}

After the Ansible playbooks have been run, all the files for remote Docker authentication should be in the conf/docker directory. Once the OpenVPN client, from the previous section, has been configured correctly, you should be able to connect to the Docker daemon using the following command:

docker --tlsverify --tlscacert=conf/docker/ca.crt  --tlscert=conf/docker/docker-client.crt  --tlskey=conf/docker/docker-client.pem  -H=web1.example.net:2376 version

Conclusions

In this iteration of the Bee2 project, I’ve expanded the Ansible configuration to setup an OpenVPN server, establish server firewalls and configure Docker to be remotely accessible. I now have everything I need to securely run Docker applications over my VPN connection onto a publicly facing web server. The specific version of Bee2 used in this article has been tagged as pd-infra-blogpost, and the most current version of Bee2 can be found on Github. Future enhancements will include running Docker applications, as well as backing up and restoring application data.


Password Algorithms

Cyberlock

Sometime in 2008, MySpace had a data breach of nearly 260 million accounts. It exposed passwords that were weakly hashed and forced lowercase, making them relatively easy to crack1. In 2012, Yahoo Voice had a data breach of nearly half a million usernames and unencrypted passwords2. Now you may think to yourself, “I don’t care. I never use my old MySpace or Yahoo account,” but in the case of the Yahoo data breach, 59% of users also had an account compromised in the Sony breach of 2011, and were using the exact same password for both services3!

Using leaked usernames and passwords from one service to attempt to gain entry to other services is known as credential stuffing. People should use a different password for every website or service. Password reuse is one of the major ways online accounts become compromised. For the average person, using a password manager to generate unique passwords for every website and app may seem a bit cumbersome or complicated. But there is another way to have unique passwords for every website; passwords that can easily be remembered, yet are difficult to guess. The solution, often discouraged by security experts, is creating a password algorithm.

Creating a Password System

A password algorithm is simply a set of steps a person can easily run in his or her head to create a unique password for a website or mobile app. This lets people derive a password instead of having to memorize many complex passwords or use a password manager. The algorithm doesn’t have to be very complex. For example, here’s a simple one based on website domain names:

  1. Take the first three letters in a website’s domain name
  2. Move one letter backwards for the first two letters (e.g. D would become C, K would become J, A would become Z, etc.)
  3. Capitalize the 3rd letter
  4. Add the letters B1a3k (Black with two letters transposed to numbers)
  5. Add a hash (#)
  6. Count the number of letters in the domain name of the website and add it
  7. Add a period
  8. Count the number of letters in the TLD (e.g .com and .net would be 3, co.uk would be 4, .co would be 2)

With this hypothetical algorithm a Google.com login would be derived like so:

  1. Goo
  2. Fno
  3. FnO
  4. FnOB1a3k
  5. FnOB1a3k#
  6. FnOB1a3k#6
  7. FnOB1a3k#6.
  8. FnOB1a3k#6.3

Using the same algorithm, let’s create a password for Penguindreams.org:

  1. Pen
  2. Odn
  3. OdN
  4. OdNB1a3k
  5. OdNB1a3k#
  6. OdNB1a3k#13
  7. OdNB1a3k#13.
  8. OdNB1a3k#13.3
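Mechanically, the example algorithm above can be expressed as a short Ruby function. This is a sketch of the illustrative algorithm only (the shift_back and derive_password names are mine, not part of any real system):

```ruby
# Step 2: move one letter backwards, wrapping A around to Z.
def shift_back(c)
  ['a', 'A'].include?(c) ? (c.ord + 25).chr : (c.ord - 1).chr
end

def derive_password(domain)
  name, tld = domain.split('.', 2)
  first3 = name.capitalize[0, 3]                                # step 1
  stepped = shift_back(first3[0]) + shift_back(first3[1]) +     # step 2
            first3[2].upcase                                    # step 3
  "#{stepped}B1a3k##{name.length}.#{tld.delete('.').length}"    # steps 4-8
end

derive_password('google.com')         # => "FnOB1a3k#6.3"
derive_password('penguindreams.org')  # => "OdNB1a3k#13.3"
```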

Please don’t actually use this specific algorithm. It’s just an example to help you come up with your own, yet I hope you can see the usefulness of the result. With a good password algorithm, you can consistently generate long passwords, with special characters, that are unique for every website and service, and are difficult to guess yet easy to derive. Algorithms may seem overwhelming at first, but once you come up with a solid one and start to use it, passwords become easier and easier to derive or recall. For the most common websites you use, you will begin to memorize those specific passwords over time without having to derive them.

If you want to store these passwords for reference, use a password manager which fully encrypts all your passwords using a highly secure master password. If you want to use something simpler, like a spreadsheet, don’t save the actual password; instead, simply store the name of the website, app or service, along with a name for the password algorithm, which should have nothing to do with the algorithm itself. You can also have a notes field for exceptions, for example a website which doesn’t allow your special character or has other odd password requirements. Since changing password algorithms can be a difficult and intense process, choosing a good initial algorithm is important. Here are some general points for designing your password structure:

  • Create an algorithm that will always generate a long password, preferably over twelve characters in length.
  • Your algorithm should always generate complex passwords. Try to include at least one number, one capital letter and one special character.
  • Try to come up with a system that is easy for you to remember, but would require a considerable amount of time and several compromised passwords to figure out.
  • Your algorithm should make sense to you, but the resulting password should appear similar to a randomly generated one in case it’s compromised by an attacker.
  • Don’t ever explain your password algorthim to anyone, not even family members. Keep it safe and secure.
  • If you need to dilvulge a password to someone, be sure to change it afterwards. You can come up with an algorthim to rotate your passwords for systems where they expire (e.g. by adding an incrementing number in the middle).

Why Not to use Password Algorithms

A good password system allows for a different password for every service one uses, without the need to look them up in a password manager, web browser extension or mobile app. However, such systems are not commonly encouraged within the security industry because they do have several weaknesses.

  • If someone specifically targeting you gets a hold of several of your passwords, a weak algorithm could allow the attacker to reverse engineer your system and find their way into your other accounts.
  • A weak algorithm that doesn’t produce enough characters could lead to passwords that are easily cracked after data breaches.
  • Many sites have odd password requirements and may not accept long passwords or the special characters used in your algorithm, leading to many exceptions you have to note and remember.
  • Changing a password algorithm can be tedious if and when it needs to be done.

Having a spreadsheet of sites and their algorithms can help if you ever need to replace your algorithm. I personally have had three different algorithms throughout my life. I tend to only change a password to a new algorithm on sites I use commonly. If I discover an old account using an older algorithm, I’ll immediately change it and note it in my spreadsheet. A disadvantage of storing all this data in a spreadsheet is that if attackers get access to that document, they now have a list of sites where you have an account, even if they don’t know any of your passwords. A more secure alternative to a spreadsheet is a password manager, used not necessarily for the passwords themselves, but just for the name, and possible exceptions, of the algorithm used for each site.

Security Questions

Security questions are another area where one needs to be careful. Many people have seen posts on social media asking people to list the street they grew up on and their first car, making that their “stripper name”. The results may seem comical, but participants are also giving others access to potential security question answers. It’s best to use passwords for your security questions as well, either randomly generated via a password generator, or created using another algorithm.

Unfortunately, even using passwords for security question answers may not mitigate all risks. A common trick in social engineering is to call a customer service representative. When asked a security question, an attacker may simply say, “I just typed in random characters,” if they know you use passwords for security questions. Some security experts recommend prefixing answers with something similar to “Accept only this exact string. It is a password: xdfge$#0,” although some websites will not allow you to put spaces or long strings in security question answers. I remember one time, when I was at a bank teller opening an account, the attendant asked, “Your mother’s maiden name has numbers in it?” to which I replied, “You use real answers for security questions instead of passwords? Aren’t you worried about identity theft?”

There are no really good answers to security questions. You should pick a strategy, either using generated passwords from a password manager or a unique security question algorithm, to try to mitigate attacks based on security questions and password resets. However, be aware that good social engineering and talking to customer service representatives can potentially compromise accounts, even those with carefully crafted security question answers and two-factor authentication.

Conclusions

When someone uses your information to create accounts or access existing accounts based on your credentials, it’s not simply identity theft. It’s outright fraud. Strong passwords alone won’t protect you entirely, but they do go a considerable way toward mitigating the potential damage from hackers. Security works in layers, and adding two-factor authorization, keeping e-mail addresses up to date and checking e-mail for account alerts can all help add layers of protection to keep your online presence secure.

  1. Hacker Tries To Sell 427 Million Stolen MySpace Passwords For $2,800. 27 May 2016. Franceschi-Bicchierai. Motherboard. 

  2. Have I Been Pwned?. Troy Hunt. Retrieved 24 Oct 2017. 

  3. What do Sony and Yahoo! have in common? Passwords!. 12 July 2012. Troy Hunt. 

Bee2: Automating HAProxy and LetsEncrypt with Docker

Bee and Docker Logo

In a previous post, I introduced Bee2, a Ruby application designed to provision servers and setup DNS records. Later I expanded it using Ansible roles to setup OpenVPN, Docker and firewalls. In the latest iteration, I’ve added a rich Docker library designed to provision applications, run jobs and backup/restore data volumes. I’ve also included some basic Dockerfiles for setting up HAProxy with LetsEncrypt and Nginx for static content. Building this system has given me a lot more flexibility than what I would have had with something like Docker Compose. It’s not anywhere near as scalable as something like Kubernetes or DC/OS with Marathon, but it works well for my personal setup with just my static websites and personal projects.

Bee2 and Docker

In this iteration of Bee2, I’ve added sections in the configuration file for docker, applications, and jobs. The docker section contains general settings, such as which volumes to backup and what prefix to use for Bee2 managed containers. Both applications and jobs are for configuring Docker containers. Applications are containers that run continuously and jobs are tasks that are designed to be run once and exit, such as building content for a website.

Starting with the docker section, we have a prefix that will be prepended to the names of all the containers that are created (defaulting to bee2 if omitted), and a backup section listing each server, the named volumes which should be backed up, and the location to store the resulting tar files.

docker:
  prefix: bee2
  backup:
    web1:
      storage_dir: /media/backups
      volumes:
        - letsencrypt
        - logs-web

Next we have jobs and applications. Items listed in the jobs section are checked out to the local machine from a given git repository. The Dockerfile in the base of the git repository is built and run on the given machine (in this case, web1). The following example builds the static website for dyject, a Python library I wrote for dependency injection. It writes its output to a volume which is accessible by the Nginx container, as we’ll see later.

jobs:
  dyject:
    server: web1
    git: git@github.com:/sumdog/dyject_web.git
    volumes:
      - dyject-web:/dyject/build:rw

The applications section contains a list of docker applications and their configurations. Each application is given a user defined name, a build_dir (which references a directory in dockerfiles in the Bee2 source), environment variables, Docker volumes, linked containers and exposed ports.

The environment variable domains can optionally be the special keyword all, which is processed into a list of all domains being used by all applications on a given server. Because this list needs to be passed as an environment variable to Docker containers, it’s formatted as a space-separated list, with each entry being a container name, a colon, and a comma-separated list of domains belonging to that container, as shown below:

DOMAINS="bee2-app-name1:example.com,example.org bee2-app-name2:someotherdomain.com"

This highlights one of the fundamental issues with Docker, in that each container is expected to be configured using environment variables. For more complex configurations, it might make more sense to pass a JSON string as an environment variable, but that would require each container having tools within it to deserialize the passed in JSON.
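A container consuming the DOMAINS variable might parse it along these lines. This is a hypothetical sketch; the parse_domains helper is not part of Bee2:

```ruby
# Parse the space-separated DOMAINS format into a hash of
# container name => list of domains.
def parse_domains(env)
  env.split(' ').map { |entry|
    container, domains = entry.split(':', 2)
    [container, domains.split(',')]
  }.to_h
end

parse_domains('bee2-app-name1:example.com,example.org bee2-app-name2:someotherdomain.com')
# => {"bee2-app-name1"=>["example.com", "example.org"],
#     "bee2-app-name2"=>["someotherdomain.com"]}
```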

The following applications section configures an HAProxy instance with publicly exposed HTTP/HTTPs ports, a Certbot to issue LetsEncrypt certificates and an Nginx instance to serve static web content:

applications:
  certbot:
    server: web1
    build_dir: CertBot
    volumes:
      - letsencrypt:/etc/letsencrypt:rw
      - /var/run/docker.sock:/var/run/docker.sock
    env:
      email: blackhole@example.com
      test: false
      domains: all
      port: 8080
  nginx-static:
    server: web1
    build_dir: NginxStatic
    env:
      domains:
        - dyject.com
      http_port: 8080
    volumes:
      - dyject-web:/www/dyject.com:ro
      - logs-web:/var/log/nginx:rw
  haproxy:
    server: web1
    build_dir: HAProxy
    env:
      domains: all
    link:
      - nginx-static
      - certbot
    volumes:
      - letsencrypt:/etc/letsencrypt:rw
    ports:
      - 80
      - 443

Bee2 communicates with Docker over the VPN tunnel that was configured in the last tutorial. Once the servers are provisioned and configured, the Docker containers can be run using the following commands:

./bee2 -c conf/settings.yml -d web1:build
./bee2 -c conf/settings.yml -d web1:run
./bee2 -c conf/settings.yml -d web1:backup

To run or rebuild a specific container instead of every container listed in the configuration file, the container name can be appended to the end of the command.

./bee2 -c conf/settings.yml -d web1:rebuild:haproxy
./bee2 -c conf/settings.yml -d web1:run:dyject

State information is stored using the backup command. Backups are timestamped, and running restore will pull the latest backup available in the location specified by storage_dir. An entire infrastructure stack can be rebuilt from scratch while maintaining state information by running commands like the following:


# Backup existing state
./bee2 -c conf/settings.yml -d web1:backup

# Provision and rebuild the servers
./bee2 -c conf/settings.yml -p -r

# Configure servers using Ansible
./bee2 -c conf/settings.yml -a public

# Update OpenVPN with the new keys
sudo cp conf/openvpn/* /etc/openvpn

# Restart OpenVPN (varies per Linux distribution)
sudo systemctl restart openvpn.service # systemd restart
sudo /etc/init.d/openvpn restart       # sysvinit restart
sudo sv restart openvpn                # runit restart

# Docker commands to restore state and rebuild containers
./bee2 -c conf/settings.yml -d web1:restore
./bee2 -c conf/settings.yml -d web1:build
./bee2 -c conf/settings.yml -d web1:run

A full list of Docker commands can be found by running ./bee2 -c conf/settings.yml -d help.
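As noted above, restore pulls the most recent backup from storage_dir. That selection can be sketched as follows; this is a hypothetical helper, assuming backup files are named <volume>-<timestamp>.tar so that lexical order matches chronological order:

```ruby
# Find the newest backup tar for a named volume, relying on timestamped
# filenames (e.g. letsencrypt-20171015120000.tar) sorting lexically by time.
def latest_backup(storage_dir, volume)
  Dir.glob(File.join(storage_dir, "#{volume}-*.tar")).max
end
```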

Under the Hood

Certbot

I’m extending the HAProxy container maintained by Docker and the official Certbot container maintained by the EFF. I try to use official containers maintained either by Docker or the project owners whenever possible. There are many HAProxy+Certbot custom container implementations currently out there, most of which place both services within the same container and then run both of them using some type of supervisor. This is necessary since HAProxy requires a signal to indicate it should reload when the SSL/TLS certificates are updated by Certbot. This seems to go against the generally accepted best-practice of isolating Docker containers to only one process.

certbot:
  server: melissa
  build_dir: CertBot
  volumes:
    - letsencrypt:/etc/letsencrypt:rw
    - /var/run/docker.sock:/var/run/docker.sock
  env:
    email: notify@battlepenguin.com
    test: false
    domains: all
    port: 8080
    haproxy_container: $haproxy

Taking a closer look at the Certbot container configured above, I place all the certificates on a volume so that they can be shared with HAProxy and be backed up. Environment variables that start with a dollar sign ($) will be replaced with the full name of a container including the prefix and container type. In this case, $haproxy will be replaced with bee2-app-haproxy. As stated earlier, the special variable all, when used with domains will be replaced with a full list of all domains associated with a given server.
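The dollar-sign substitution described above could be implemented along these lines. This is a hypothetical sketch, assuming the default bee2 prefix and the app container type; the resolve_env name is mine:

```ruby
# Expand '$name' into the full container name, e.g. '$haproxy' becomes
# 'bee2-app-haproxy'; any other value passes through unchanged.
def resolve_env(value, prefix = 'bee2')
  value.to_s.start_with?('$') ? "#{prefix}-app-#{value[1..-1]}" : value.to_s
end
```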

In this configuration, the Docker socket is shared with the Certbot container. This is neither secure nor recommended (although it is considered an accepted answer on StackOverflow). If Certbot is ever compromised, an attacker could gain complete access to the Docker daemon and other running containers. In this configuration, it’s essential to keep the Certbot container up to date with the latest image to mitigate potential security vulnerabilities. In future releases, I hope to add a container with a proxy service, designed to perform necessary container tasks and provide another layer of isolation between the Docker socket and services which need to communicate with other containers.

The Docker socket is used for two things: checking that HAProxy is up and running before launching Certbot, and reloading HAProxy when certificates have been renewed. Checking whether a container is active is done using the check_docker Python script. Signaling HAProxy to reload is done using nc like so [1]:

echo -e "POST /containers/$HAPROXY_CONTAINER/kill?signal=HUP HTTP/1.0\r\n" | \
nc -U /var/run/docker.sock
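The same call can also be made from Python's standard library, which is handy when the signal needs to be sent from a script rather than a shell one-liner. This is an illustrative sketch rather than code from the Bee2 containers; UnixHTTPConnection and reload_haproxy are hypothetical names:

```python
import http.client
import socket

class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that speaks HTTP over a UNIX domain socket,
    as required by the Docker Engine API at /var/run/docker.sock."""
    def __init__(self, socket_path):
        super().__init__('localhost')  # host is ignored; we override connect()
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.connect(self.socket_path)
        self.sock = sock

def reload_haproxy(container, socket_path='/var/run/docker.sock'):
    """Send SIGHUP to a container -- the same request the nc one-liner makes."""
    conn = UnixHTTPConnection(socket_path)
    conn.request('POST', f'/containers/{container}/kill?signal=HUP')
    return conn.getresponse().status  # Docker returns 204 on success
```

The Docker daemon answers this endpoint with 204 No Content when the signal is delivered.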

HAProxy

The official HAProxy Docker image uses a systemd wrapper binary as its entry point in order to pass signals sent to the container on to the underlying HAProxy process. The documentation recommends copying in a custom haproxy.cfg or using one from a mounted volume. Since I’m generating my configuration dynamically when the container starts, I had trouble figuring out the best way to run my script while still being able to forward signals to HAProxy. After a few failed attempts at handling my own signals and forwarding them on, either in Bash or Python scripts, I eventually settled on a custom ENTRYPOINT script that makes an exec call to the base container’s ENTRYPOINT like so:

echo "Running HAProxy Config"
python /haproxy-config.py

echo "Running Base Wrapper Script"
exec /docker-entrypoint.sh "$@"

This allows me to run my custom haproxy-config.py script, which sets up virtual hosts, SNI and Certbot, and then exec out to docker-entrypoint.sh, replacing my process with the stock signal-forwarding binary that comes with the official Docker container. I call my startup script within the Dockerfile like so:

ENTRYPOINT ["/startup"]
CMD ["haproxy", "-f", "/usr/local/etc/haproxy/haproxy.cfg"]

Nginx

All my static content is hosted via nginx. The HAProxy script will create vhost/SNI entries for every domain in the domain map passed to it, using port 8080 on the destination container. HAProxy handles SSL redirects and offloads the SSL traffic. The following configuration, modified from a Stack Overflow answer, will automatically direct virtual host requests to the appropriate folder located at /www/<domain name> and issue 301 redirects from the www subdomain back to the root. The option port_in_redirect off needs to be used to ensure nginx doesn’t add port 8080 to folder-name redirects.

server {
  listen 8080;
  server_name ~^(www\.)(?<domain>.*)$;
  return 301 https://$domain$request_uri;
}

server {
    listen 8080 default_server;

    server_name _default_;
    root "/www/$http_host";
    port_in_redirect off;

    access_log "/var/log/nginx/$http_host.log" main;

    location ~ /\.ht {
        deny all;
    }
}

The nginx Dockerfile is fairly straightforward as well. By default, the base nginx container sends all of its access and error logs to /dev/stdout and /dev/stderr, as is considered best practice with Docker. I’ve modified nginx’s configuration to keep a separate log file for each individual host, so the logs can later be run through a parser for analytics. Since nginx worker processes run as an unprivileged user within the container, nginx doesn’t have permission to write to its own log directory by default. However, if permissions are set in the Dockerfile on the directory that the log volume will be mounted into, those permissions carry over to the mount itself, allowing nginx to store its logs and also making those logs available to other containers.

FROM nginx:1.13.6

COPY nginx.conf /etc/nginx/nginx.conf
RUN chown -R nginx:nginx /var/log/nginx

EXPOSE 8080

VOLUME ["/var/log/nginx", "/var/www"]

IPv6 Support

By default, Docker’s exposed ports do not listen on both the IPv4 and IPv6 addresses of the host. Configuring Docker to listen on IPv6 involves giving the Docker daemon a slice of the host’s subnet [2]. I accomplished this by taking the subnet provided by my hosting provider, appending a /80 prefix, and adding the result to the daemon.json configuration file using the existing Ansible role and template:


{
	"tls": true,
	"tlsverify": true,
	"tlscacert": "{{ docker_ca }}",
	"tlscert": "{{ server_crt }}",
	"tlskey": "{{ server_key }}",
	"ipv6": true,
	"fixed-cidr-v6": "{{ state.servers[ansible_hostname].ipv6.subnet }}/80",
	"hosts": ["127.0.0.1:2376", "{{ private_ip }}:2376", "fd://"]
}
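The /80 arithmetic can be sanity-checked with Python's ipaddress module. The 2001:db8:: prefix below is a documentation address standing in for the provider-assigned subnet, not my actual allocation:

```python
import ipaddress

# Carve the first /80 for Docker's fixed-cidr-v6 out of the /64
# a hosting provider typically assigns to the host.
host_net = ipaddress.ip_network('2001:db8:1000:2000::/64')
docker_net = next(host_net.subnets(new_prefix=80))
print(docker_net)  # 2001:db8:1000:2000::/80
```

A /80 leaves Docker 48 bits of container address space while keeping the rest of the /64 free for the host.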

Final Thoughts

Bee2 was never really meant to be a general-purpose provisioning system. I attempted to use existing tools such as Terraform and Docker Compose, but ran into some of their limitations. Its primary purpose was to aid in migrating my personal websites and projects from my existing hosting provider, which requires considerable manual configuration, to an automated provisioning system. Although it currently targets only one provider, defining my services in code should ease future migrations, as well as make it quick to spin up new applications to experiment with.

The initial Docker work was based in part on the framework used in my side project BigSenseTester, where I use a Ruby script to create Docker containers and run the automated integration tests for BigSense. The vSense project also had configuration management tools for HAProxy and Certbot. Although I was able to leverage some of this existing work, there was still a considerable amount I needed to add and adapt for this particular iteration of Bee2.

Working on Bee2 has given me a deep appreciation for all the intricacies involved in writing provisioning services and automating configuration management. Although I’ve gained a lot of flexibility by writing a custom application tailored to what I want in a provisioning tool, that flexibility comes at the cost of development time that could otherwise go toward the projects I hope to host with it. Hopefully the work I’ve done can help others develop their own development and operations tools, as well as speed up development of my own projects in the future.

  1. Sending signals from one docker container to another. 8 February 2015. LordElph’s Ramblings. 

  2. Help me understand IPv6 and docker. Docker Community Forums. Retrieved 14 November 2017. 
