Setting up an Apache Storm Cluster with Vagrant

Posted by Kaya Kupferschmidt • Wednesday, February 4. 2015 • Category: Java

Vagrant makes the perfect companion for developers who need to simulate complex cluster setups on a single machine. This is especially true when using vagrant-lxc as the container provider, which uses Linux containers instead of full virtualisation.

Directory Structure

With the following ingredients you can set up a whole Apache Storm cluster. You can download the whole package on GitHub, but let us look at the details. You will need the following directory structure:

+ Vagrantfile
+----- provision
         +------ data
         |        + hosts
         +------ puppet
         |        |
         |        +------ manifests
         |        |        + site.pp
         |        |
         |        +------ modules
         |        + Puppetfile
         +------ scripts


As usual, the Vagrantfile contains all the important settings for Vagrant to create the containers. Note that in this example we create multiple virtual machines within a single Vagrantfile. Also note that although the IP addresses of the hosts are contained in the Vagrantfile, you still need to register them with the DHCP server belonging to the `lxcbr0` Ethernet bridge and manually assign the IP addresses there.

# -*- mode: ruby -*-
# vi: set ft=ruby :

boxes = [
  { :name => :nimbus, :ip => '', :memory => 512, :mac => '00:16:3e:33:44:40' },
  { :name => :supervisor1, :ip => '', :memory => 4096, :mac => '00:16:3e:33:44:41' },
  { :name => :supervisor2, :ip => '', :memory => 4096, :mac => '00:16:3e:33:44:42' },
  { :name => :zookeeper1, :ip => '', :memory => 1024, :mac => '00:16:3e:33:44:46' },
]

LXC_BRIDGE = 'lxcbr0'

Vagrant.configure("2") do |config|
  boxes.each do |opts|
    config.vm.define opts[:name] do |node|

      node.vm.hostname = opts[:name].to_s
      node.vm.box = "trusty64"
      node.vm.box_url = ""

      node.vm.provider :lxc do |lxc, override|
        override.vm.box = "fgrehm/trusty64-lxc"
        override.vm.box_url = ""
        lxc.container_name = "storm.%s" % opts[:name].to_s
        lxc.customize 'cgroup.memory.limit_in_bytes', opts[:memory].to_s + "M"
        lxc.customize 'network.type', 'veth'
        lxc.customize '', LXC_BRIDGE
        lxc.customize 'network.hwaddr', opts[:mac].to_s
      end

      # Install librarian-puppet and run it to install the common Puppet modules.
      # This has to be done before Puppet provisioning so that the modules are
      # available when Puppet parses its manifests.
      config.vm.provision :shell, :path => "provision/scripts/"

      node.vm.provision :puppet do |puppet|
        puppet.manifests_path = "provision/puppet/manifests"
        puppet.manifest_file = 'site.pp'
        puppet.module_path = [ 'provision/puppet/modules-contrib', 'provision/puppet/modules' ]
        puppet.options = "--verbose --debug"
      end
    end
  end
end
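
The DHCP registration mentioned before the Vagrantfile can be derived from the same kind of box list. The following is only a minimal sketch, assuming the `lxcbr0` bridge is served by dnsmasq and that static leases are written as `dhcp-host=MAC,IP` lines. The IP addresses are hypothetical placeholders, and the helper name `dhcp_host_lines` is made up for illustration.

```ruby
# Sketch: render dnsmasq static-lease lines from a Vagrant-style box list.
# ASSUMPTION: the IP addresses are hypothetical placeholders, not the
# values used by the original setup.
BOXES = [
  { :name => :nimbus,      :ip => '10.0.3.10', :mac => '00:16:3e:33:44:40' },
  { :name => :supervisor1, :ip => '10.0.3.11', :mac => '00:16:3e:33:44:41' },
]

# Returns one "dhcp-host=MAC,IP" line per box (dnsmasq syntax).
def dhcp_host_lines(boxes)
  boxes.map { |box| "dhcp-host=#{box[:mac]},#{box[:ip]}" }
end

puts dhcp_host_lines(BOXES)
```

On Ubuntu the resulting lines would typically go into the dnsmasq configuration file referenced by `LXC_DHCP_CONFILE` in `/etc/default/lxc-net`, but check how your distribution configures the bridge.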


The hosts file will be copied into the containers, so that each virtual host knows the IP addresses of all other hosts in the Apache Storm cluster. It contains one entry for each of localhost, nimbus, supervisor1, supervisor2 and zookeeper1.
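
To illustrate the shape of such a hosts file, here is a minimal sketch that renders one from a name-to-address mapping. All addresses except 127.0.0.1 are hypothetical placeholders, and the helper name `hosts_file` is made up for illustration; use the addresses you assigned on the `lxcbr0` bridge.

```ruby
# Sketch: render /etc/hosts-style content from a name => IP mapping.
# ASSUMPTION: every address except 127.0.0.1 is a hypothetical placeholder.
HOSTS = {
  'localhost'   => '127.0.0.1',
  'nimbus'      => '10.0.3.10',
  'supervisor1' => '10.0.3.11',
  'supervisor2' => '10.0.3.12',
  'zookeeper1'  => '10.0.3.16',
}

# Returns the file content: one "IP name" line per host.
def hosts_file(hosts)
  hosts.map { |name, ip| format('%-15s %s', ip, name) }.join("\n") + "\n"
end

print hosts_file(HOSTS)
```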


The Puppetfile lists the Puppet modules required for provisioning. We will use librarian-puppet to download and install them automatically. Unfortunately, Puppet itself does not include such functionality in a sane way.

# Puppetfile
# Configuration for librarian-puppet. For example:
forge ""
mod "kupferk/storm"
mod "kupferk/zookeeper"


The file site.pp contains the primary Puppet manifest used for provisioning the required software. As you can see, different hosts have different roles (depending on their hostname for the sake of simplicity).

package {puppet:ensure=> [latest,installed]}
package {ruby:ensure=> [latest,installed]}

# Make sure Java is installed on hosts, select specific version
class { 'java':
    distribution => 'jre'
}

# Modify global settings
class { 'storm':
    version => '0.9.3',
    zookeeper_servers => ['zookeeper1'],
    drpc_servers => ['supervisor1', 'supervisor2'],
    nimbus_host => 'nimbus',
    supervisor_workers => '4'
}

node 'nimbus' {
  class { 'storm::nimbus': }
  class { 'storm::ui': }
}

node /supervisor[1-9]/ {
  class { 'storm::supervisor': }
}

node /zookeeper[1-9]/ {
  class { 'zookeeper': hostnames => [ $::fqdn ],  realm => '' }
}
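
The hostname-based role selection can be mimicked in a few lines of Ruby. This is only an illustrative sketch, not how Puppet is implemented: the function `roles_for` is hypothetical, and it simply reuses the node name and the two regular expressions from the manifest to show which classes a given host would receive.

```ruby
# Sketch: mimic how the node definitions in site.pp pick classes by hostname.
# The function is hypothetical and not part of Puppet; the class names
# mirror those used in the manifest.
def roles_for(hostname)
  case hostname
  when 'nimbus'          then ['storm::nimbus', 'storm::ui']
  when /supervisor[1-9]/ then ['storm::supervisor']
  when /zookeeper[1-9]/  then ['zookeeper']
  else []                # no matching node definition in this sketch
  end
end

%w[nimbus supervisor1 supervisor2 zookeeper1].each do |host|
  puts "#{host}: #{roles_for(host).join(', ')}"
end
```

Because the patterns only allow the digits 1 to 9, a host named `supervisor0` would not match any definition in this sketch.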

Finally, we need the script that Vagrant executes to provision the containers. Make sure the file is executable!

#!/bin/sh

# Directory in which librarian-puppet should manage its modules directory
# (assumed here from the directory layout above; adjust if yours differs)
PUPPET_DIR=/vagrant/provision/puppet

cp -fv /vagrant/provision/data/hosts /etc/hosts
apt-get update
apt-get --yes --force-yes install puppet rubygems-integration

if [ "$(gem search -i librarian-puppet)" = "false" ]; then
  gem install librarian-puppet
  cd $PUPPET_DIR && librarian-puppet install
else
  cd $PUPPET_DIR && librarian-puppet update
fi

Start the cluster

vagrant up

Be happy.
