Create a VM Image With Apache Kafka Configured Using Vagrant and Ansible

The Aim

Longer Term

As this series progresses we are aiming to create a running “development level quality” environment with the click of a button – or less! – representing some interpretation of the fabled, and often elusive, Infrastructure As Code dreamscape.

For example we should, from a standing start, be able to create an environment consisting of:-

  • 1 VM hosting a single-node Kafka cluster (if that is not an oxymoron).
  • A 3-5 node Kubernetes cluster where the services are deployed.
  • 1 VM hosting a DB of sorts … maybe Cassandra or MongoDB … we’ll see.

Shorter Term – i.e. This Article

Okay, so let’s say we’re in the land of amorphous micro-services. That means we’re going to be using some kind of message broker, yeah … and for us this will be Apache Kafka.

Today we’re going to create a virtual machine which has Apache Kafka installed and configured, including the creation of our desired topics for inter-service communication.

Tools and Demo Environment

The required toys which you’ll need to install on your local machine are:-

  • Vagrant : local creation and testing of the image
  • VirtualBox : to allow Vagrant to build the VM locally
  • Ansible : low-level configuration of the VM, specifically configuring Kafka in this case

Note: please install the latest versions of the above so they play nicely together. For example, the Vagrant and VirtualBox packages in a Linux distro’s repository usually lag behind the latest releases. Certainly true for me; your mileage may vary.
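
A quick sanity check of what you have installed – the version numbers below are purely illustrative, anything reasonably recent should do …

$ vagrant --version
Vagrant 1.9.1
$ VBoxManage --version
5.1.14r112924
$ ansible --version
ansible 2.2.1.0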

Accompanying Code

For lazy developers, you can check out the code at the same-titled GitHub repo.

Let’s Go!

Vagrant Ascend

Working with Vagrant initially, as opposed to jumping straight to creating images tailored for your favourite cloud provider, ensures our VM works locally in the first instance. This will, or at least should, tease out most of the config style issues you’re likely to hit.

mkdir myKafkaVm; cd myKafkaVm; export BLOG_DIR=`pwd`

Then create a file called Vagrantfile with the following contents …

Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"
  config.vm.network "forwarded_port", guest: 9092, host: 9092

  config.vm.provider "virtualbox" do |vb|
    vb.memory = "2048"
  end
end

For more specific details on the Vagrantfile config choices see the documentation, but basically we’re saying:-

  1. Pull down the basic VM image “ubuntu/trusty64” from HashiCorp’s Atlas repository – assuming of course you don’t have a similarly named locally installed box, which would then take precedence.
  2. Forward traffic from your local machine to the VM on port 9092 – Kafka’s default listening port – when it is spun up. This isn’t strictly needed in this article, but it would be if you wanted local processes (on your host machine) to interact with Kafka on the usual port. We’ll verify the mapping shortly.
  3. Use VirtualBox as the provider for the VM, and give it 2GB of memory. Any less and Kafka may fail to start up.

At this stage we can build a vanilla VM box. From the directory containing the Vagrantfile, do …

vagrant up

You should see something like …

Bringing machine 'default' up with 'virtualbox' provider...
==> default: Box 'ubuntu/trusty64' could not be found. Attempting to find and install...
default: Box Provider: virtualbox
default: Box Version: >= 0
==> default: Loading metadata for box 'ubuntu/trusty64'
default: URL: https://atlas.hashicorp.com/ubuntu/trusty64
==> default: Adding box 'ubuntu/trusty64' (v20170220.0.1) for provider: virtualbox
default: Downloading: https://atlas.hashicorp.com/ubuntu/boxes/trusty64/versions/20170220.0.1/providers/virtualbox.box
==> default: Successfully added box 'ubuntu/trusty64' (v20170220.0.1) for 'virtualbox'!
:

Once that completes error free, you can ssh onto your VM …

vagrant ssh

and check it is the correct box once logged on, e.g.

$ uname -a
Linux vagrant-ubuntu-trusty-64 3.13.0-110-generic #157-Ubuntu SMP Mon Feb 20 11:54:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
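
From the host (in another terminal, or after exiting the ssh session), you can also confirm the port forward we declared in the Vagrantfile is in place. vagrant port (available in Vagrant 1.8+) lists the guest-to-host mappings, with output along these lines …

$ vagrant port
The forwarded ports for the machine are listed below.

    22 (guest) => 2222 (host)
  9092 (guest) => 9092 (host)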

Okay, so that’s great. Let’s get Kafka installed on it.
Firstly log off your VM and then destroy it.

vagrant destroy -f

I prefer to rebuild from scratch each time, rather than use vagrant provision to apply changes, as it is truer to what we will build at the Packer stage, next in the series.

Enter Ansible

This article is deliberately light on theory – if you want a solid introduction to Ansible I’d recommend Michael Heap’s book Ansible: From Beginner to Pro.

We want to install Zookeeper and Kafka onto our VM automatically when we do vagrant up, and there is a plethora of configuration-management tools to choose from; Chef, Puppet and SaltStack are a few good options. They are all fine and dandy, but in our case we will be using Ansible.

Point Vagrant to Ansible for Provisioning

Add the Ansible provisioning lines to your Vagrantfile so that it looks like …

Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"
  config.vm.network "forwarded_port", guest: 9092, host: 9092

  config.vm.provider "virtualbox" do |vb|
    vb.memory = "2048"
  end

  ## New lines below to say "Hey, let's use Ansible my wandering vagrant friend!"
  config.vm.provision "ansible" do |ansible|
    ansible.playbook = "provisioning/playbook.yml"
  end
end

Install Zookeeper On The VM via Ansible

We need to create the Ansible playbook.

cd $BLOG_DIR;  mkdir provisioning; touch provisioning/playbook.yml

And then make the playbook.yml look like …

---
- hosts: all
  become: true
  roles:
    - role: AnsibleShipyard.ansible-zookeeper

Some quick explanations.

  • “hosts: all” – just means apply the following playbook to all the hosts Ansible knows about. In our case that is just our VirtualBox VM, but you could specify hostnames, group names etc. for Ansible to apply to.
  • “become: true” – basically means run as root/sudo. For more details see the become documentation.
  • “roles” – the roles to apply to our VM. We will be covering these next.
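
As an aside, you can lint the playbook without touching the VM at all using ansible-playbook’s --syntax-check flag (the trailing comma makes “localhost,” an inline one-host inventory). Be aware it also resolves roles, so it will complain until the Zookeeper role is installed below.

ansible-playbook -i "localhost," --syntax-check provisioning/playbook.yml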

Now if you run vagrant up at this point, unless you already have the Zookeeper role installed, it should fail with a message like …

ERROR! the role 'AnsibleShipyard.ansible-zookeeper' was not found in /home/andy/blogs/myKafkaVm/provisioning/roles:/etc/ansible/roles:/home/andy/blogs/myKafkaVm/provisioning

… which obviously refers to the Zookeeper role in our playbook. In the above configuration we are saying to use a locally installed Ansible role called AnsibleShipyard.ansible-zookeeper. There isn’t one, so let’s install it.

We could of course write our own Ansible roles to do precisely what we want, but if a problem has already been solved then let us not waste the world’s time, eh!? We are going to install one from Ansible Galaxy, which is a collection of community roles, somewhat like Ruby’s gems or Python’s PyPI. Anyway, the role that we are interested in using here is ansible-zookeeper, contributed by AnsibleShipyard.

sudo ansible-galaxy install AnsibleShipyard.ansible-zookeeper

This basically stores the role under /etc/ansible/roles/ (hence the sudo).
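
You can confirm it landed with ansible-galaxy list – the version shown will vary …

$ ansible-galaxy list
- AnsibleShipyard.ansible-zookeeper, (version)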

And now vagrant up should work again.

And now we’re getting somewhere. You can ssh onto the box and check Zookeeper is installed at /opt/zookeeper-{version}, or ps -ef | grep java to see it running.
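
Zookeeper also answers the classic four-letter-word health check, so from inside the VM (assuming nc is present on the box) …

$ echo ruok | nc localhost 2181
imok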

Install Kafka using a downloaded Ansible Role

To install Kafka on the VM, we are going to use another role, but not one installed from Ansible Galaxy. In this case we are going to download it from GitHub and use it more directly.

Add the “ansible-kafka” role to the playbook.yml so that it looks like the following …

---
- hosts: all
  become: true
  roles:
    - role: AnsibleShipyard.ansible-zookeeper
    - {
        role: "ansible-kafka",
        kafka_hosts: ["0.0.0.0"],
        kafka_zookeeper_hosts: ["0.0.0.0"],
        kafka_version: 0.10.1.1,
        kafka_scala_version: 2.11
    }

Now this will fail for the same reason as before if we do vagrant up, so let’s give ansible-galaxy a requirements file pointing at the role’s GitHub repo.

touch provisioning/requirements.yml

And add the following line to the requirements.yml file …

- src: https://github.com/jaytaylor/ansible-kafka

and then run

sudo ansible-galaxy install -r provisioning/requirements.yml --ignore-errors

There appears to be a slight issue with the port check in this role – so let’s just deactivate that check for now.

sudo vi /etc/ansible/roles/ansible-kafka/tasks/kafka-cfg.yml

and then add an “ignore_errors: yes” to the task named “Check kafka port test result” such that it looks like …

- name: "Check kafka port test result"
  ignore_errors: yes
  fail: msg="Kafka port not open on host={{ inventory_hostname }}, port={{ server.port }}"
  when: healthcheck.elapsed >= kafka_port_test_timeout_seconds and kafka_port_test_timeout_seconds > 0
  tags:
    - kafka-cfg
    - kafka-healthcheck

And now vagrant up not only works, but when you ssh onto the box Kafka is running …

ps -ef | grep kafka

… should show this to be the case. If it isn’t running, check /usr/local/kafka/logs/* for any errors – memory allocation on the VM, for example.
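
Another quick check from inside the VM is that the broker is actually listening on its port – the exact wording varies by netcat flavour …

$ nc -vz localhost 9092
Connection to localhost 9092 port [tcp/*] succeeded!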

Pat on the back time 🙂 … but only a small one cos we’re not done yet.

Ensure topics are created on VM creation

The last thing we want to do is ensure that the topics our ninja micro-services require are already created for us. So let’s add a couple of tasks to create two topics for testing.

Change your playbook.yml to look like this …

---
- hosts: all
  become: true
  roles:
    - role: AnsibleShipyard.ansible-zookeeper
    - {
        role: "ansible-kafka",
        kafka_hosts: ["0.0.0.0"],
        kafka_zookeeper_hosts: ["0.0.0.0"],
        kafka_version: 0.10.1.1,
        kafka_scala_version: 2.11
    }
  tasks: 
    - name: Create 'test-one' topic
      shell: /usr/local/kafka/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test-one

    - name: Create 'test-two' topic
      shell: /usr/local/kafka/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test-two
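
One caveat worth flagging: these shell tasks are not idempotent – run the playbook a second time against the same VM and kafka-topics.sh will bail out because the topics already exist. We destroy and rebuild each time so it doesn’t bite us here, but if you did want to guard against it, a sketch along these lines would do (same paths and topic names as above) …

  tasks:
    - name: List existing topics
      shell: /usr/local/kafka/bin/kafka-topics.sh --list --zookeeper localhost:2181
      register: existing_topics
      changed_when: false   # a read-only query, never report "changed"

    - name: Create 'test-one' topic
      shell: /usr/local/kafka/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test-one
      when: "'test-one' not in existing_topics.stdout_lines"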

then …

vagrant destroy -f; vagrant up

And we now have one decently configured (don’t use in production) Apache Kafka instance! Well done 🙂

Test Our Little Creation

Here are a few things we can do to test our new environment, taken from the Apache Kafka Quickstart …

$ vagrant ssh
$ cd /usr/local/kafka

Let’s see if our two topics were created …

$ bin/kafka-topics.sh --list --zookeeper localhost:2181
test-one
test-two

Let’s send a couple of messages to the test-one topic …

$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test-one
Howdy doodly doo! Talkie's the name, toasting's the game.
Would anyone like any toast?
[CTRL+D]

And finally let’s check that the messages were sent okay to that topic …

$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test-one --from-beginning
Howdy doodly doo! Talkie's the name, toasting's the game.
Would anyone like any toast?

^CProcessed a total of 2 messages

Awesome!

What’s Next?

Next in the series we’ll use the VM we’ve just created to build a reusable image in a Google Cloud project. This will take us a decent step forward on the Infrastructure As Code dream journey!
