Summary

What can you do with a Raspberry Pi? What if I told you that your budget computer could be a high-performance, fault-tolerant stream processing and messaging system that you can use while learning to develop distributed applications? Good news: it’s true! The Raspberry Pi 4 is available with 4GB of RAM, which greatly widens the range of projects it can handle. Historically, message queues and topics have been notorious for hogging memory on their host system, so the Raspberry Pi 4 is begging to be used as an Apache Kafka host. Let’s get started!

What you need

Install and setup Kafka on Raspberry Pi

Many of the steps in this guide follow the Kafka Quickstart Guide. However, we will set up our RPi4 with multiple broker instances (for replication) and change the default configuration so that our Kafka logs persist between restarts. With the default configuration, Kafka keeps its data under /tmp, so all of it is erased when your device restarts.

Download Apache Kafka from a trusted source; this link will take you directly to the Apache Kafka site.

Untar/unzip the downloaded tar file. You can use the terminal, or extract the files from the desktop if you’re connected over remote desktop with VNC.

tar -xzf kafka_2.12-2.3.0.tgz
cd kafka_2.12-2.3.0
Extract Kafka to local directory

Start Zookeeper and Kafka

Start Zookeeper using the Zookeeper configuration that was provided with the Kafka files. This approach is not recommended for production environments, but it will suffice for our development needs.

cd <location of kafka installation>
bin/zookeeper-server-start.sh config/zookeeper.properties

By default, Zookeeper stores its data in /tmp/zookeeper and listens on port 2181.

Start the Kafka instances. We’re going to create two brokers so that we can replicate the data in our Kafka logs. To start these brokers we need two separate server.properties files.

Create a copy of the existing config/server.properties file in the same directory and name it “server-2.properties”. Then open both server properties files and make a few changes.

  • broker.id should be updated in server.properties from 0 to 1 and updated in server-2.properties from 0 to 2
  • listeners should be uncommented and updated to PLAINTEXT://:9092 and PLAINTEXT://:9093 respectively
  • log.dirs needs to be updated for each broker, to /tmp/kafka/kafka-logs-1 and /tmp/kafka/kafka-logs-2 respectively
server-2.properties example
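If you would rather script these edits than make them by hand, here is a minimal sketch. The helper names set_property and configure_broker are my own (they are not part of Kafka), and the sketch assumes GNU sed, which is what Raspberry Pi OS ships.

```shell
# set_property FILE KEY VALUE
# Hypothetical helper: replaces an existing "KEY=..." line (commented out or
# not) with "KEY=VALUE", or appends the line when the key is missing.
# Assumes VALUE contains no "|" or "&", which holds for the values used here.
set_property() {
    file=$1
    key=$2
    value=$3
    if grep -q -E "^#?${key}=" "$file"; then
        sed -i -E "s|^#?${key}=.*|${key}=${value}|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# configure_broker FILE BROKER_ID PORT LOG_DIR
# Applies the three changes from the list above to one properties file.
configure_broker() {
    set_property "$1" broker.id "$2"
    set_property "$1" listeners "PLAINTEXT://:$3"
    set_property "$1" log.dirs "$4"
}

# Example, run from the Kafka installation directory:
#   cp config/server.properties config/server-2.properties
#   configure_broker config/server.properties   1 9092 /tmp/kafka/kafka-logs-1
#   configure_broker config/server-2.properties 2 9093 /tmp/kafka/kafka-logs-2
```

Editing the files in a text editor gives the same result; the script is only handy if you rebuild your Pi often.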

Now, start both servers/brokers. I prefer to use multiple tabs in the terminal.

bin/kafka-server-start.sh config/server.properties
bin/kafka-server-start.sh config/server-2.properties

At this point there are a total of three terminal tabs/instances open: Zookeeper, Kafka broker 1, and Kafka broker 2.

Zookeeper, Broker1, Broker2

Kafka topic, producer and consumer

Create a new replicated topic by opening another terminal tab and running the following command. This creates a new topic with a replication factor of 2, so each of the two brokers holds a copy of the data.

bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 2 --partitions 1 --topic demo-topic

Produce some messages into your newly created topic. Do this by running the provided Kafka producer script that targets the new ‘demo-topic’.

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic demo-topic
...
> Some message
> Another message

Consume messages and print them to the terminal by running the provided Kafka consumer script targeting the ‘demo-topic’.

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --from-beginning --topic demo-topic
...
Some message
Another message
^C
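The topic, producer and consumer steps above can also be run without typing into an interactive prompt. The sketch below wraps the same scripts in three small functions. The function names and the KAFKA_HOME variable are my own assumptions; the console producer reads its messages from standard input, and --max-messages tells the console consumer to exit after that many messages instead of waiting for ^C.

```shell
# Assumption: KAFKA_HOME points at the extracted Kafka directory.
# Adjust the default below to wherever you unpacked the archive.
: "${KAFKA_HOME:=$HOME/kafka_2.12-2.3.0}"

# describe_topic TOPIC: show the leader, replicas and in-sync replicas.
describe_topic() {
    "$KAFKA_HOME/bin/kafka-topics.sh" --describe \
        --bootstrap-server localhost:9092 --topic "$1"
}

# produce_lines TOPIC: send every line read from stdin as one message.
produce_lines() {
    "$KAFKA_HOME/bin/kafka-console-producer.sh" \
        --broker-list localhost:9092 --topic "$1"
}

# consume_n TOPIC COUNT: print COUNT messages from the start of the topic, then exit.
consume_n() {
    "$KAFKA_HOME/bin/kafka-console-consumer.sh" \
        --bootstrap-server localhost:9092 --from-beginning \
        --topic "$1" --max-messages "$2"
}

# Example:
#   describe_topic demo-topic
#   printf 'Some message\nAnother message\n' | produce_lines demo-topic
#   consume_n demo-topic 2
```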

WOO! You just set up Kafka on your Raspberry Pi 4, but what now? As mentioned earlier, the default zookeeper.properties, server.properties, and server-2.properties place all of their data inside the /tmp directory. This means that when your RPi4 is restarted, all of the data is lost! Let’s fix that in the next section.

Configure data persistence on your Raspberry Pi

Stop the Kafka brokers and Zookeeper by pressing ^C in each terminal tab.

Delete the old Zookeeper and Kafka data in /tmp (/tmp/zookeeper and /tmp/kafka), or restart your Raspberry Pi.

Create new directories for Zookeeper and Kafka in /var.

cd /var
sudo mkdir zookeeper
sudo chmod 777 -R zookeeper
...
sudo mkdir kafka
sudo chmod 777 -R kafka

Generally, it’s not a good idea to set 777 permissions. However, directories created with sudo in /var are owned by root, and we start Zookeeper and Kafka from our user terminal, so our user needs write access to them; 777 is the bluntest way to grant it.
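If world-writable directories make you uneasy, one narrower option is to hand the directories to your own user instead. This is a sketch under that assumption, not a required step. The function name make_data_dir and the SUDO variable are hypothetical; SUDO defaults to "sudo" and can be set to an empty string where root is not needed.

```shell
# make_data_dir DIR
# Creates DIR, makes the current user its owner, and leaves it writable only
# by that user (read-only for everyone else).
make_data_dir() {
    dir=$1
    run=${SUDO-sudo}   # unset SUDO means "sudo"; SUDO="" runs without it
    $run mkdir -p "$dir" &&
    $run chown -R "$(id -un):$(id -gn)" "$dir" &&
    $run chmod -R u=rwX,go=rX "$dir"
}

# Example:
#   make_data_dir /var/zookeeper
#   make_data_dir /var/kafka
```

This only works if you always start Zookeeper and Kafka as the same user that owns the directories.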

Re-configure the property files so that they use the new directories set up in /var.

  • server.properties
    • log.dirs=/var/kafka/kafka-logs-1
  • server-2.properties
    • log.dirs=/var/kafka/kafka-logs-2
  • zookeeper.properties
    • dataDir=/var/zookeeper
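Before restarting anything, it can be worth confirming that none of the three files still points at /tmp. Here is a small sketch of such a check; check_data_dirs is a hypothetical helper name, and it only looks at the log.dirs and dataDir keys listed above.

```shell
# check_data_dirs FILE...
# Prints the data directory configured in each file and returns non-zero if
# any file has no active setting or still points into /tmp.
check_data_dirs() {
    rc=0
    for f in "$@"; do
        # Take the last uncommented log.dirs= or dataDir= line in the file.
        dir=$(sed -n -E 's/^(log\.dirs|dataDir)=//p' "$f" | tail -n 1)
        case $dir in
            /tmp|/tmp/*|'') echo "WARN $f: '$dir'"; rc=1 ;;
            *)              echo "ok   $f: $dir" ;;
        esac
    done
    return $rc
}

# Example, run from the Kafka installation directory:
#   check_data_dirs config/server.properties config/server-2.properties config/zookeeper.properties
```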

Now that you’ve reconfigured the property files, it’s time to repeat the steps in the first part of this guide:

  • Start Zookeeper
  • Start both Kafka Brokers
  • Create replicated topic
  • Produce some messages
  • Consume some messages

Finally, let’s see if this was worth it. Stop Zookeeper and Kafka, then restart your Raspberry Pi. When you log back in to your Raspberry Pi and complete the following steps, you will see the messages you produced before restarting:

  • Start Zookeeper
  • Start both Kafka Brokers
  • Consume messages
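You can also check on disk that the data survived the reboot, even before starting the brokers. Kafka keeps each partition in a directory named after the topic and partition number (for example demo-topic-0) inside log.dirs. The sketch below lists those directories; list_partition_dirs is a hypothetical helper name.

```shell
# list_partition_dirs LOG_DIR TOPIC
# Lists the partition directories for TOPIC found directly inside LOG_DIR.
# Note: the pattern would also match a topic whose name merely starts with
# "TOPIC-" followed by a digit.
list_partition_dirs() {
    find "$1" -maxdepth 1 -type d -name "$2-[0-9]*" | sort
}

# Example:
#   list_partition_dirs /var/kafka/kafka-logs-1 demo-topic
#   list_partition_dirs /var/kafka/kafka-logs-2 demo-topic
```

Since the topic has a replication factor of 2, both log directories should contain a directory for the single partition.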

Now that your Raspberry Pi is running Kafka, you can integrate it into your code projects by using the Apache Kafka Streams API.