I hope my previous post, What is Kafka?, gave you a clear picture of Kafka. In this post, we will go through how to install Apache Kafka on Windows and how to create a topic, a producer, and a consumer in Kafka.

Step 1: Download the JDK (skip this step if the JDK is already installed on your PC)

You can download the JDK at https://www.oracle.com/java/technologies/javase-downloads.html. To check that the JDK was installed successfully, type java -version in a terminal.
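If you prefer to script this check, here is a minimal Python sketch. The function names parse_java_major and installed_java_major are made up for this illustration. It assumes the usual banner format of java -version (for example version "1.8.0_281" or version "11.0.2"), which java prints to stderr.

```python
import re
import shutil
import subprocess


def parse_java_major(version_output: str):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy numbering: "1.8.0_281" means Java 8.
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def installed_java_major():
    """Run `java -version` if java is on the Path; return its major version or None."""
    if shutil.which("java") is None:
        return None
    # `java -version` prints its banner to stderr, so both streams are inspected.
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    print("Java major version:", installed_java_major())
```

A result of None means java was not found on the Path or its banner was not recognised.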

Step 2: Download and install the Apache Kafka binary

Download Apache Kafka at https://kafka.apache.org/downloads and remember to choose the binary version, not the source version.

Extract Apache Kafka to the C:/kafka folder or any folder you like, then configure the Kafka log folder and the ZooKeeper folder.

Create a data folder in C:/kafka, then inside it create a kafka folder to hold the Kafka server logs and a zookeeper folder to hold the ZooKeeper data.

Open config/server.properties and change log.dirs to the previously created kafka folder.

log.dirs=C:/kafka/data/kafka

Open config/zookeeper.properties and change dataDir to the previously created zookeeper folder.

dataDir=C:/kafka/data/zookeeper
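The folder creation and the two property edits can also be scripted. The following Python sketch is only an illustration: set_property and configure_kafka are hypothetical helper names, and it assumes the layout described in this step (a config folder with server.properties and zookeeper.properties under the Kafka folder).

```python
from pathlib import Path


def set_property(properties_file: Path, key: str, value: str) -> None:
    """Set key=value in a .properties file, replacing an active entry or appending one."""
    lines = properties_file.read_text().splitlines()
    replaced = False
    for i, line in enumerate(lines):
        # Only an active "key=..." line matches; commented lines start with "#".
        if line.split("=", 1)[0].strip() == key:
            lines[i] = f"{key}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{key}={value}")
    properties_file.write_text("\n".join(lines) + "\n")


def configure_kafka(kafka_home: Path) -> None:
    """Create data/kafka and data/zookeeper, then point both config files at them."""
    kafka_data = kafka_home / "data" / "kafka"
    zookeeper_data = kafka_home / "data" / "zookeeper"
    kafka_data.mkdir(parents=True, exist_ok=True)
    zookeeper_data.mkdir(parents=True, exist_ok=True)
    # as_posix() writes forward slashes, like the log.dirs value in this step;
    # a backslash is an escape character in .properties files.
    set_property(kafka_home / "config" / "server.properties",
                 "log.dirs", kafka_data.as_posix())
    set_property(kafka_home / "config" / "zookeeper.properties",
                 "dataDir", zookeeper_data.as_posix())
```

With Kafka extracted to C:/kafka, calling configure_kafka(Path("C:/kafka")) should leave log.dirs=C:/kafka/data/kafka in server.properties.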

Step 3: Add the Kafka bin folder to your Path environment variable

Add C:\kafka\bin\windows to your Path environment variable.

Step 4: Start ZooKeeper and Apache Kafka

ZooKeeper keeps track of the status of the Kafka cluster nodes, as well as Kafka topics, partitions, and so on. ZooKeeper itself allows multiple clients to perform simultaneous reads and writes and acts as a shared configuration service within the system. The ZooKeeper Atomic Broadcast (ZAB) protocol is the brains of the whole system: it makes it possible for ZooKeeper to act as an atomic broadcast system and issue updates in order.

Start ZooKeeper

Start ZooKeeper by changing your directory to C:\kafka\config and executing the following commands:

cd C:\kafka\config
zookeeper-server-start zookeeper.properties

Make sure ZooKeeper started successfully (highlighted in yellow in the following figure). ZooKeeper listens on port 2181.

Start Apache Kafka

Open a new command prompt, navigate to the config folder, and start Apache Kafka with the following command:

kafka-server-start server.properties

Once it has started successfully, Kafka runs a broker that listens on port 9092.
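A quick way to confirm that both services are up is to try a TCP connection to each port. The Python sketch below assumes the default ports used in this post (2181 for ZooKeeper, 9092 for Kafka). The helper name is_port_open is made up for this example. It only shows that something is listening, not that the service is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Ports from the steps above: ZooKeeper on 2181, the Kafka broker on 9092.
    for name, port in (("ZooKeeper", 2181), ("Kafka", 9092)):
        state = "listening" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```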

Step 5: Create Kafka topic

Now we will create a topic named test with a replication factor of 1, because we have only one Kafka server running.

What is the Kafka replication factor?

The replication factor is the number of copies of the data kept across multiple brokers. In a multi-broker cluster it should generally be greater than 1 (typically 2 or 3), so that a replica of the data is stored on another broker and remains accessible if one broker fails. It cannot exceed the number of brokers, which is why we use 1 here.

For example, suppose we have a cluster containing three brokers: Broker 1, Broker 2, and Broker 3. A topic named Topic-X is split into Partition 0 and Partition 1 with a replication factor of 2, as shown in the following figure.

Replication factor

If you have a cluster with more than one Kafka server running, you can increase the replication factor accordingly, which increases data availability and makes the system fault-tolerant.
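To make the Topic-X example concrete, here is a small Python simulation that spreads replicas over brokers. It is a simplified round-robin placement written for this post, not Kafka's actual assignment algorithm, so the layout may differ from the figure. The function name assign_replicas is hypothetical.

```python
def assign_replicas(brokers, num_partitions, replication_factor):
    """Place each partition's replicas on distinct brokers, round-robin."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    assignment = {}
    for partition in range(num_partitions):
        # Start at a different broker per partition, then take the next ones in order.
        assignment[partition] = [
            brokers[(partition + offset) % len(brokers)]
            for offset in range(replication_factor)
        ]
    return assignment


if __name__ == "__main__":
    layout = assign_replicas(["Broker 1", "Broker 2", "Broker 3"],
                             num_partitions=2, replication_factor=2)
    for partition, replicas in layout.items():
        print(f"Topic-X partition {partition}: {replicas}")
```

With one broker and a replication factor of 2 the function raises an error, which mirrors why a single-server setup has to use a replication factor of 1.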

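Open a new command prompt and type the following command to create the topic test with only 1 partition. (The --zookeeper option applies to older Kafka releases. It was removed from kafka-topics in Kafka 3.0, where you pass --bootstrap-server localhost:9092 instead.)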

kafka-topics.bat --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
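If you need to create several topics, the command line can be generated from a script. The Python sketch below only builds the argument list from the flags shown above. The function name create_topic_command is made up for this example, and the list could be handed to subprocess.run once the Kafka bin folder is on your Path.

```python
def create_topic_command(topic, partitions=1, replication_factor=1,
                         zookeeper="localhost:2181"):
    """Build the kafka-topics.bat argument list used in this step."""
    return [
        "kafka-topics.bat", "--create",
        "--zookeeper", zookeeper,
        "--replication-factor", str(replication_factor),
        "--partitions", str(partitions),
        "--topic", topic,
    ]


if __name__ == "__main__":
    # Prints the same command line as the one typed by hand above.
    print(" ".join(create_topic_command("test")))
```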

Step 6: Create Kafka producer

Open a new command prompt and create a Kafka producer that publishes to the test topic with the following command:

kafka-console-producer --broker-list localhost:9092 --topic test

Step 7: Create Kafka consumer

Next, we will create a consumer that subscribes to the test topic. When you create a console consumer without specifying a consumer group, it is assigned an automatically generated group.

Open a new command prompt and type the following command to create a consumer:

kafka-console-consumer --bootstrap-server localhost:9092 --topic test

If you want to assign a group to your consumer, add --group <group_name> to the command:

kafka-console-consumer --bootstrap-server localhost:9092 --topic test --group my-created-consumer-group

Now, when you type a message in the producer terminal and press Enter, the same message appears in the consumer terminal.

Suppose we add 2 consumers to the same group (test-group). If you now send a message from the producer, only one of the consumers receives it. There is no load balancing, because the topic has only one partition.

2 consumers in a consumer group subscribed to a topic with 1 partition

Now we will create a topic with 2 partitions using the following command:

kafka-topics.bat --create --zookeeper localhost:2181 --replication-factor 1 --partitions 2 --topic test2partitions

You can review information about the created topic with this command:

kafka-topics --zookeeper 127.0.0.1:2181 --topic test2partitions --describe
describe topic

Next, we will create 2 consumers that subscribe to the test2partitions topic and belong to the same group. Run the following command in two separate command prompts:

kafka-console-consumer --bootstrap-server localhost:9092 --topic test2partitions --group test-group
load balancing consumer-group
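The difference between the two topics comes down to how partitions are shared out inside a consumer group: each partition is read by exactly one consumer of the group. The Python sketch below imitates that with a plain round-robin split. It is a simplification for illustration, not one of Kafka's real assignors, and the consumer names are hypothetical.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over the consumers of one group, round-robin."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


if __name__ == "__main__":
    group = ["consumer-1", "consumer-2"]  # hypothetical members of test-group
    print("test (1 partition):            ", assign_partitions([0], group))
    print("test2partitions (2 partitions):", assign_partitions([0, 1], group))
```

With one partition the second consumer gets nothing to read, while with two partitions each consumer owns one, which is the load balancing seen in the terminals.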

Some popular Kafka commands

To get messages from the beginning of the subscribed topic, add --from-beginning to the consumer command, like this:

kafka-console-consumer --bootstrap-server localhost:9092 --topic test --from-beginning
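What --from-beginning changes is the offset at which a consumer without a committed offset starts reading. The toy Python model below treats a partition as an append-only list. The PartitionLog class and the start_offset function are invented for this sketch and are not part of any Kafka client.

```python
class PartitionLog:
    """A toy append-only log standing in for a single topic partition."""

    def __init__(self):
        self.messages = []

    def append(self, message):
        """Store a message and return the offset it was written at."""
        self.messages.append(message)
        return len(self.messages) - 1

    def end_offset(self):
        """Offset that the next appended message will receive."""
        return len(self.messages)

    def read_from(self, offset):
        """Return every message at or after the given offset."""
        return self.messages[offset:]


def start_offset(log, from_beginning):
    """Pick where a new consumer with no committed offset starts reading."""
    return 0 if from_beginning else log.end_offset()


if __name__ == "__main__":
    log = PartitionLog()
    log.append("old-1")
    log.append("old-2")
    latest = start_offset(log, from_beginning=False)
    earliest = start_offset(log, from_beginning=True)
    log.append("new-1")  # produced after both consumers started
    print("default:         ", log.read_from(latest))
    print("--from-beginning:", log.read_from(earliest))
```

The default consumer sees only new-1, while the one started from the beginning also replays old-1 and old-2.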

Delete the test topic from all brokers

kafka-run-class kafka.admin.TopicCommand --delete --topic test --zookeeper localhost:2181