Replicating Data Between Two Kafka Clusters

1.75 hours
  • 5 Learning Objectives

About this Hands-on Lab

Kafka can be deployed in multiple data centers in an “Active-Active”, “Active-Passive”, or centralized architecture. In this hands-on lab, we simulate an active-passive architecture in which data from one cluster is replicated to another cluster using a tool called Replicator. Like MirrorMaker, Replicator copies messages from one cluster to another; it also preserves the topic configuration of the source cluster.

Learning Objectives

Successfully complete this lab by achieving the following learning objectives:

Start the Destination Cluster

Start Zookeeper.

bin/zookeeper-server-start etc/kafka/zookeeper.properties

Start Kafka.

bin/kafka-server-start etc/kafka/server.properties

Start the Origin Cluster

Make a copy of the configuration files.

cp etc/kafka/zookeeper.properties /tmp/zookeeper_origin.properties
cp etc/kafka/server.properties /tmp/server_origin.properties

Change the port numbers for the origin cluster.

sed -i -e "s/2181/2171/g" /tmp/zookeeper_origin.properties
sed -i -e "s/9092/9082/g" /tmp/server_origin.properties
sed -i -e "s/2181/2171/g" /tmp/server_origin.properties
sed -i -e "s/#listen/listen/g" /tmp/server_origin.properties

Change the data directory for the origin cluster.

sed -i -e "s/zookeeper/zookeeper_origin/g" /tmp/zookeeper_origin.properties
sed -i -e "s/kafka-logs/kafka-logs-origin/g" /tmp/server_origin.properties
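
The substitutions above are easy to sanity-check before starting the origin cluster. The sketch below applies the same sed expressions to a small stand-in for server.properties (three lines assumed to resemble the defaults shipped with the Confluent archive; the real file contains many more settings) and prints the resulting listener, log directory, and Zookeeper address. It works in a temporary directory, so it does not touch the lab files.

```shell
#!/bin/sh
# Sketch: apply the lab's substitutions to a stand-in server.properties.
# The three lines below are an assumed excerpt, not the full shipped file.
workdir=$(mktemp -d)
cat > "$workdir/server.properties" <<'EOF'
#listeners=PLAINTEXT://:9092
log.dirs=/tmp/kafka-logs
zookeeper.connect=localhost:2181
EOF

# Same expressions as the lab, written to a new file instead of edited in place.
sed -e "s/9092/9082/g" \
    -e "s/2181/2171/g" \
    -e "s/#listen/listen/g" \
    -e "s/kafka-logs/kafka-logs-origin/g" \
    "$workdir/server.properties" > "$workdir/server_origin.properties"

# Pull out the three settings the origin cluster depends on.
listeners=$(grep '^listeners=' "$workdir/server_origin.properties")
log_dirs=$(grep '^log.dirs=' "$workdir/server_origin.properties")
zk_connect=$(grep '^zookeeper.connect=' "$workdir/server_origin.properties")

printf '%s\n' "$listeners" "$log_dirs" "$zk_connect"
rm -rf "$workdir"
```

Against the real file, an equivalent check is grep -E '^(listeners|log.dirs|zookeeper.connect)=' /tmp/server_origin.properties, which should show port 9082, the kafka-logs-origin directory, and Zookeeper port 2171.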

Start Zookeeper.

bin/zookeeper-server-start /tmp/zookeeper_origin.properties

Start Kafka.

bin/kafka-server-start /tmp/server_origin.properties

Create a Topic in the Source Cluster

Create a topic named test-topic with 1 partition and a replication factor of 1.

bin/kafka-topics --create --topic test-topic --replication-factor 1 --partitions 1 --zookeeper localhost:2171

Run the Replicator Tool

Run Kafka Connect in Standalone Mode and pass in the quickstart-replicator.properties file.

bin/connect-standalone etc/kafka/connect-standalone.properties etc/kafka-connect-replicator/quickstart-replicator.properties
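
The destination topic name test-topic.replica comes from the Replicator configuration rather than from Kafka itself. The sketch below assumes the quickstart file contains the four settings shown. The property names are real Replicator settings, but the values are an assumption based on the ports and topic names used in this lab, so check your own copy. The get_prop helper is hypothetical and exists only for this illustration.

```shell
#!/bin/sh
# Sketch: derive the destination topic name from Replicator settings.
# The file content below is an assumed excerpt, not the shipped file verbatim.
workdir=$(mktemp -d)
props="$workdir/quickstart-replicator.properties"
cat > "$props" <<'EOF'
src.kafka.bootstrap.servers=localhost:9082
dest.kafka.bootstrap.servers=localhost:9092
topic.whitelist=test-topic
topic.rename.format=${topic}.replica
EOF

# Hypothetical helper: print the value of key $1 from properties file $2.
get_prop() {
  sed -n "s/^$1=//p" "$2"
}

src=$(get_prop src.kafka.bootstrap.servers "$props")
dest=$(get_prop dest.kafka.bootstrap.servers "$props")
whitelist=$(get_prop topic.whitelist "$props")
rename_format=$(get_prop topic.rename.format "$props")

# Replace the literal ${topic} placeholder with the whitelisted topic name.
dest_topic=$(printf '%s\n' "$rename_format" | sed 's/[$]{topic}/'"$whitelist"'/')

echo "replicating $whitelist from $src to $dest as $dest_topic"
rm -rf "$workdir"
```

If your copy of the file uses a different rename format or whitelist, adjust the topic names in the verification commands accordingly.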

Produce and Verify Replication

Verify that the topic was replicated to the destination cluster.

bin/kafka-topics --describe --topic test-topic.replica --zookeeper localhost:2181

Open a Console Producer and write to the topic test-topic in the source cluster.

seq 10000 | bin/kafka-console-producer --topic test-topic --broker-list localhost:9082

Confirm the messages were replicated by opening a console consumer to the destination cluster.

bin/kafka-console-consumer --from-beginning --topic test-topic.replica --bootstrap-server localhost:9092
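
Scrolling through 10,000 lines of consumer output is not a reliable check. A minimal sketch of a stricter comparison follows: it writes the produced sequence to a file and compares it with a file of consumed messages. Here the consumed file is a stand-in generated with seq so the script runs without a cluster; in the lab you would capture the consumer's output instead, as the comment shows. The file names are illustrative.

```shell
#!/bin/sh
# Sketch: compare what was produced with what was consumed.
workdir=$(mktemp -d)

# The same sequence that the lab pipes into the console producer.
seq 10000 > "$workdir/produced.txt"

# Stand-in for the consumer output. In the lab, something like this would
# capture it (--timeout-ms makes the consumer exit once the topic is idle):
#   bin/kafka-console-consumer --from-beginning --topic test-topic.replica \
#     --bootstrap-server localhost:9092 --timeout-ms 10000 > consumed.txt
seq 10000 > "$workdir/consumed.txt"

produced_count=$(wc -l < "$workdir/produced.txt" | tr -d ' ')
consumed_count=$(wc -l < "$workdir/consumed.txt" | tr -d ' ')

# With a single partition, message order is preserved, so a byte-for-byte
# comparison is a reasonable check.
if cmp -s "$workdir/produced.txt" "$workdir/consumed.txt"; then
  result=match
else
  result=mismatch
fi

echo "produced=$produced_count consumed=$consumed_count result=$result"
rm -rf "$workdir"
```

A mismatch in counts usually means the consumer was stopped before Replicator caught up; rerunning the consumer after a short wait is a sensible first step.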

Additional Resources

In this hands-on lab, we create two local Kafka clusters, each containing a single Zookeeper instance and a single Kafka broker. We differentiate between the two by specifying different port numbers and data directories for each cluster. Once both clusters are up and running, we create a topic on the source cluster and replicate it to the destination cluster. We then produce messages to the source cluster and confirm that they are mirrored to the destination cluster.

Note: You will open several SSH sessions in this lab. We recommend using your own SSH client rather than the web browser.

Use the Confluent Kafka binaries for each task, located here: https://packages.confluent.io/archive/5.2/confluent-5.2.1-2.12.tar.gz

Use the following requirements to set up your cluster and create your topic:

  • Your destination cluster must be reachable on Kafka port 9092 and Zookeeper port 2181.
  • Your source cluster must be reachable on Kafka port 9082 and Zookeeper port 2171.
  • Your Zookeeper configuration file must be named zookeeper_origin.properties and specify the data directory /tmp/zookeeper_origin.
  • Your Kafka configuration file must be named server_origin.properties and specify the data directory /tmp/kafka-logs-origin.
  • Create a topic named test-topic on the source cluster with 1 partition and a replication factor of 1.
  • Run the replicator tool with Kafka Connect (Standalone Mode) using the quickstart-replicator.properties configuration file.
  • Verify the topic was replicated to the destination cluster.
  • Open a console producer and produce some messages to the topic test-topic.
  • Verify the messages were replicated to the destination cluster by running a console consumer.
