Kafka Streams is a library enabling you to perform per-event processing of records. You can use it to process data as soon as it arrives, versus having to wait for a batch to occur. In this hands-on lab, we use Kafka Streams to stream some input data as plain-text and process it in real-time. With this, we can count the number of words from our input stream.
Learning Objectives
Successfully complete this lab by achieving the following learning objectives:
- Create the Input and Output Topic
Create the input topic named
streams-plaintext-input
.bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic streams-plaintext-input
Create the output topic named
streams-wordcount-output
.bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic streams-wordcount-output
- Open a Kafka Console Producer
Open a Kafka console producer.
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic streams-plaintext-input
Type the three messages from the instructions.
>kafka streams is great >kafka processes messages in real time >kafka helps real information streams
- Open a Kafka Console Consumer
Open a Kafka console consumer using the default message formatter and the three properties given in the instructions.
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic streams-wordcount-output --from-beginning --formatter kafka.tools.DefaultMessageFormatter --property print.key=true --property print.value=true --property key.deserializer=org.apache.kafka.common.serialization.StringDeserializer --property value.deserializer=org.apache.kafka.common.serialization.LongDeserializer
- Run the Kafka Streams Application
Use the
kafka-run-class.sh
command to run theWordCountDemo
application.bin/kafka-run-class.sh org.apache.kafka.streams.examples.wordcount.WordCountDemo