KSQL provides a SQL-like interface to Kafka’s stream processing features, so you can build data processing pipelines without writing your own Kafka Streams applications. In this lab, we will solve a simple data processing use case with KSQL: we will create a stream from an existing topic, then use a persistent streaming query to write a filtered version of the data to an output topic.
Learning Objectives
Successfully complete this lab by achieving the following learning objectives:
- Create a Stream to Pull Data in from the Topic
Start a KSQL session:
sudo ksql
Set auto.offset.reset to earliest:
SET 'auto.offset.reset' = 'earliest';
Look at the data in the member_signups topic (press Ctrl+C to stop the output):
PRINT 'member_signups' FROM BEGINNING;
Create a stream from the topic:
CREATE STREAM member_signups (firstname VARCHAR, lastname VARCHAR, email_notifications BOOLEAN) WITH (KAFKA_TOPIC='member_signups', VALUE_FORMAT='DELIMITED');
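Before building on the new stream, it can help to confirm that it was registered with the expected columns. The statements below are an optional check rather than a required lab step, and the exact output layout depends on the installed KSQL version:

```sql
-- List all registered streams; MEMBER_SIGNUPS should appear in the list.
SHOW STREAMS;

-- Show the column names and types of the new stream.
DESCRIBE member_signups;
```

If the DESCRIBE output does not list firstname, lastname, and email_notifications with the types declared above, the CREATE STREAM statement probably needs to be corrected before continuing.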
- Create a Persistent Streaming Query to Write Data to the Output Topic in Real Time
Create the persistent streaming query:
CREATE STREAM member_signups_email AS SELECT * FROM member_signups WHERE email_notifications=true;
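View the data in the output topic to verify that everything is working. The topic name is uppercase because KSQL stores unquoted stream names in uppercase and, by default, names the output topic after the stream: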
PRINT 'MEMBER_SIGNUPS_EMAIL' FROM BEGINNING;
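If the output topic stays empty, one way to investigate is to check that the persistent query is actually running. This is a minimal sketch of that optional check; the output layout varies between KSQL versions:

```sql
-- List the persistent queries that are currently running;
-- the query created by CREATE STREAM ... AS SELECT should be listed.
SHOW QUERIES;

-- Show details of the output stream, including its backing Kafka topic
-- and the queries that write into it.
DESCRIBE EXTENDED member_signups_email;
```

Only records with email_notifications set to true are written to the output topic, so an empty result can also mean that no matching records have arrived yet.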