Kafka is known for replicating data among brokers to account for broker failure. Sometimes this isn't enough, or perhaps you are sharing this data with a third-party application. Backing up your messages to an alternate store can save a lot of time and energy when configuring access to a cluster or delivering that data quickly across the world. In this hands-on lab, we will send topic messages to Amazon S3 using Kafka Connect. By the end of the lab, you will know how to use Kafka commands to send data to an outside data repository. (**Note:** No previous AWS knowledge is required.)
Learning Objectives
Successfully complete this lab by achieving the following learning objectives:
- Connect and Start the Kafka Cluster
  - Note: If you have trouble connecting via SSH, please give the lab an extra few minutes to finish setting up.
  - In the bastion host, start a container and open a shell to that container.
  - Change to the `/tmp` directory.
  - Start up the Kafka cluster.
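The steps above can be sketched as follows. The image name and startup script are placeholders, not values from the lab page; substitute the ones your environment actually provides.

```shell
# Start a container from the lab image and open an interactive shell in it.
# "<lab-image>" is a placeholder -- use the image name given on the lab page.
docker run -it --name kafka <lab-image> /bin/bash

# Inside the container, move to /tmp and bring up the cluster.
# The script name below is an assumption; list /tmp to find the actual file.
cd /tmp
./start-kafka.sh
```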
- Create a New S3 Bucket
  - Update the system.
  - Install the `awscli` tool.
  - Configure access to AWS by creating a key (note that our `cloud_user` access and secret access keys are on the hands-on lab page).
  - Create a new bucket in the `us-east-1` region (making sure the name is globally unique, without any uppercase letters or underscores).
  - Install Vim.
  - Open the `properties` file.
  - Change the `region` to `us-east-1`.
  - Add the new bucket name.
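A minimal sketch of this objective, assuming a Debian/Ubuntu host; the bucket name and the properties file path are illustrative placeholders, and the `s3.region`/`s3.bucket.name` keys follow the Confluent S3 sink connector's quickstart layout.

```shell
# Update the system, then install the AWS CLI and Vim.
sudo apt-get update && sudo apt-get install -y awscli vim

# Configure credentials interactively: paste the cloud_user access key and
# secret access key from the lab page, and set the default region to us-east-1.
aws configure

# Create the bucket. The name below is an example -- yours must be globally
# unique, all lowercase, with no underscores.
aws s3api create-bucket --bucket my-kafka-s3-lab-12345 --region us-east-1

# Edit the connector properties file (path is an assumption; use the one
# referenced in your lab) so it points at the new bucket:
#   s3.region=us-east-1
#   s3.bucket.name=my-kafka-s3-lab-12345
vim /tmp/quickstart-s3.properties
```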
- Start a Producer to a New Topic Named `s3_topic` and Write at Least Nine Messages
  - Open an Avro console producer to the topic, and include a schema.
  - Type the nine messages following the defined schema:

        {"f1": "value1"}
        {"f1": "value2"}
        {"f1": "value3"}
        {"f1": "value4"}
        {"f1": "value5"}
        {"f1": "value6"}
        {"f1": "value7"}
        {"f1": "value8"}
        {"f1": "value9"}

  - Press Ctrl + C to exit the session.
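One way to open the producer is shown below. The broker address is an assumption (a local broker on the default port), and the inline schema is one possible record definition matching the `{"f1": ...}` messages.

```shell
# Avro console producer with an inline value schema; type one message per line,
# then press Ctrl + C to exit.
kafka-avro-console-producer --broker-list localhost:9092 --topic s3_topic \
  --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"f1","type":"string"}]}'
```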
- Start the Connector and Verify the Messages Are in the S3 Bucket
  - Start the connector and load the configuration.
  - We'll then see some JSON output, including our bucket name.
  - Copy the bucket name.
  - List the bucket's objects, using the copied bucket name in the command.
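A sketch of the final objective. The connector name, properties path, and bucket name are assumptions; the `confluent load` form is the classic CLI syntax, which varies across Confluent versions.

```shell
# Load the S3 sink connector configuration; the JSON printed in response
# includes the bucket name.
confluent load s3-sink -d /etc/kafka-connect-s3/quickstart-s3.properties

# List the bucket's objects, substituting the bucket name you copied
# from the JSON output.
aws s3api list-objects --bucket my-kafka-s3-lab-12345
```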