Back up Messages to an S3 Bucket in Kafka

1.75 hours
  • 4 Learning Objectives

About this Hands-on Lab

Kafka is known for replicating data among brokers to account for broker failure. Sometimes this isn't enough, or perhaps you need to share this data with a third-party application. Backing up your messages to an alternate store can save a lot of time and energy when configuring access to a cluster or delivering that data quickly across the world. In this hands-on lab, we will send topic messages to Amazon S3 using Kafka Connect. By the end of the lab, you will know how to use Kafka commands for sending data to an outside data repository. (Note: No previous AWS knowledge is required.)

Learning Objectives

Successfully complete this lab by achieving the following learning objectives:

Connect and Start the Kafka Cluster

Note: If you have trouble connecting via SSH, please give the lab an extra few minutes to finish setting up.

  1. On the bastion host, start a container and open a shell to that container (a sketch of these commands follows this list).
  2. Change to the /tmp directory.
  3. Start up the Kafka cluster.
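A minimal sketch of these steps, assuming the container runs under Docker on the bastion host (the container name below is a placeholder; check the lab page for the actual name):

    # Find the running container's name (placeholder used below)
    sudo docker ps
    # Open a shell inside the container
    sudo docker exec -it <container-name> /bin/bash
    # Inside the container, move to /tmp and start the cluster
    cd /tmp
    confluent start
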
Create a New S3 Bucket
  1. Update the system.
  2. Install the awscli tool.
  4. Configure access to AWS by creating a key (the cloud_user access key and secret access key are listed on the hands-on lab page).
  4. Create a new bucket in the us-east-1 region (making sure the name is globally unique, without any uppercase letters or underscores).
  5. Install Vim.
  6. Open the properties file.
  7. Change the region to us-east-1.
  8. Add the new bucket name. (Example commands for these steps follow this list.)
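On an Amazon Linux host, these steps might look like the following; the properties file path assumes the Confluent S3 sink quickstart config, so adjust it to match your environment:

    # Update the system and install the AWS CLI and Vim
    sudo yum update -y
    sudo yum install -y awscli vim
    # Configure AWS access with the cloud_user keys from the lab page
    aws configure
    # Create a globally unique bucket in us-east-1 (no uppercase letters or underscores)
    aws s3api create-bucket --bucket <your-unique-bucket-name> --region us-east-1
    # Edit the connector config: set s3.region=us-east-1 and s3.bucket.name=<your-unique-bucket-name>
    vim /etc/kafka-connect-s3/quickstart-s3.properties
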
Start a Producer to a New Topic Named `s3_topic` and Write at Least Nine Messages
  1. Open an Avro console producer to the topic, and include a schema (an example command follows this list).

  2. Type the nine messages following the defined schema:

    {"f1": "value1"}
    {"f1": "value2"}
    {"f1": "value3"}
    {"f1": "value4"}
    {"f1": "value5"}
    {"f1": "value6"}
    {"f1": "value7"}
    {"f1": "value8"}
    {"f1": "value9"}
  3. Press Ctrl + C to exit the session.
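A sketch of the producer command, assuming the broker and Schema Registry are listening on their default local ports (9092 and 8081):

    kafka-avro-console-producer --broker-list localhost:9092 --topic s3_topic \
      --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"f1","type":"string"}]}'

The schema above defines a single string field, f1, which matches the nine messages shown in step 2.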

Start the Connector and Verify the Messages Are in the S3 Bucket
  1. Start the connector and load the configuration (example commands follow this list).
    • We’ll then see some JSON output, including our bucket name.
  2. Copy the bucket name.
  3. List the bucket's objects, using the bucket name you copied.
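With the Confluent CLI bundled in this lab, loading the predefined S3 sink connector might look like this (the connector name s3-sink is an assumption based on the Confluent quickstart):

    # Load the S3 sink connector; it reads the properties file edited earlier
    confluent load s3-sink
    # Verify the records landed in the bucket (substitute your bucket name)
    aws s3 ls s3://<your-unique-bucket-name> --recursive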

Additional Resources

In this hands-on lab, we will send topic messages to Amazon S3 using Kafka Connect. First, we'll create an S3 bucket using the provided commands. Then, we'll produce to a new topic and use the Connect plugin to verify the data was copied to the newly created S3 bucket.

By the end of the lab, you will know how to use Kafka commands for sending data to an outside data repository.

We will meet the following requirements:

  • Create the S3 bucket in the us-east-1 region and give it a unique name (without any uppercase letters or underscores).
  • Use the AWS credentials given with this hands-on lab.
  • Start the Kafka cluster using the confluent start command.
  • Create a topic named s3_topic and include a simple schema in Kafka.
  • Import at least nine records into the Kafka cluster.
  • Verify the records exist in the S3 bucket by listing the contents.

Note: If you have trouble connecting via SSH, please give the lab an extra few minutes to finish setting up.
