Using Kinesis Data Firehose and Kinesis Data Analytics

1 hour
  • 4 Learning Objectives

About this Hands-on Lab

Ingesting data from numerous sources and making timely decisions based on it is becoming a core capability for many businesses. In this lab, we provide hands-on experience using Amazon Kinesis Data Firehose to capture, transform, and load data streams into Amazon S3, and using Amazon Kinesis Data Analytics to perform near-real-time analytics on them.

Lab Prerequisites
  • Understand how to log into and use the AWS Management Console.
  • Understand Amazon Elastic Compute Cloud (EC2) basics.
  • Understand AWS Command Line Interface (CLI) basics.

Learning Objectives

Successfully complete this lab by achieving the following learning objectives:

Create a Kinesis Data Firehose Delivery Stream
  1. Log in to the AWS Management Console with the AWS Account information provided in the lab instructions.
  2. Navigate to the Kinesis service.
  3. Click Get started.
  4. Click Create Delivery Stream to create a Kinesis Data Firehose stream.
  5. Enter "captains-kfh" as the Delivery stream name.
  6. Click Next.
  7. Click Next.
  8. Select Amazon S3 as the destination.
  9. For the S3 Bucket, click Create New.
  10. Enter a globally unique bucket name, starting with "kfh-ml".
  11. Click Create S3 Bucket, then click Next.
  12. Enter "1" MB as the Buffer size and "60" seconds as the Buffer interval.
  13. Click Create new or choose for the IAM role.
  14. Select the IAM Role provided in the lab.
  15. Select the FirehoseDeliveryRole as the Policy Name.
  16. Click Allow.
  17. Click Next.
  18. Click Create delivery stream.
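As an optional aside, the same delivery stream can also be created programmatically. This boto3 sketch mirrors the console steps above; the role and bucket ARNs are placeholders you would replace with the values from your lab account:

    import boto3

    firehose = boto3.client("firehose", region_name="us-east-1")

    # Placeholder ARNs; substitute the IAM role and S3 bucket from your lab account.
    firehose.create_delivery_stream(
        DeliveryStreamName="captains-kfh",
        DeliveryStreamType="DirectPut",
        ExtendedS3DestinationConfiguration={
            "RoleARN": "arn:aws:iam::123456789012:role/FirehoseDeliveryRole",
            "BucketARN": "arn:aws:s3:::kfh-ml-example-bucket",
            "BufferingHints": {"SizeInMBs": 1, "IntervalInSeconds": 60},
        },
    )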
Stream Data to the New Kinesis Data Firehose Delivery Stream
  1. Open an SSH connection to the EC2 instance named Kinesis Test Server using the credentials provided in your lab instructions.
  2. Run the following command.
    python write-to-kinesis-firehose-space-captains.py
  3. Return to the AWS Management Console and navigate to the S3 bucket created earlier.
  4. Refresh the S3 bucket view every 30 seconds or so and wait for records to appear. With the 60-second buffer interval configured above, it may take a minute for the first objects to show up.
  5. Copy the name of the S3 bucket to the clipboard.
  6. Return to the terminal.
  7. Stop the Python script by pressing CTRL+C.
  8. Copy the files from S3 to the server using the AWS CLI (an example command is shown after this list).
  9. Verify the contents of one of the files.
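The lab text references a copy command for step 8 without showing it. Assuming the AWS CLI is configured on the Kinesis Test Server, something along these lines should work (substitute the bucket name you copied in step 5):

    aws s3 cp s3://<your-kfh-ml-bucket> . --recursive

Firehose writes objects under dated prefixes (yyyy/mm/dd/hh), so the copy recreates those directories locally; running cat on one of the downloaded files should show the simulated captain records.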
Create a Kinesis Data Analytics Application
  1. Navigate back to the terminal. If the SSH session was terminated, log back in.
  2. Run the following command.
    python write-to-kinesis-firehose-space-captains.py
  3. Navigate to the Kinesis service in the AWS Management Console.
  4. Click Data Analytics on the left-side menu.
  5. Click Create application.
  6. Enter "popular-space-captains" as the Application name.
  7. Enter "popular-space-captains" as the Description.
  8. Ensure the SQL runtime is selected.
  9. Click Create application.
  10. Click Connect streaming data.
  11. Ensure Choose source is selected at the top.
  12. Select Kinesis Firehose delivery stream as the Source.
  13. Choose the captains-kfh stream created earlier as the Kinesis Firehose delivery stream.
  14. Click Choose from IAM roles that Kinesis Analytics can assume.
  15. Choose the IAM role created for this lab.
  16. Click Discover schema.
  17. Click Save and continue.
  18. Click Go to SQL editor.
  19. Click Yes, start application.
  20. Open the “Using Kinesis Data Firehose and Kinesis Data Analytics Lab” GitHub repo provided in the lab instructions.
  21. Copy the SQL code from the kinesis-analytics-popular-captains.sql file and paste it into the SQL editor.
  22. Click Save and run SQL.
  23. View the real-time analytics DESTINATION_CAPTAINS_SCORES results.
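The repo's SQL file is the source of truth, but as a rough sketch of what an aggregation application like this typically looks like in Kinesis Data Analytics SQL (the column names here are assumptions for illustration, not the repo's actual schema):

    -- Illustrative sketch only; use the query from kinesis-analytics-popular-captains.sql.
    CREATE OR REPLACE STREAM "DESTINATION_CAPTAINS_SCORES" ("captain" VARCHAR(32), "avg_rating" DOUBLE);

    CREATE OR REPLACE PUMP "CAPTAINS_PUMP" AS
      INSERT INTO "DESTINATION_CAPTAINS_SCORES"
      SELECT STREAM "captain", AVG("rating")
      FROM "SOURCE_SQL_STREAM_001"
      GROUP BY "captain",
               STEP("SOURCE_SQL_STREAM_001".ROWTIME BY INTERVAL '60' SECOND);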
Create a Kinesis Data Analytics Anomaly Detection Application
  1. Open the “Using Kinesis Data Firehose and Kinesis Data Analytics Lab” GitHub repo provided in the lab instructions.
  2. Copy the SQL code from the kinesis-analytics-rating-anomaly.sql file and paste it into the SQL editor.
  3. Click Save and run SQL.
  4. View the real-time analytics DESTINATION_SQL_STREAM results.
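Again, the repo's file is authoritative; an anomaly-detection query in Kinesis Data Analytics typically wraps the source stream in the built-in RANDOM_CUT_FOREST function, which emits the input columns plus an ANOMALY_SCORE. Roughly (column names are assumptions for illustration):

    -- Illustrative sketch only; use the query from kinesis-analytics-rating-anomaly.sql.
    CREATE OR REPLACE STREAM "DESTINATION_SQL_STREAM" ("captain" VARCHAR(32), "rating" DOUBLE, "ANOMALY_SCORE" DOUBLE);

    CREATE OR REPLACE PUMP "ANOMALY_PUMP" AS
      INSERT INTO "DESTINATION_SQL_STREAM"
      SELECT STREAM "captain", "rating", "ANOMALY_SCORE"
      FROM TABLE(RANDOM_CUT_FOREST(CURSOR(SELECT STREAM * FROM "SOURCE_SQL_STREAM_001")));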

Additional Resources

Our boss has asked us to load data into Amazon Simple Storage Service (S3). The ingestion needs to be reliable, we need a solution with little to no ongoing administration, and it needs to scale automatically to meet demand. Kinesis Data Firehose is perfect for this situation.

Additionally, we've been asked to analyze the data as it's streaming in, so we can get a sense of the values and identify data anomalies. Eventually, our team will build capabilities to respond to customer data in real time. Amazon Kinesis Data Analytics is perfect for this situation.

In this lab, we create a Kinesis Data Firehose stream, run a script to generate simulated data, and set up some basic analytical queries in Kinesis Data Analytics to look at the data.
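The write-to-kinesis-firehose-space-captains.py script is provided on the lab's EC2 instance, but a Firehose producer like it boils down to a short boto3 loop. This sketch uses made-up field and captain names purely for illustration:

    import json
    import random
    import time

    import boto3

    firehose = boto3.client("firehose", region_name="us-east-1")

    # Field names and values are illustrative; the lab's actual script defines its own schema.
    CAPTAINS = ["Janeway", "Picard", "Kirk", "Sisko", "Archer"]

    while True:
        record = {"captain": random.choice(CAPTAINS), "rating": random.randint(1, 10)}
        firehose.put_record(
            DeliveryStreamName="captains-kfh",
            Record={"Data": (json.dumps(record) + "\n").encode("utf-8")},
        )
        time.sleep(0.5)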

Please log in to the AWS Management Console using the cloud_user credentials provided in the lab instructions.

Once inside the AWS account, make sure you are using us-east-1 (N. Virginia) as the selected region.
