In this lab, we are going to use AWS Data Pipeline to copy data from a DynamoDB table to an S3 bucket as a backup. Along the way, we'll see how DynamoDB and Data Pipeline work together to create backups of DynamoDB data.
**Note that this lab has been updated to reflect changes in AWS; the latest steps can be found in the lab guide. The m4.large instance size must be used for the master and core instances.**
Learning Objectives
Successfully complete this lab by achieving the following learning objectives:
- Copy Subnet ID and S3 Bucket Name
Before we can create a data pipeline, we'll need the name of the S3 bucket we are going to output data to, as well as a Subnet ID so the pipeline knows where to launch the EMR cluster that executes the export.
To get the S3 bucket name, navigate to S3 in the AWS console and locate the provided S3 bucket. Copy the bucket name from the console (it should start with cfst-) and save this for the next objective.
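The console steps above are all the lab requires, but the same lookup can be scripted. Below is a minimal boto3 sketch, assuming credentials for the lab account are already configured; the `cfst-` prefix comes from the lab, everything else is generic:

```python
import boto3

s3 = boto3.client("s3")

# List all buckets in the account and keep the lab-provided one,
# which is named with a "cfst-" prefix.
buckets = [b["Name"] for b in s3.list_buckets()["Buckets"]]
lab_bucket = next(name for name in buckets if name.startswith("cfst-"))
print(f"S3 bucket: {lab_bucket}")
```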
Next, find the Subnet ID by navigating to the VPC console and locating the subnet with a CIDR range of 10.0.0.0/24 whose route table has an internet gateway attached. Copy the Subnet ID and save it for the next objective.
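As with the bucket, this lookup can be scripted. Here is a minimal boto3 sketch, assuming the subnet has an explicit route table association (a subnet using the VPC's main route table would need an extra lookup):

```python
import boto3

ec2 = boto3.client("ec2")

# Find the subnet with the 10.0.0.0/24 CIDR range.
subnets = ec2.describe_subnets(
    Filters=[{"Name": "cidr-block", "Values": ["10.0.0.0/24"]}]
)["Subnets"]

for subnet in subnets:
    # Check the route table associated with the subnet for a route
    # through an internet gateway (i.e., a public subnet).
    tables = ec2.describe_route_tables(
        Filters=[
            {"Name": "association.subnet-id", "Values": [subnet["SubnetId"]]}
        ]
    )["RouteTables"]
    for table in tables:
        routes = table.get("Routes", [])
        if any(r.get("GatewayId", "").startswith("igw-") for r in routes):
            print(f"Subnet ID: {subnet['SubnetId']}")
```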
- Create Data Pipeline
Navigate to the Data Pipeline console and create a new pipeline to export our data from the LinuxAcademy DynamoDB table to our S3 bucket. Make sure the below settings are set (a sketch of the equivalent API call follows this list):
- The pipeline should be called backupdbtable.
- In the Build Using a Template field, use the Export DynamoDB Table to S3 template.
- The source table will be the LinuxAcademy DynamoDB Table that is already created.
- Logging should be set to the cfst- bucket from the first objective.
- The pipeline should be set to run on pipeline activation.
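For reference, registering the same pipeline at the API level might look like the boto3 sketch below. The console's Export DynamoDB Table to S3 template supplies the actual definition, so only the pipeline shell is created here; the `uniqueId` value is an arbitrary idempotency token of our choosing:

```python
import boto3

dp = boto3.client("datapipeline")

# Register the pipeline shell. The template-generated definition is
# applied separately with put_pipeline_definition.
pipeline = dp.create_pipeline(
    name="backupdbtable",
    uniqueId="backupdbtable-lab",  # idempotency token, any unique string
)
print(pipeline["pipelineId"])
```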
You should also set the following Architect settings for the pipeline (a sketch of the matching definition edits follows this list):
- Add in the Subnet ID parameter and use the Subnet ID from the first objective.
- Update the Core Instance Type to m4.large.
- Update the Master Instance Type to m4.large.
- Set the Resize Cluster Before Running parameter to false.
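In the console these are edits made in Architect. Below is a hedged boto3 sketch of the same edits at the API level; the object IDs `EmrClusterForBackup` and `TableBackupActivity` and the field names follow the export template as an assumption, so verify them against your pipeline's actual definition before applying anything:

```python
import boto3

dp = boto3.client("datapipeline")
pipeline_id = "df-EXAMPLE1234567"  # hypothetical ID from create_pipeline

# Pull the definition the template generated so we can edit it in place.
definition = dp.get_pipeline_definition(pipelineId=pipeline_id)


def set_fields(obj, updates):
    """Replace or add stringValue fields on one pipeline object."""
    obj["fields"] = [f for f in obj["fields"] if f["key"] not in updates] + [
        {"key": k, "stringValue": v} for k, v in updates.items()
    ]


for obj in definition["pipelineObjects"]:
    if obj["id"] == "EmrClusterForBackup":  # the template's EMR resource
        set_fields(obj, {
            "subnetId": "subnet-0123456789abcdef0",  # ID from objective one
            "coreInstanceType": "m4.large",
            "masterInstanceType": "m4.large",
        })
    elif obj["id"] == "TableBackupActivity":  # the template's EmrActivity
        set_fields(obj, {"resizeClusterBeforeRunning": "false"})

# Push the edited definition back to the pipeline.
dp.put_pipeline_definition(
    pipelineId=pipeline_id,
    pipelineObjects=definition["pipelineObjects"],
    parameterObjects=definition.get("parameterObjects", []),
    parameterValues=definition.get("parameterValues", []),
)
```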
Once the parameters have been set, save and activate the pipeline to begin the export execution.
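If you are working from the API instead of the console, activation and a simple status check might look like this minimal boto3 sketch (the pipeline ID is hypothetical):

```python
import boto3

dp = boto3.client("datapipeline")
pipeline_id = "df-EXAMPLE1234567"  # hypothetical ID

# Activate the pipeline; with "run on activation" this starts the export.
dp.activate_pipeline(pipelineId=pipeline_id)

# Check the overall pipeline state (@pipelineState is a standard field).
description = dp.describe_pipelines(pipelineIds=[pipeline_id])
for field in description["pipelineDescriptionList"][0]["fields"]:
    if field["key"] == "@pipelineState":
        print("Pipeline state:", field["stringValue"])
```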