In this lab, we'll use the Redshift `UNLOAD` and `COPY` commands to migrate data from an existing Redshift cluster to a new cluster, which we will launch in the course of completing the lab.
Note: Please use the written guide for this lab as the AWS UI has changed.
Successfully complete this lab by achieving the following learning objectives:
- Investigate the Lab Environment
Log in to the provided AWS account and inspect the provided resources. You should find a Redshift cluster with the ID `users-cluster`, as well as an empty S3 bucket with a name that begins with `users-data-`. Additionally, there is an IAM role attached to `users-cluster` which will be used when launching a second Redshift cluster with the ID `users-cluster-2`.
Note: The ARN of this role is provided in the lab interface, as is the name of the S3 bucket for easy reference.
- Launch the Target Redshift Cluster
In the Redshift web console, launch a new cluster with the same configuration as `users-cluster`, using the ID `users-cluster-2`. Instead of Free trial, choose Production, dc2.large, 1 node.
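For reference, the console choices above map onto the parameters of the boto3 Redshift `create_cluster` API. The following is a minimal sketch of that parameter set; the username, password, and role ARN are placeholders, not values from the lab.

```python
# Sketch of boto3 create_cluster parameters mirroring the console choices.
# Values marked "placeholder" are assumptions; substitute the real IAM role
# ARN and credentials of your own before running.
cluster_params = {
    "ClusterIdentifier": "users-cluster-2",
    "ClusterType": "single-node",         # 1 node, as chosen in the console
    "NodeType": "dc2.large",
    "MasterUsername": "awsuser",          # placeholder
    "MasterUserPassword": "ChangeMe123",  # placeholder
    "IamRoles": ["<RedshiftS3 ARN>"],     # same role attached to users-cluster
}

# To launch, you would pass these to a boto3 Redshift client, e.g.:
#   import boto3
#   boto3.client("redshift").create_cluster(**cluster_params)
```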
- Copy the Existing Redshift Table to S3
Using either the Redshift web console query editor or the client of your choosing, use the `UNLOAD` command to copy the existing test table to the provided S3 bucket in Parquet format.
UNLOAD ('select * from users_data') TO 's3://<users-data-bucket>/' IAM_ROLE '<RedshiftS3 ARN>' FORMAT AS PARQUET;
create table users_data(
    id_value varchar(64),
    name_first varchar(64),
    name_last varchar(64),
    location_country varchar(32),
    dob_age int,
    picture_large varchar(64),
    primary key(id_value)
)
distkey(location_country)
compound sortkey(id_value);
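If you prefer to assemble the `UNLOAD` statement programmatically rather than editing placeholders by hand, a small helper like the one below can substitute the bucket name and role ARN. The function name, the S3 prefix layout, and the sample ARN are illustrative assumptions, not part of the lab.

```python
def build_unload(table: str, bucket: str, iam_role_arn: str) -> str:
    """Build a Redshift UNLOAD statement targeting an S3 prefix in Parquet.

    Illustrative helper only; the prefix layout (bucket/table/) is an
    assumption you may change.
    """
    return (
        f"UNLOAD ('select * from {table}') "
        f"TO 's3://{bucket}/{table}/' "
        f"IAM_ROLE '{iam_role_arn}' "
        "FORMAT AS PARQUET;"
    )

# Example with a made-up bucket name and role ARN:
print(build_unload("users_data", "users-data-example",
                   "arn:aws:iam::123456789012:role/RedshiftS3"))
```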
- Copy Data from S3 to the Newly Launched Redshift Cluster
Using either the Redshift web console query editor or the client of your choosing, create a new table in `users-cluster-2` which matches the table provided on `users-cluster`. Once this table is created, use the `COPY` command to load the data which has been backed up in S3 into your new table.
Note: Table definition information is provided in the lab instructions.
COPY users_data FROM 's3://<users-data-bucket>/' IAM_ROLE '<RedshiftS3 ARN>' FORMAT AS PARQUET;
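The matching `COPY` statement can be built the same way; the helper below is a sketch under the same assumptions (the S3 prefix must match wherever your `UNLOAD` wrote the Parquet files).

```python
def build_copy(table: str, bucket: str, iam_role_arn: str) -> str:
    """Build the COPY statement that loads Parquet files unloaded to S3.

    Illustrative helper only; the prefix (bucket/table/) is assumed to
    match the target of the earlier UNLOAD.
    """
    return (
        f"COPY {table} "
        f"FROM 's3://{bucket}/{table}/' "
        f"IAM_ROLE '{iam_role_arn}' "
        "FORMAT AS PARQUET;"
    )

# Example with a made-up bucket name and role ARN:
print(build_copy("users_data", "users-data-example",
                 "arn:aws:iam::123456789012:role/RedshiftS3"))
```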
- Check Your Data
Once the above objectives are fulfilled, you should be able to run the following query from both clusters and receive an identical response:
select * from users_data limit 10;
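To verify the migration beyond eyeballing ten rows, you could fetch the same result set from each cluster and compare. A minimal sketch of the comparison logic follows; the sample tuples stand in for real query results, which in practice would come from a client library such as redshift_connector or psycopg2.

```python
def results_match(rows_a, rows_b) -> bool:
    """Compare two query result sets row by row, ignoring row order.

    Sketch only: in the lab, rows_a and rows_b would be fetchall() output
    from the source and target clusters respectively.
    """
    return sorted(rows_a) == sorted(rows_b)

# Stand-in rows in place of real fetchall() output:
source_rows = [("id1", "Ada"), ("id2", "Grace")]
target_rows = [("id2", "Grace"), ("id1", "Ada")]
print(results_match(source_rows, target_rows))  # True when the data migrated intact
```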