- Lab
- A Cloud Guru
Exploring AML Designer Transforms: Join Data
Much of the time in a machine learning project is spent understanding the data and getting it into the right shape to train a model. This is the Data Wrangling, Exploration, and Cleaning phase of the machine learning lifecycle. In Azure Machine Learning designer, many common data transformation operations are provided as transform modules. In this lab, you will explore the Join Data module to gain a deeper understanding of the tools at your disposal.
Table of Contents
-
Challenge
Set Up the Workspace
Log in and go to the Azure Machine Learning Studio workspace provided in the lab.
Create a Training Cluster of D2 instances.
Create a new blank Pipeline in the Azure Machine Learning Studio Designer.
-
Challenge
Explore Join Data
Add IMDB Movie Titles and Movie Ratings dataset nodes to the canvas. Visualize these datasets to see if they have any common data. Note that the columns holding the shared data might not be named exactly the same in both datasets.
Using a Join Data transform node, combine the datasets on their shared data. This will be an Inner Join operation. Remove one of the columns containing the shared data.
Submit the Pipeline to perform the transformation.
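For readers who think in code, the same operation can be sketched outside the designer with pandas. This is only an illustration of what an inner join followed by dropping the redundant key column does, not how the Join Data module is implemented. The column names (`Movie ID`, `Movie Name`, `MovieId`, `UserId`, `Rating`) and the sample rows are assumptions made up for the example; check the real names when you visualize the datasets.

```python
import pandas as pd

# Hypothetical stand-ins for the two datasets; names and values are assumed.
titles = pd.DataFrame({
    "Movie ID": [1, 2, 3],
    "Movie Name": ["Movie A", "Movie B", "Movie C"],
})
ratings = pd.DataFrame({
    "UserId": [10, 11, 12, 13],
    "MovieId": [1, 1, 3, 4],
    "Rating": [8, 6, 9, 5],
})

# Inner join: keep only ratings whose movie ID also appears in titles.
# The key columns have different names, so each side is named explicitly.
joined = ratings.merge(
    titles, how="inner", left_on="MovieId", right_on="Movie ID"
)

# Both key columns survive the merge; drop one so the ID appears only once.
joined = joined.drop(columns=["Movie ID"])

print(joined)
```

In this sketch the rating for movie 4 is discarded because it has no matching title, and movie 2 does not appear because nobody rated it. That is the defining behavior of an inner join.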
-
Challenge
Visualize the Transformed Data
When the pipeline finishes, inspect the output of the Join Data node. Can you now tell which movie each rating belongs to? Was the duplicate ID column removed successfully?
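The two questions above can also be expressed as simple checks. The sketch below assumes the joined output has been loaded into a pandas DataFrame; the helper `remaining_key_columns`, the column names, and the sample rows are hypothetical and only illustrate the idea.

```python
import pandas as pd

def remaining_key_columns(df, candidates):
    """Return the candidate ID columns that are still present in df."""
    return [name for name in candidates if name in df.columns]

# Hypothetical joined output; in the lab you would inspect the node's
# visualized result instead.
joined = pd.DataFrame({
    "UserId": [10, 11, 12],
    "MovieId": [1, 1, 3],
    "Rating": [8, 6, 9],
    "Movie Name": ["Movie A", "Movie A", "Movie C"],
})

# Exactly one of the two ID columns should be left after the join.
remaining = remaining_key_columns(joined, ["MovieId", "Movie ID"])
duplicate_removed = len(remaining) == 1

# Every rating row should now carry a movie title.
every_row_has_title = bool(joined["Movie Name"].notna().all())

print(remaining, duplicate_removed, every_row_has_title)
```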