A Cloud Guru Labs (Azure)

Exploring AML Designer Transforms: Edit Metadata and Convert to Indicator Values

Much of the time spent on a machine learning task goes to understanding the data and getting it into a shape suitable for training a model. This is the Data Wrangling, Exploration, and Cleaning phase of the machine learning lifecycle. In Azure Machine Learning Designer, many common data transformations are provided as transform modules. In this lab, you will explore the `Edit Metadata` and `Convert to Indicator Values` modules to gain a deeper understanding of the tools at your disposal.


Level: Advanced
Duration: 30m
Published: Sep 24, 2020

Table of Contents

  1. Challenge

    Set Up the Workspace

    Log in and go to the Azure Machine Learning Studio workspace provided in the lab.

    Create a Training Cluster of Standard_D2_v2 instances.

    Create a new blank Pipeline in the Azure Machine Learning Studio Designer.

  2. Challenge

    Explore Convert to Indicator Values

    Add an Automobile price data (Raw) dataset node to the canvas. Visualize the data and find a text column with a small number of unique values. Many features in this dataset are categorical.

    Use an Edit Metadata transform node to mark the column as Categorical.

    Use a Convert to Indicator Values transform node to one-hot encode the column now marked as Categorical. Remove the original column to avoid passing the same data to the model in multiple ways. You will need to manually input the column name since you have not run the pipeline yet.
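
The idea behind this challenge can be sketched in plain Python. This is only an illustration of one-hot encoding, not the Designer module's implementation: the sample rows are made up, the helper names `unique_counts` and `to_indicator_values` are hypothetical, and the `column-value` naming of the new columns is an assumption.

```python
# Hypothetical sample rows; the column names mirror the
# Automobile price data (Raw) dataset, the values are made up.
rows = [
    {"make": "audi", "fuel-type": "gas", "price": 13950},
    {"make": "bmw", "fuel-type": "gas", "price": 16430},
    {"make": "mazda", "fuel-type": "diesel", "price": 10795},
]


def unique_counts(rows):
    """Count distinct values per column to spot low-cardinality candidates."""
    return {col: len({r[col] for r in rows}) for col in rows[0]}


def to_indicator_values(rows, column, drop_original=True):
    """One-hot encode `column`: one 0/1 column per distinct value."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        # Optionally drop the source column so the same information
        # is not passed to the model twice.
        new_row = {
            k: v for k, v in r.items()
            if not (drop_original and k == column)
        }
        for cat in categories:
            # Assumed naming scheme: "<column>-<value>".
            new_row[f"{column}-{cat}"] = 1 if r[column] == cat else 0
        encoded.append(new_row)
    return encoded


encoded_rows = to_indicator_values(rows, "fuel-type")
```

Here `unique_counts(rows)` reports two distinct values for `fuel-type`, which makes it a reasonable candidate, and each row in `encoded_rows` carries a `fuel-type-gas` and a `fuel-type-diesel` column in place of the original.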

  3. Challenge

    Explore Edit Metadata

    The Convert to Indicator Values node will create new columns. These are features, and should be marked as such. Use another Edit Metadata node to change all columns to features.

    Two models need to be created from this data: one that predicts price and one that predicts city-mpg. Use Edit Metadata nodes to mark price as the label in one dataset and city-mpg as the label in another.

    Submit the Pipeline to perform all of the transformations.
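
A rough mental model for this challenge is that column roles live alongside the data rather than inside it. The sketch below assumes a hypothetical `Dataset` class and helper functions (none of these are Azure Machine Learning APIs) to show how one prepared dataset can branch into two label assignments without touching the rows.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class Dataset:
    """Hypothetical container: rows are the data, roles are the metadata."""
    rows: tuple
    roles: dict = field(default_factory=dict)  # column name -> "feature" or "label"


def mark_all_features(ds):
    """Return a dataset with every column marked as a feature."""
    return replace(ds, roles={col: "feature" for col in ds.rows[0]})


def mark_label(ds, column):
    """Return a dataset with `column` marked as the label; rows are reused as-is."""
    return replace(ds, roles={**ds.roles, column: "label"})


# Made-up rows standing in for the output of the encoding step.
rows = (
    {"fuel-type-gas": 1, "fuel-type-diesel": 0, "city-mpg": 21, "price": 13950},
    {"fuel-type-gas": 0, "fuel-type-diesel": 1, "city-mpg": 30, "price": 10795},
)

prepared = mark_all_features(Dataset(rows))   # shared preprocessing, done once
price_ds = mark_label(prepared, "price")      # branch 1: predict price
mpg_ds = mark_label(prepared, "city-mpg")     # branch 2: predict city-mpg
```

Both `price_ds` and `mpg_ds` point at the same `rows` object; only their `roles` differ, which mirrors the questions asked when you inspect the label nodes after the pipeline runs.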

  4. Challenge

    Visualize the Transformed Data

    When the pipeline finishes, inspect the output of the Convert to Indicator Values node. Has the data changed? Has the metadata changed?

    Inspect each Edit Metadata node used to set a column as a label. Has the actual data changed? Has the metadata changed?

    You converted only one categorical field, but the same process can be repeated for the others to further improve the dataset. The preprocessing also had to be done only once, even though you ended up with two datasets for training models.

The Cloud Content team comprises subject matter experts hyper-focused on services offered by the leading cloud vendors (AWS, GCP, and Azure), as well as cloud-related technologies such as Linux and DevOps. The team is thrilled to share their knowledge to help you build modern tech solutions from the ground up, secure and optimize your environments, and much more.
