In this hands-on lab, you work as a Machine Learning Operations Engineer for Avendador. During the lab, you will create and execute a machine learning pipeline using sample Microsoft data sources (**Restaurant Ratings** and **Restaurant Feature Data**).
Learning Objectives
Successfully complete this lab by achieving the following learning objectives:
- Preparing the Environment
  - Create a `Standard_D2_V2` compute instance and start it.
  - Create an Azure Machine Learning pipeline in Azure Machine Learning studio.
- Ingest Data and Select Key Columns
  - Add the sample datasets, Restaurant Ratings and Restaurant Feature Data, to the pipeline canvas.
  - Select `placeID` and `rating` from the Restaurant Ratings data source.
  - Select `placeID`, `alcohol`, `dress_code`, `price`, and `Rambience` from the Restaurant Feature Data source.
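The column selection above can be pictured outside the designer with a small Python sketch. The rows below are hypothetical stand-ins for the sample datasets (the values and the extra `userID` and `name` columns are assumptions), and `select_columns` is an illustrative helper, not part of Azure Machine Learning.

```python
# Hypothetical rows standing in for the two sample datasets.
# Values and the extra columns (userID, name) are made up for illustration.
ratings = [
    {"userID": "U1", "placeID": 101, "rating": 2},
    {"userID": "U2", "placeID": 102, "rating": 1},
]
features = [
    {"placeID": 101, "name": "Example A", "alcohol": "No_Alcohol_Served",
     "dress_code": "informal", "price": "medium", "Rambience": "familiar"},
    {"placeID": 102, "name": "Example B", "alcohol": "Wine-Beer",
     "dress_code": "casual", "price": "low", "Rambience": "quiet"},
]


def select_columns(rows, columns):
    """Return new rows that keep only the named columns, in the given order."""
    return [{col: row[col] for col in columns} for row in rows]


ratings_selected = select_columns(ratings, ["placeID", "rating"])
features_selected = select_columns(
    features, ["placeID", "alcohol", "dress_code", "price", "Rambience"]
)

print(ratings_selected[0])
print(list(features_selected[0].keys()))
```

Both selections keep `placeID`, since it is the only column the two sources share and is needed later as the join key.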
- Transform Data Sources
  - Join the data sources using `placeID` as the key.
  - Replace missing data in columns (`placeID`, `rating`, `alcohol`, `dress_code`, `price`, `Rambience`) with 0.
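As a rough illustration of the two transformations above, the following Python sketch joins hypothetical rows on `placeID` and then replaces missing values with 0. The lab does not state the join type, so an inner join is assumed here; the sample rows and helper names are also assumptions.

```python
def inner_join(left, right, key):
    """Inner join: keep each left row whose key value also appears in right.

    Assumes the key is unique in `right` (one feature row per restaurant).
    """
    right_by_key = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:
            joined.append({**row, **match})
    return joined


def replace_missing(rows, columns, value=0):
    """Replace None or empty-string entries in the given columns with `value`."""
    cleaned = []
    for row in rows:
        new_row = dict(row)
        for col in columns:
            if new_row.get(col) in (None, ""):
                new_row[col] = value
        cleaned.append(new_row)
    return cleaned


# Hypothetical, already column-selected rows; None and "" mark missing data.
ratings = [
    {"placeID": 1, "rating": 2},
    {"placeID": 1, "rating": 1},
    {"placeID": 2, "rating": None},
    {"placeID": 3, "rating": 2},  # no matching feature row, dropped by the join
]
features = [
    {"placeID": 1, "alcohol": "No_Alcohol_Served", "dress_code": "informal",
     "price": "low", "Rambience": "familiar"},
    {"placeID": 2, "alcohol": None, "dress_code": "casual",
     "price": "", "Rambience": "quiet"},
]

columns = ["placeID", "rating", "alcohol", "dress_code", "price", "Rambience"]
joined = inner_join(ratings, features, "placeID")
cleaned = replace_missing(joined, columns, value=0)

for row in cleaned:
    print(row)
```

Because one restaurant can receive several ratings, the join produces one output row per rating, each carrying a copy of that restaurant's features.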
- Split Data into Training and Test Data
  - Split data using a 60/40 split.
    - 60% should go to a filter using Pearson correlation.
    - 40% should be used as test data.
  - Create a Pearson correlation feature selection using `rating` as the target column (select columns to transform and apply transformation).
  - Create a Boosted Decision Tree Regression with the following settings:
    - Create trainer mode: SingleParameter
    - Maximum number of leaves per tree: 20
    - Minimum number of samples per leaf node: 10
    - Learning rate: 0.2
    - Total number of trees constructed: 100
  - Create Train Model using `rating` as the label column.
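To make the split and the Pearson correlation step more concrete, here is a minimal Python sketch of both ideas. It is not how the designer components are implemented; the seed, the numeric encoding of `price`, and the sample rows are assumptions chosen for illustration. Training the boosted decision tree itself is left to the designer.

```python
import math
import random


def split_rows(rows, fraction=0.6, seed=42):
    """Shuffle a copy of the rows and split it into (first part, remainder)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * fraction))
    return shuffled[:cut], shuffled[cut:]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant column; treated as 0 here.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Hypothetical rows with `price` already encoded as a number (an assumption).
rows = [{"price": i % 3, "rating": (i * 7) % 3} for i in range(10)]

train, test = split_rows(rows, fraction=0.6)
print(len(train), len(test))

# Score one candidate feature against the target column on the training part.
score = pearson([r["price"] for r in train], [r["rating"] for r in train])
print(round(score, 3))
```

A feature whose score is close to 0 carries little linear information about `rating`, which is the kind of signal a correlation-based filter uses to rank columns.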
- Score and Evaluate
  - Create a Score Model activity.
  - Create an Evaluate Model activity.
  - Submit Model.
  - Evaluate Results.
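When evaluating the results of a regression model, the usual summary metrics include mean absolute error, root mean squared error, and the coefficient of determination. The sketch below computes these three by hand on hypothetical actual and predicted ratings, which may help when interpreting the evaluation output; the function name and sample values are assumptions.

```python
import math


def regression_metrics(actual, predicted):
    """Compute MAE, RMSE, and R^2 for paired actual and predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    # R^2 is undefined when every actual value is identical.
    r2 = 1 - ss_res / ss_tot if ss_tot else float("nan")
    return {"mae": mae, "rmse": rmse, "r2": r2}


# Hypothetical test-set ratings and the scores a model might assign to them.
actual = [2, 1, 0, 2]
predicted = [1.5, 1.0, 0.5, 2.0]

metrics = regression_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Lower MAE and RMSE indicate predictions closer to the true ratings, while an R^2 nearer to 1 means the model explains more of the variance in `rating`.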