A Cloud Guru
Azure Labs

Developing a Pipeline in Azure Data Factory

Azure Data Factory is a core service for any Azure cloud project. It is an orchestration service responsible for moving and automating data into and throughout the Azure cloud. In this lab, we will learn how to connect data sources and create a data pipeline that moves data in Azure.

In this scenario, you work for a company that sells various items online throughout the United States. You have been asked to research a way to combine sales data with other data sources, such as parking lot information, to highlight locations for potential new brick-and-mortar stores. To do this, you will put together a presentation on leveraging cloud services (specifically Data Factory) to automate this research.

Related Microsoft Learn modules:

  • https://docs.microsoft.com/en-us/learn/modules/data-integration-azure-data-factory/
  • https://docs.microsoft.com/en-us/learn/modules/orchestrate-data-movement-transformation-azure-data-factory/
  • https://docs.microsoft.com/en-us/learn/modules/receive-data-with-azure-data-share-transforming-with-azure-data-factory/
  • https://docs.microsoft.com/en-us/learn/modules/create-production-workloads-azure-databricks-azure-data-factory/


Path Info

Level
Beginner
Duration
1h 0m
Published
Jan 25, 2021


Table of Contents

  1. Challenge

    Prepare the Environment

    Provision the following:

    • A Data Factory instance
      • West US 2 region, V2, configure Git later
    • SQL databases
      • Create a server in East US
      • Ensure it is set to the Basic tier: 5 DTUs with a 2 GB max size
    • A storage account
      • West US 2, Standard Performance, Storage V2, RA-GRS
      • Create 2 containers (raw, curated)
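
    If you would rather script this setup than click through the portal, the requirements above can be sketched as an ARM template fragment. All resource names here (example-adf-lab, example-sqlsrv, examplelabstore) are placeholder assumptions, and in practice the admin password should come from a Key Vault reference rather than a literal value:

    ```json
    {
      "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
      "contentVersion": "1.0.0.0",
      "resources": [
        {
          "type": "Microsoft.DataFactory/factories",
          "apiVersion": "2018-06-01",
          "name": "example-adf-lab",
          "location": "westus2",
          "identity": { "type": "SystemAssigned" },
          "properties": {}
        },
        {
          "type": "Microsoft.Sql/servers",
          "apiVersion": "2021-02-01-preview",
          "name": "example-sqlsrv",
          "location": "eastus",
          "properties": {
            "administratorLogin": "labadmin",
            "administratorLoginPassword": "<secure-password>"
          },
          "resources": [
            {
              "type": "databases",
              "apiVersion": "2021-02-01-preview",
              "name": "example-sales-db",
              "location": "eastus",
              "dependsOn": [ "example-sqlsrv" ],
              "sku": { "name": "Basic", "tier": "Basic", "capacity": 5 },
              "properties": { "maxSizeBytes": 2147483648 }
            }
          ]
        },
        {
          "type": "Microsoft.Storage/storageAccounts",
          "apiVersion": "2021-04-01",
          "name": "examplelabstore",
          "location": "westus2",
          "sku": { "name": "Standard_RAGRS" },
          "kind": "StorageV2",
          "properties": {}
        },
        {
          "type": "Microsoft.Storage/storageAccounts/blobServices/containers",
          "apiVersion": "2021-04-01",
          "name": "examplelabstore/default/raw",
          "dependsOn": [ "examplelabstore" ]
        },
        {
          "type": "Microsoft.Storage/storageAccounts/blobServices/containers",
          "apiVersion": "2021-04-01",
          "name": "examplelabstore/default/curated",
          "dependsOn": [ "examplelabstore" ]
        }
      ]
    }
    ```

    In a full template the dependsOn entries would typically use resourceId() expressions; the short names above keep the sketch readable.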
  2. Challenge

    Create and Connect Datasets

    Connect the SQL database (SalesLT.Address table) and storage account from the previous step. Then, download a CSV file and add it to the blob storage account for later use.
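    Under the hood, Data Factory stores each authored dataset as a JSON definition. The two datasets for this step might look roughly like the following sketches; the linked service names, file name, and dataset names are assumptions for illustration:

    ```json
    {
      "name": "SqlAddressTable",
      "properties": {
        "type": "AzureSqlTable",
        "linkedServiceName": { "referenceName": "AzureSqlDatabaseLS", "type": "LinkedServiceReference" },
        "typeProperties": { "schema": "SalesLT", "table": "Address" }
      }
    }
    ```

    ```json
    {
      "name": "ParkingLotCsv",
      "properties": {
        "type": "DelimitedText",
        "linkedServiceName": { "referenceName": "BlobStorageLS", "type": "LinkedServiceReference" },
        "typeProperties": {
          "location": { "type": "AzureBlobStorageLocation", "container": "raw", "fileName": "parking.csv" },
          "columnDelimiter": ",",
          "firstRowAsHeader": true
        }
      }
    }
    ```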

  3. Challenge

    Create the Copy Steps of Our Pipeline

    Create a copy step to pull the SalesLT.Address data from the SQL database that will output to the storage account blob container. Then, validate and execute this pipeline.
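    In JSON form, a pipeline with this copy step might look like the sketch below. The dataset names are assumptions carried over from the previous step, with AddressBlobOutput standing in for a DelimitedText dataset that points at the raw container:

    ```json
    {
      "name": "CopyAddressPipeline",
      "properties": {
        "activities": [
          {
            "name": "CopySqlToBlob",
            "type": "Copy",
            "inputs": [ { "referenceName": "SqlAddressTable", "type": "DatasetReference" } ],
            "outputs": [ { "referenceName": "AddressBlobOutput", "type": "DatasetReference" } ],
            "typeProperties": {
              "source": { "type": "AzureSqlSource" },
              "sink": { "type": "DelimitedTextSink" }
            }
          }
        ]
      }
    }
    ```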

  4. Challenge

    Use Data Flow to Combine Data from our Copied File and CSV File

    Use Data Flow to add data from the newly copied file to the CSV file. This new data will then be output to a new blob file in the curated container.
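    A mapping data flow that joins the two files is also stored as JSON, with the transformation logic expressed in the data flow script. The sketch below is a rough outline only: the dataset names, sink name, and especially the join columns (City on each side) are assumptions you would replace with the actual matching columns in your data:

    ```json
    {
      "name": "CombineAddressAndParking",
      "properties": {
        "type": "MappingDataFlow",
        "typeProperties": {
          "sources": [
            { "dataset": { "referenceName": "AddressBlobOutput", "type": "DatasetReference" }, "name": "addressSource" },
            { "dataset": { "referenceName": "ParkingLotCsv", "type": "DatasetReference" }, "name": "parkingSource" }
          ],
          "sinks": [
            { "dataset": { "referenceName": "CuratedBlobOutput", "type": "DatasetReference" }, "name": "curatedSink" }
          ],
          "script": "addressSource, parkingSource join(City == City, joinType:'inner') ~> joined\njoined sink() ~> curatedSink"
        }
      }
    }
    ```

    CuratedBlobOutput here would be a DelimitedText dataset pointing at the curated container, so the joined output lands as a new blob file there.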

  5. Challenge

    Publish the Pipeline

    Publish the completed pipeline.

The Cloud Content team comprises subject matter experts hyper-focused on services offered by the leading cloud vendors (AWS, GCP, and Azure), as well as cloud-related technologies such as Linux and DevOps. The team is thrilled to share their knowledge to help you build modern tech solutions from the ground up, secure and optimize your environments, and so much more!

What's a lab?

Hands-on Labs are real environments created by industry experts to help you learn. These environments let you gain knowledge and experience, practice without compromising your system, test without risk, destroy without fear, and learn from your mistakes. Hands-on Labs: practice your skills before delivering in the real world.

Provided environment for hands-on practice

We will provide the credentials and environment necessary for you to practice right within your browser.

Guided walkthrough

Follow along with the author’s guided walkthrough and build something new in your provided environment!
