Microsoft Certified: Azure Data Engineer Associate (DP-203)

By Brian Roehm

This course is designed to help you obtain the knowledge and skills required to pass the DP-203: Data Engineering on Microsoft Azure exam.

22 hours
  • 130 Lessons
  • 9 Hands-On Labs
  • 9 Course Quizzes
  • 1 Practice Exam

About the course

We live in a data-rich world where the size, complexity, and uses of data keep growing. The modern data engineer must build solutions to move, transform, and consolidate both structured and unstructured data from a variety of sources. Azure provides the services needed to accomplish these tasks and build a robust data analytics solution. Earning the Microsoft Certified: Azure Data Engineer Associate certification signals that you are versed in building such solutions.
In this course, we cover what you need to know to succeed on the DP-203 certification exam. We start with what data engineering is and how Azure meets the needs of this role, then cover data storage, data ingestion and transformation, batch and stream processing, and how to secure, optimize, and troubleshoot your solutions. Microsoft’s exam objectives are woven into the course and can be found using the lesson descriptions or the included mapping guide. Those objectives include:

  • Design and implement data storage
  • Design and develop data processing
  • Design and implement data security
  • Monitor and optimize data storage and data processing
  • Chapter 1: Introduction (4 Lessons, 14:52)

    Course Introduction

    1:43

    About the Training Architect

    1:12

    About the Exam

    7:05

    Programming Background Needed

    4:52
  • Chapter 2: Data Engineering Crash Course (10 Lessons, 1:18:02)

    Section Introduction

    3:41

    The View from Mars: Data Engineering at a Glance

    6:36

    What Does Azure Have to Offer?

    5:00

    Introduction to Data Lakes

    8:46

    Introduction to Azure Data Factory

    6:43

    Introduction to Azure Synapse Analytics

    7:17

    Introduction to Azure Stream Analytics

    8:48

    Introduction to Azure Databricks

    8:24

    Section Recap

    7:47

    DP-203 Quiz: Data Engineering Crash Course

    15:00 Quiz
  • Chapter 3: Data Storage (16 Lessons, 2:10:16)

    Section Introduction

    2:32

    Using Azure Data Lakes

    8:06

    Getting the Folder Structure Right

    8:02

    Understanding File Types

    6:29

    Partitioning Data

    5:09

    Partitioning Best Practices: Part 1

    6:58

    Partitioning Best Practices: Part 2

    4:09

    Distributing Data

    7:44

    Archiving Data

    6:02

    Pruning Data

    4:22

    Compressing Data

    5:43

    Sharding Data

    8:48

    Implementing Data Redundancy

    8:00

    Provisioning Azure Data Lake Storage Gen2

    30:00 Hands-On Lab

    Section Recap

    3:12

    DP-203 Quiz: Data Storage

    15:00 Quiz
  • Chapter 4: Data Ingestion and Transformation (19 Lessons, 2:48:27)

    Section Introduction

    3:09

    Meeting the Tools of the Trade: Azure Data Factory

    8:55

    Meeting the Tools of the Trade: Transact-SQL

    5:37

    Meeting the Tools of the Trade: Azure Synapse Pipelines

    3:25

    Meeting the Tools of the Trade: Scala

    5:39

    Meeting the Tools of the Trade: Apache Spark

    3:54

    Creating Data Pipelines

    10:06

    Designing and Creating Tests for Data Pipelines

    4:41

    Integrating Jupyter/Python Notebooks into a Data Pipeline

    6:02

    Cleansing Data

    5:19

    Splitting Data

    6:34

    Shredding JSON

    3:15

    Encoding and Decoding Data

    5:37

    Configuring Error Handling for Transformations

    4:51

    Normalizing and Denormalizing Values

    4:07

    Performing Data Exploratory Analysis

    4:26

    Moving and Transforming Data with Azure Data Factory

    1:00:00 Hands-On Lab

    Section Recap

    7:50

    DP-203 Quiz: Data Ingestion and Transformation

    15:00 Quiz
  • Chapter 5: Batch Processing Solutions (17 Lessons, 2:58:18)

    Section Introduction

    3:02

    Identifying Azure Services for Batch Processing

    8:34

    Developing a Batch Processing Solution with Azure Services

    1:00:00 Hands-On Lab

    Designing and Implementing Incremental Data Loads

    6:49

    Designing and Implementing Incremental Data Loads Part 2

    6:11

    Introduction to Data Factory Flow

    5:53

    Handling Schema Drift

    4:58

    Handling Duplicate and Missing Data with Data Flow

    7:22

    Upserting Data with Data Flow

    6:00

    Designing and Configuring Exception Handling

    11:26

    Triggering Batches

    6:59

    Implementing Azure Databricks

    9:14

    Managing Data Pipelines

    7:58

    Implementing Version Control for Pipeline Artifacts

    7:14

    Managing Spark Jobs in a Pipeline

    3:44

    Section Recap

    7:54

    DP-203 Quiz: Batch Processing

    15:00 Quiz
  • Chapter 6: Stream Processing Solutions (17 Lessons, 3:00:32)

    Section Introduction

    3:11

    Identifying Azure Services for Stream Processing

    8:15

    Designing a Stream Processing Solution

    8:47

    Developing a Stream Processing Solution with Azure Services

    1:00:00 Hands-On Lab

    Processing Time Series Data

    4:10

    Handling Late-Arriving Data

    9:47

    Designing and Creating Windowed Aggregates: Part 1

    9:09

    Designing and Creating Windowed Aggregates: Part 2

    8:43

    Processing Data by Using Spark Structured Streaming

    11:44

    Monitoring for Performance and Functional Regressions

    8:30

    Processing 1 or More Partitions (Repartitioning)

    9:41

    Handling Interruptions

    3:59

    Designing and Configuring Exception Handling

    1:56

    Upserting Data

    4:56

    Replaying Archived Stream Data

    4:44

    Section Recap

    8:00

    DP-203 Quiz: Stream Processing

    15:00 Quiz
  • Chapter 7: Data Serving Layer (10 Lessons, 1:29:06)

    Section Introduction

    1:32

    Understanding Schema and Table Design

    7:52

    Understanding Dimensions

    6:48

    Building External Tables

    5:36

    Designing Analytical Stores

    6:12

    Designing Metastores

    6:42

    Maintaining Metadata

    3:22

    Section Recap

    6:02

    Designing Structures for Analytical Processing in Azure

    30:00 Hands-On Lab

    DP-203 Quiz: Data Serving Layer

    15:00 Quiz
  • Chapter 8: Configuring Security and Compliance (16 Lessons, 2:13:44)

    Section Introduction

    2:33

    Understanding Data Protection

    7:01

    Implementing Data Protection with Data Masking

    5:12

    Auditing Data

    4:13

    Configuring Data Retention

    4:28

    Purging Data Based on Business Requirements

    6:08

    Securing Azure Data Lake Through RBAC and ACLs

    9:01

    Managing Identities, Keys, and Secrets

    5:59

    Implementing Secure Endpoints

    3:36

    Encrypting Analytical Stores in Azure

    30:00 Hands-On Lab

    Implementing Resource Tokens in Azure Databricks

    5:01

    Managing Sensitive Information

    5:38

    Loading a DataFrame with Sensitive Information

    5:42

    Section Recap

    9:12

    Configuring Security and Compliance

    15:00 Quiz

    Securing Azure Data Lake Storage Gen2

    15:00 Hands-On Lab
  • Chapter 9: Monitoring Data Storage and Data Processing (11 Lessons, 1:45:26)

    Section Introduction

    2:26

    Configuring Monitoring Services

    7:19

    Measuring Performance of Data Movement

    4:52

    Monitoring and Updating Statistics about Data across a System

    5:32

    Measuring Query Performance

    8:22

    Monitoring Cluster Performance

    3:58

    Understanding Custom Logging Options

    4:03

    Interpreting a Spark Directed Acyclic Graph (DAG)

    5:15

    Section Recap

    3:39

    Monitor Data Storage and Data Processing

    15:00 Quiz

    Monitoring Azure Data Factory Pipeline Performance

    45:00 Hands-On Lab
  • Chapter 10: Optimizing and Troubleshooting Data Storage and Processing (16 Lessons, 1:46:01)

    Section Introduction

    1:53

    Compacting Small Files

    3:51

    Rewriting User-Defined Functions

    3:25

    Handling Skew in Data

    8:13

    Handling Data Spill

    9:09

    Finding Shuffling in a Pipeline

    5:41

    Tuning Shuffle Partitions

    5:46

    Optimizing Resource Management

    5:03

    Tuning Queries by Using Cache

    4:16

    Troubleshooting a Failed Spark Job

    5:01

    Debugging Spark Jobs by Using the Spark UI

    4:09

    Optimizing Pipelines for Analytical or Transactional Purposes

    7:27

    Optimizing Azure Data Factory Jobs

    15:00 Hands-On Lab

    Troubleshooting a Failed Pipeline Run

    4:22

    Section Recap

    7:45

    Optimize and Troubleshoot Data Storage and Processing

    15:00 Quiz
  • Chapter 11: Conclusion (4 Lessons, 2:09:51)

    Bringing It All Together

    2:34

    Exam Tips

    4:25

    Final Notes

    2:52

    Data Engineering on Microsoft Azure

    2:00:00 Quiz

What are Hands-on Labs?

What's the difference between theoretical knowledge and real skills? Practical real-world experience. That's where Hands-on Labs come in! Hands-on Labs are guided, interactive experiences that let you practice real-world scenarios in real cloud environments. They are integrated into courses, so you can learn by doing.
