Loading Data from TensorFlow Datasets

45 minutes
  • 4 Learning Objectives

About this Hands-on Lab

TensorFlow provides easy access to many common public datasets. In this lab, you will learn to load the MNIST database of handwritten digits, a common entry-level machine learning dataset, from TensorFlow Datasets. Using this data, you will build a simple model that learns to predict the digit shown in each image.

Learning Objectives

Successfully complete this lab by achieving the following learning objectives:

Load the MNIST Dataset

Using TensorFlow Datasets, load the MNIST training and testing data into your program. Additionally, load the dataset information provided by TensorFlow.

Explore the MNIST Dataset
  1. Display the dataset information provided by TensorFlow.
  2. Display the class label names.
  3. Display some example images.
Wrangle the MNIST Dataset
  1. Normalize image pixel data to values between 0 and 1.
  2. Since the dataset is small, load all of it into memory for better performance.
  3. Shuffle the training data to help the model generalize, but don’t shuffle the test data.
  4. Batch the data to make training faster.
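
One way to express the wrangling steps above as a `tf.data` pipeline is sketched here. The helper names `normalize_img` and `prepare`, the batch size of 128, and the ordering of the steps are assumptions. The shuffle buffer defaults to 60,000 because that is the size of the MNIST training split. The last lines exercise the pipeline on a tiny synthetic stand-in so it can be checked without downloading anything.

```python
import tensorflow as tf


def normalize_img(image, label):
    """Cast uint8 pixels (0 to 255) to float32 and scale them into [0, 1]."""
    return tf.cast(image, tf.float32) / 255.0, label


def prepare(ds, training, batch_size=128, shuffle_buffer=60_000):
    """Normalize, cache, optionally shuffle, batch, and prefetch a dataset."""
    ds = ds.map(normalize_img, num_parallel_calls=tf.data.AUTOTUNE)
    ds = ds.cache()  # keep normalized examples in memory after the first pass
    if training:
        # Only the training data is shuffled; the test data keeps its order.
        ds = ds.shuffle(shuffle_buffer)
    ds = ds.batch(batch_size)
    return ds.prefetch(tf.data.AUTOTUNE)


# Offline check on synthetic data: eight all-white 28x28 grayscale images.
fake_images = tf.constant(255, dtype=tf.uint8, shape=[8, 28, 28, 1])
fake_labels = tf.range(8, dtype=tf.int64)
fake_ds = tf.data.Dataset.from_tensor_slices((fake_images, fake_labels))
first_images, first_labels = next(iter(prepare(fake_ds, training=True, batch_size=4)))
print(first_images.shape, first_images.dtype)
```

For the real data, the same helper would be applied as `prepare(train_ds, training=True)` and `prepare(test_ds, training=False)`. Caching before shuffling means the data is read once but reshuffled every epoch.
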
Teach a Model to Predict Handwritten Digits
  1. Create a basic Keras Sequential deep neural network to predict the number in each image.
  2. Compile the model with an appropriate optimizer and loss function.
  3. Train the model for a few epochs using the training data.
  4. Evaluate the model using the test data.
  5. Save the model for later use.
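
The modeling steps above might be sketched as follows. The layer sizes, the Adam learning rate, the epoch count, and the output file name `mnist_model.keras` are illustrative assumptions, as are the helper names `build_model` and `train_evaluate_save`. The datasets passed in are expected to be batched and normalized already.

```python
import tensorflow as tf


def build_model():
    """Build and compile a small dense network for 28x28 grayscale digits."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28, 1)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10),  # one logit per digit class
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
        # Labels are integers and the last layer emits logits, hence these choices.
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=[tf.keras.metrics.SparseCategoricalAccuracy()],
    )
    return model


def train_evaluate_save(model, train_ds, test_ds, epochs=5, path="mnist_model.keras"):
    """Train on train_ds, evaluate on test_ds, save the model, return (loss, accuracy)."""
    model.fit(train_ds, epochs=epochs, validation_data=test_ds)
    loss, accuracy = model.evaluate(test_ds)
    model.save(path)
    return loss, accuracy
```

A saved model can later be restored with `tf.keras.models.load_model(path)`. Some setups, including older TensorFlow versions, may expect a different save format such as `.h5`, so the file extension is worth adjusting to your environment.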

Additional Resources

Scenario

Your company has many images of old faxes. These images are currently of little use because a person has to read each one manually to extract any information. You have been tasked with creating a machine learning model that can recognize the numbers written on these old documents.

You can begin by training your model on the MNIST database of handwritten digits.

Lab Goals

  1. Load the MNIST dataset.
  2. Explore the MNIST dataset.
  3. Wrangle the MNIST dataset.
  4. Teach a model to predict handwritten digits.

Logging In to the Lab Environment

No environment is provided for this lab. This lab is meant to be completed in PyCharm running on your own hardware in preparation for the TensorFlow Developer Certificate Exam. If you don't have PyCharm or are not working toward the certification, you can use the provided Google account credentials, or your own account, to complete the tasks in Google Colab, which provides free hosting and compute power for Jupyter Notebooks.

If you use the provided credentials to access Colab, make sure to save a copy of your work locally before the end of the lab.

What are Hands-on Labs

Hands-on Labs are real environments created by industry experts to help you learn. In these environments, you can gain knowledge and experience, practice without compromising your system, test without risk, destroy without fear, and learn from your mistakes. Use Hands-on Labs to practice your skills before delivering in the real world.
