Identifying Iris Plants Using Machine Learning

45 minutes
  • 3 Learning Objectives

About this Hands-on Lab

We are going to replicate a famous experiment performed by Sir Ronald Fisher in his 1936 paper *The use of multiple measurements in taxonomic problems*. Why are we repeating a nearly 100-year-old experiment? Because now we can let a machine learn how to do it instead of doing the math ourselves. Yeah!

In this lab, you will see how to load data using the TensorFlow Datasets API, visualize it using Pandas, and then train a machine learning model using Keras. You will be using Python in a Jupyter notebook, but all of the code is provided. No experience with any of these tools is required, but some familiarity with programming will help you get more out of this lab.

Learning Objectives

Successfully complete this lab by achieving the following learning objectives:

Load the Data

  1. TensorFlow provides the training data you need in the TensorFlow Datasets API. Using the Datasets API, download the training data and split it into a NumPy array of features and a NumPy array of labels.

  2. TensorFlow also provides the testing data, but it is not available through the API. Download this test data, load it, and split it into a NumPy array of features and a NumPy array of labels.

  3. Pandas provides a convenient way to visualize the data (the next objective), so combine the features and labels in a Pandas DataFrame for the training data, and a DataFrame for the testing data.
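
The split-and-combine steps above can be sketched as follows. This is only an illustration, not the notebook's actual code: a few hand-typed rows stand in for the downloaded file, and the column names, the `split_features_labels` and `to_dataframe` helpers, and the 0/1/2 species encoding are all assumptions made for the example.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical column names; the notebook may use different ones.
FEATURES = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
LABEL = "species"

# Stand-in for a downloaded CSV file: four measurements in centimeters,
# then an integer species code (the 0/1/2 encoding is an assumption).
CSV_TEXT = """\
5.1,3.5,1.4,0.2,0
4.9,3.0,1.4,0.2,0
7.0,3.2,4.7,1.4,1
6.4,3.2,4.5,1.5,1
6.3,3.3,6.0,2.5,2
5.8,2.7,5.1,1.9,2
"""


def split_features_labels(table):
    """Split a 2-D array into a float feature array and an int label array."""
    features = table[:, :-1].astype(np.float32)  # every column but the last
    labels = table[:, -1].astype(np.int64)       # the last column
    return features, labels


def to_dataframe(features, labels):
    """Combine features and labels into one DataFrame for inspection."""
    frame = pd.DataFrame(features, columns=FEATURES)
    frame[LABEL] = labels
    return frame


table = np.loadtxt(io.StringIO(CSV_TEXT), delimiter=",")
features, labels = split_features_labels(table)
df = to_dataframe(features, labels)

print(features.shape, labels.shape)
print(df.head())
```

The same two helpers could be applied to both the training data and the test data, which would give one DataFrame for each.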

Visualize the Data

All of these steps will help you understand the data you are working with more fully:

  1. Print counts of the number of samples in the training and test datasets.

  2. View common statistics about the features in each dataset.

  3. View the raw data from 15 examples.

  4. Plot each feature against all other features to see if there are any natural groupings.

  5. Plot the data to show how strongly each feature separates the data.
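
The non-plotting steps in this list can be sketched with a few Pandas calls. The tiny DataFrames, column names, and integer species codes below are made up for the example and are not the lab's real data; the notebook itself may take a different approach.

```python
import pandas as pd

# Tiny stand-in DataFrames (hypothetical values and column names).
train_df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    "sepal_width": [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
    "petal_length": [1.4, 1.4, 4.7, 4.5, 6.0, 5.1],
    "petal_width": [0.2, 0.2, 1.4, 1.5, 2.5, 1.9],
    "species": [0, 0, 1, 1, 2, 2],
})
test_df = pd.DataFrame({
    "sepal_length": [5.0, 6.1, 6.5],
    "sepal_width": [3.6, 2.9, 3.0],
    "petal_length": [1.4, 4.7, 5.8],
    "petal_width": [0.2, 1.4, 2.2],
    "species": [0, 1, 2],
})

# 1. Count the samples in each dataset.
print("training samples:", len(train_df))
print("test samples:", len(test_df))

# 2. Common statistics (count, mean, std, min, quartiles, max) per feature.
stats = train_df.drop(columns="species").describe()
print(stats)

# 3. Raw rows: up to 15 randomly chosen examples.
print(train_df.sample(n=min(15, len(train_df)), random_state=0))

# A numeric hint at how well each feature separates the species:
# features whose per-species means differ widely separate the data well.
species_means = train_df.groupby("species").mean()
print(species_means)
```

In a notebook, `pd.plotting.scatter_matrix(train_df.drop(columns="species"), c=train_df["species"])` is one way to plot each feature against all the others, colored by species; it requires Matplotlib.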

Teach the Machine About Irises

Create your model and make predictions!

  1. Create and compile a Keras model for classifying the Iris data. Include accuracy as a model metric for easy evaluation.

  2. Fit the Keras model to the training data.

  3. Evaluate the model’s accuracy on the test dataset.

  4. Show the model’s predictions for the test data.
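
A minimal sketch of these four steps is shown below. It is not the notebook's model: the layer sizes, optimizer, and epoch count are arbitrary choices, and the data is synthetic, generated around rough, illustrative per-species measurements so the example runs without downloading anything.

```python
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)

# Synthetic stand-in data: three classes, four features each.
# The centers are rough illustrative values, not the real dataset.
centers = np.array(
    [[5.0, 3.4, 1.5, 0.2], [5.9, 2.8, 4.3, 1.3], [6.6, 3.0, 5.6, 2.0]],
    dtype=np.float32,
)


def make_data(samples_per_class):
    """Return (features, labels) scattered around the class centers."""
    labels = np.repeat(np.arange(3), samples_per_class)
    noise = rng.normal(0.0, 0.2, size=(labels.size, 4))
    features = (centers[labels] + noise).astype(np.float32)
    return features, labels


train_features, train_labels = make_data(20)
test_features, test_labels = make_data(5)

# 1. Create and compile the model, tracking accuracy as a metric.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),  # one output per species
])
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",  # labels are integer codes
    metrics=["accuracy"],
)

# 2. Fit the model to the training data.
model.fit(train_features, train_labels, epochs=50, verbose=0)

# 3. Evaluate accuracy on the held-out test data.
loss, accuracy = model.evaluate(test_features, test_labels, verbose=0)
print("test accuracy:", accuracy)

# 4. Show predictions: one probability per class, then the likeliest class.
probabilities = model.predict(test_features, verbose=0)
predicted = probabilities.argmax(axis=1)
print(predicted)
```

The softmax output layer produces one probability per species for each example, so taking the `argmax` of each row gives the predicted class code.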

Additional Resources

Data

We will be using the Iris dataset, as provided by TensorFlow. The dataset is hosted on the UCI Machine Learning Repository, which has more information about it.

There are 4 features in our data, all measured in centimeters:

  • Sepal length
  • Sepal width
  • Petal length
  • Petal width

The label for our data is the species of the Iris plant described by the measurements. The species is what the machine will be trained to predict.
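
To make the feature and label layout concrete, here is a small sketch of one labeled example. The feature order, the `describe_example` helper, and the integer code assigned to each species are assumptions made for illustration; check the notebook for the encoding it actually uses.

```python
# Species names for the three Iris classes. The order (and therefore the
# integer code for each species) is an assumption made for this sketch.
SPECIES = ["setosa", "versicolor", "virginica"]

# Feature order is also assumed; all measurements are in centimeters.
FEATURES = ["sepal_length", "sepal_width", "petal_length", "petal_width"]


def describe_example(measurements, label):
    """Return a readable description of one labeled example."""
    pairs = ", ".join(
        f"{name}={value} cm" for name, value in zip(FEATURES, measurements)
    )
    return f"{SPECIES[label]}: {pairs}"


print(describe_example([5.1, 3.5, 1.4, 0.2], 0))
```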

Logging in to the Lab Environment

We will be using Google Colab for this lab. This lab can be run for free on your personal Google account. If you use your personal account, you will keep a copy of the notebook to reference later. If you don't want to use your personal account, open a new Incognito or Private browser window.

Go to Colab and log in using either your own account or the Google account credentials provided in the lab.

Notebook

All of the code is provided in the notebook Identifying Iris Plants Using Machine Learning. Open the notebook in the window where you logged in and follow the steps in it.

What are Hands-on Labs?

Hands-on Labs are real environments created by industry experts to help you learn. These environments help you gain knowledge and experience, practice without compromising your system, test without risk, destroy without fear, and let you learn from your mistakes. Hands-on Labs: practice your skills before delivering in the real world.
