Preprocess Data with the scikit-learn Python Package

30 minutes
  • 3 Learning Objectives

About this Hands-on Lab

In this lab, we will load a dataset from a SQLite database into a pandas DataFrame. Once loaded, we will standardize the dataset using scikit-learn's `StandardScaler` class and write the result to a new table within the same SQLite database.

Basic Python programming skills will be required for this lab. If you need a refresher, check out the following course:
– [Certified Associate in Python Programming Certification](https://acloud.guru/overview/8169e8e7-91a7-4d92-b278-4dd08c787dc6)

Learning Objectives

Successfully complete this lab by achieving the following learning objectives:

Load the Data

Load the data from the provided SQLite database (data.db) into a pandas DataFrame object.
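One way to approach this step is sketched below, assuming Python's built-in `sqlite3` module and `pandas.read_sql_query`. The lab text does not name the table inside data.db, so the helper names `list_tables` and `load_table` and the table name passed to them are illustrative assumptions, not part of the lab.

```python
import sqlite3

import pandas as pd


def list_tables(conn: sqlite3.Connection) -> list:
    """Return the names of all tables in the database (hypothetical helper)."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


def load_table(conn: sqlite3.Connection, table: str) -> pd.DataFrame:
    """Read every row of one table into a pandas DataFrame (hypothetical helper)."""
    # Double quotes mark the table name as a quoted SQL identifier.
    return pd.read_sql_query(f'SELECT * FROM "{table}"', conn)
```

In the lab, you might open the database with `conn = sqlite3.connect("data.db")`, call `list_tables(conn)` to discover the table name, and then pass that name to `load_table`.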

Scale the Data

Use the StandardScaler class from the scikit-learn preprocessing module (sklearn.preprocessing) to scale the data so that each numeric column is centered on 0 with a standard deviation of 1.
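A minimal sketch of the scaling step follows. It assumes that only the numeric columns should be standardized and that any other columns pass through unchanged. The lab does not say this. The function name `scale_numeric` and the sample data are hypothetical stand-ins for the lab's table.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler


def scale_numeric(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with each numeric column standardized."""
    numeric_cols = df.select_dtypes(include="number").columns
    # fit_transform learns each column's mean and standard deviation,
    # then returns (x - mean) / std as a 2-D NumPy array.
    values = StandardScaler().fit_transform(df[numeric_cols])
    scaled = df.copy()
    for i, col in enumerate(numeric_cols):
        scaled[col] = values[:, i]
    return scaled


# Hypothetical sample data, not the lab's dataset.
sample = pd.DataFrame({"age": [20.0, 30.0, 40.0], "label": ["a", "b", "c"]})
scaled_sample = scale_numeric(sample)
```

When checking the result, note that `StandardScaler` divides by the population standard deviation, so `scaled_sample["age"].std(ddof=0)` should be close to 1, while the pandas default (`ddof=1`) gives a slightly larger value.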

Save the Data

Write the scaled dataset to a new table named data-scaled in the SQLite database.
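The save step could be sketched with `DataFrame.to_sql`, as below. The helper name `save_table` is hypothetical, and replacing the table on a re-run is an assumption rather than a stated lab requirement.

```python
import sqlite3

import pandas as pd


def save_table(
    df: pd.DataFrame, conn: sqlite3.Connection, table: str = "data-scaled"
) -> None:
    """Write df to a table in the SQLite database (hypothetical helper)."""
    # if_exists="replace" drops and recreates the table on re-runs (an assumption);
    # index=False keeps the DataFrame's row index out of the table.
    df.to_sql(table, conn, if_exists="replace", index=False)
```

Because the table name contains a hyphen, it has to be quoted when you query it directly, for example `SELECT * FROM "data-scaled"`.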

Additional Resources

The Scenario

You are working as a database admin, and just as the ruckus in the breakroom from last week has finally settled down, a junior developer finds a machine learning algorithm that promises to predict how the age of the workforce will change in the future. However, they are having issues with the scale of the data.

Thankfully, you have learned some methods to standardize data using pandas and scikit-learn in Python from the awesome courses on acloud.guru!

You will take the following steps to scale the data:

  • Load data from the SQLite database (data.db).
  • Scale data using the StandardScaler.
  • Save data to a new table in the database.

Log in to the server over SSH using the credentials provided.

The data.db file is already available in the lab instance, but if you'd like to follow along on another machine, you may download it from here.

This data was sourced from the Center for Machine Learning and Intelligent Systems. Learn more here.

What are Hands-on Labs?

Hands-on Labs are real environments created by industry experts to help you learn. These environments help you gain knowledge and experience, practice without compromising your system, test without risk, destroy without fear, and let you learn from your mistakes. Hands-on Labs: practice your skills before delivering in the real world.
