In this lab, we will load a CSV file into a pandas DataFrame. Once it is loaded, we will remove rows whose `age` is more than 3 standard deviations from the mean, and rows whose `hours-per-week` falls below the 10% quantile or above the 90% quantile. We will then write the cleaned data to a new file.
Basic Python programming skills will be required for this lab. If you need a refresher, check out the following course:
- [Certified Associate in Python Programming Certification](https://acloud.guru/overview/8169e8e7-91a7-4d92-b278-4dd08c787dc6)
Learning Objectives
Successfully complete this lab by achieving the following learning objectives:
- Load the Data File
  Load the `data.csv` file into a pandas DataFrame.
- Resolve Outlying age Values
  Remove rows with an `age` more than 3 standard deviations from the mean.
- Resolve Outlying hours-per-week Values
  Remove rows with `hours-per-week` below the 10% and above the 90% quantiles.
- Write the Data to a New File
  Write the data to a new file named `cleaned_data.csv`.
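The four objectives above can be sketched in pandas roughly as follows. This is a minimal illustration, not the lab's reference solution. The function names are hypothetical, and several choices are assumptions: the sample standard deviation (the pandas default), boundary values counted as inliers, and the quantiles computed after the `age` filter has been applied.

```python
import pandas as pd


def remove_age_outliers(df: pd.DataFrame, n_std: float = 3.0) -> pd.DataFrame:
    """Keep rows whose age is within n_std standard deviations of the mean."""
    mean = df["age"].mean()
    std = df["age"].std()  # sample standard deviation (ddof=1), the pandas default
    # Assumption: a value exactly n_std deviations away is kept.
    return df[(df["age"] - mean).abs() <= n_std * std]


def remove_hours_outliers(
    df: pd.DataFrame, lower: float = 0.10, upper: float = 0.90
) -> pd.DataFrame:
    """Keep rows whose hours-per-week lies between the two quantiles."""
    low, high = df["hours-per-week"].quantile([lower, upper])
    # Assumption: values equal to either quantile are kept (inclusive bounds).
    return df[df["hours-per-week"].between(low, high)]


def clean_file(src: str = "data.csv", dst: str = "cleaned_data.csv") -> pd.DataFrame:
    """Load src, drop both kinds of outliers, and write the result to dst."""
    df = pd.read_csv(src)
    df = remove_age_outliers(df)
    df = remove_hours_outliers(df)
    df.to_csv(dst, index=False)  # index=False avoids writing the row index as a column
    return df
```

Calling `clean_file()` with its defaults reads `data.csv` from the working directory and writes `cleaned_data.csv` next to it. Splitting the two filters into separate functions makes it easy to check each objective on its own before writing the file.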