Simple pandas Operations in Jupyter Notebook

45 minutes
  • 6 Learning Objectives

About this Hands-on Lab

Use pandas to explore a set of data and the information that can be gleaned from it. In this lab, we use `.describe` and `.head` to determine what the data looks like.

We will also use `pd.concat` to add a calculated column and boolean slicing to look for further insights into what the data shows us.

The PDF of the notebook for this lab is [here](https://github.com/linuxacademy/content-python-for-database-and-reporting/blob/master/pdf/hol_2_2_L_solution.pdf).

Learning Objectives

Successfully complete this lab by achieving the following learning objectives:

Start Jupyter Notebook Server and Access on the Local Machine

Connecting to the Jupyter Notebook Server

Make sure the virtual environment is activated!

To activate the virtual environment:

conda activate base

To start the server:

python get_notebook_token.py

This is a simple script that starts the Jupyter Notebook server and sets it to keep running outside of the terminal.

A token appears in the terminal. Copy it and save it to a text file on the local machine.

On the Local Machine

In a terminal window, enter the following:

ssh -N -L localhost:8087:localhost:8086 cloud_user@<the public IP address of the Playground server>

It will ask for a password. This is the password used to log in to the Playground remote server.

Leave this terminal open. It will appear that nothing has happened, but the terminal must remain open while using the Jupyter Notebook server in this session.

In the browser, enter http://localhost:8087 in the address bar. This will open a Jupyter Notebook site that asks for the token copied from the remote server.
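Because the SSH command prints nothing when it succeeds, it can help to confirm that the tunnel is listening before opening the browser. The following is a minimal sketch, not part of the lab: the helper name `port_is_open` is hypothetical, and it only checks that something accepts TCP connections on the local port, not that it is Jupyter.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not reachable.
        return False


if __name__ == "__main__":
    # 8087 is the local end of the SSH tunnel from the command above.
    print(port_is_open("localhost", 8087))
```

If this prints False, the SSH session has most likely closed or was started with different ports.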

Examine the Data

Open the notebook hol_2_2_l.

In the cell below, load the data into a pandas DataFrame.

import pandas as pd
restaurant_sales_data = pd.read_csv("./tips_csv.txt", header=0)
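To see what `header=0` does without needing the lab file, here is a small sketch that reads CSV text from memory. The column names are the ones used later in this lab; the rows themselves are invented for illustration, and the real `tips_csv.txt` may contain other columns.

```python
import io

import pandas as pd

# Hypothetical sample rows; the real tips_csv.txt will differ.
sample_csv = io.StringIO(
    "weekday,meal_type,wait_staff,meal_total,tip\n"
    "Monday,Lunch,Marcia,20.00,4.00\n"
    "Monday,Dinner,Greg,50.00,5.00\n"
    "Tuesday,Lunch,Jan,10.00,3.00\n"
)

# header=0 tells pandas that the first line holds the column names.
sample = pd.read_csv(sample_csv, header=0)

print(sample.shape)   # (number of rows, number of columns)
print(sample.dtypes)  # numeric columns should be float or int, not object
```

Checking `shape` and `dtypes` right after loading is a quick way to catch a file that was parsed with the wrong separator or header row.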

Examine the data using head and describe.

restaurant_sales_data.describe()
restaurant_sales_data.head()
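By default, a notebook cell displays only the value of its last expression, so when `describe` and `head` share a cell, only the `head` output is shown. One way around this, sketched here on a small invented DataFrame rather than the lab data, is to print each result explicitly.

```python
import pandas as pd

# Hypothetical values standing in for the lab data.
sales = pd.DataFrame({
    "meal_total": [20.0, 50.0, 10.0, 40.0],
    "tip": [4.0, 5.0, 3.0, 8.0],
})

# describe() summarizes each numeric column: count, mean, std,
# min, quartiles, and max.
summary = sales.describe()

# head(2) returns the first 2 rows; head() with no argument returns 5.
first_rows = sales.head(2)

# print() makes both results visible even when they share one cell.
print(summary)
print(first_rows)
```

Running the two calls in separate cells works just as well.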

Tip Percent

Create a Series with the tip percent.

percent_tip = pd.Series(restaurant_sales_data['tip']/restaurant_sales_data['meal_total'],name='tip_percent')

Create a new DataFrame containing the original data and the new tip percents.

rsd_per_tips = pd.concat([restaurant_sales_data, percent_tip], axis=1)
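The two steps above can be sketched on a few invented rows to show what each one produces. The sketch also shows `DataFrame.assign` as one alternative way to add a calculated column. This is not the lab's solution, only a comparison under the assumption that the data has `tip` and `meal_total` columns.

```python
import pandas as pd

# Hypothetical rows standing in for the lab data.
sales = pd.DataFrame({
    "meal_total": [20.0, 50.0, 10.0],
    "tip": [4.0, 5.0, 3.0],
})

# Dividing one column by another works element by element
# and returns a Series.
percent_tip = pd.Series(sales["tip"] / sales["meal_total"], name="tip_percent")

# The lab's approach: concat along axis=1 adds the Series as a new
# column, matching rows by index label.
with_concat = pd.concat([sales, percent_tip], axis=1)

# An alternative: assign() returns a new DataFrame with the extra column.
with_assign = sales.assign(tip_percent=sales["tip"] / sales["meal_total"])

print(with_concat)
print(with_concat.equals(with_assign))
```

Note that `tip_percent` holds a fraction (0.25 means 25 percent), which is why the later comparison uses 0.25.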

Use head to make sure the DataFrame looks correct.

rsd_per_tips.head()

Determine how many tips are 25 percent or more.

rsd_per_tips[rsd_per_tips.tip_percent>=0.25].count()
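Boolean slicing happens in two steps: the comparison builds a True/False Series, and indexing with that Series keeps only the True rows. The sketch below walks through both steps on invented values.

```python
import pandas as pd

# Hypothetical values standing in for the lab data.
tips = pd.DataFrame({
    "wait_staff": ["Marcia", "Greg", "Jan", "Greg"],
    "tip_percent": [0.20, 0.10, 0.30, 0.25],
})

# The comparison yields a boolean Series, one True/False per row.
mask = tips["tip_percent"] >= 0.25

# Indexing with the mask keeps only the rows where it is True.
generous = tips[mask]

# count() reports the non-null values in each column, so it prints
# one number per column; len() or mask.sum() gives a single row count.
print(generous.count())
print(len(generous), int(mask.sum()))
```

When a column contains missing values, `count()` can report different numbers per column, so `len()` is the safer choice for a plain row count.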

Waitstaff

Use unique to determine the names of the waitstaff.

restaurant_sales_data.wait_staff.unique()

Tips Received

For each member of the waitstaff, determine the total tips and the average tip.

Create a DataFrame for each member of the waitstaff that holds only their own data.

marcia = restaurant_sales_data[restaurant_sales_data.wait_staff=='Marcia']
jan = restaurant_sales_data[restaurant_sales_data.wait_staff=='Jan']
greg = restaurant_sales_data[restaurant_sales_data.wait_staff=='Greg']
bobby = restaurant_sales_data[restaurant_sales_data.wait_staff=='Bobby']
peter = restaurant_sales_data[restaurant_sales_data.wait_staff=='Peter']
cindy = restaurant_sales_data[restaurant_sales_data.wait_staff=='Cindy']

Calculate and print the total tips and average tips for each staff member.

print(f"Marcia:\t ${marcia.tip.sum():.2f}\t Average: ${marcia.tip.mean():.2f}")
print(f"Jan:\t ${jan.tip.sum():.2f}\t Average: ${jan.tip.mean():.2f}")
print(f"Greg:\t ${greg.tip.sum():.2f}\t Average: ${greg.tip.mean():.2f}")
print(f"Bobby:\t ${bobby.tip.sum():.2f}\t Average: ${bobby.tip.mean():.2f}")
print(f"Peter:\t ${peter.tip.sum():.2f}\t Average: ${peter.tip.mean():.2f}")
print(f"Cindy:\t ${cindy.tip.sum():.2f}\t Average: ${cindy.tip.mean():.2f}")
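The per-person DataFrames above make each step explicit. As a possible alternative, not the lab's solution, the same totals and averages can be computed in one pass with `groupby`. The sketch assumes the `wait_staff` and `tip` columns used in this lab and runs on invented values.

```python
import pandas as pd

# Hypothetical values standing in for the lab data.
sales = pd.DataFrame({
    "wait_staff": ["Marcia", "Greg", "Marcia", "Greg", "Jan"],
    "tip": [4.0, 5.0, 6.0, 1.0, 3.0],
})

# One row per staff member, with the sum and mean of their tips.
summary = sales.groupby("wait_staff")["tip"].agg(["sum", "mean"])

# Print in the same layout as the individual print statements.
for name, row in summary.iterrows():
    print(f"{name}:\t ${row['sum']:.2f}\t Average: ${row['mean']:.2f}")
```

This version does not need the staff names in advance, so it keeps working if someone joins or leaves the waitstaff.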

Extra Credit

Determine the average tip percent based on weekday and meal type.

rsd_per_tips.groupby(['weekday', 'meal_type']).tip_percent.mean()
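Grouping by two columns returns a Series with a two-level index, one entry per weekday and meal type combination found in the data. The sketch below shows this on invented values and, as an optional extra step, uses `unstack` to pivot the result into a weekday-by-meal-type table.

```python
import pandas as pd

# Hypothetical values standing in for the lab data.
tips = pd.DataFrame({
    "weekday": ["Monday", "Monday", "Monday", "Tuesday"],
    "meal_type": ["Lunch", "Lunch", "Dinner", "Lunch"],
    "tip_percent": [0.10, 0.20, 0.25, 0.30],
})

# Average tip percent for each (weekday, meal_type) pair.
by_day_and_meal = tips.groupby(["weekday", "meal_type"])["tip_percent"].mean()
print(by_day_and_meal)

# unstack() moves meal_type from the index into columns; combinations
# that never occur in the data show up as NaN.
table = by_day_and_meal.unstack("meal_type")
print(table)
```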

Additional Resources

Examining Server Performance at a Local Restaurant

You are operating as a freelance data scientist and have been hired by the owner of a local restaurant to help her make sense of data she has been keeping. She wants to know if there are insights to make about the service staff based on sales and tips.

