Lab
A Cloud Guru

Generate a Complete Report

In this lab, you create graphs from slices of the Titanic survivability CSV data. A PDF of the notebook for this lab is available [here](https://github.com/linuxacademy/content-python-for-database-and-reporting/blob/master/pdf/hol_5_1_l_solution.pdf).


Path Info

Level: Intermediate
Duration: 1h 30m
Published: Mar 13, 2020

Challenge

Start Jupyter Notebook Server and Access on Your Local Machine
Connecting to the Jupyter Notebook Server

Make sure that you have activated the virtual environment!
1. To activate the virtual environment:
```
conda activate base
```
1. To start the server, run the following:
```
python get_notebook_token.py
```
This is a simple script that starts the Jupyter notebook server and sets it to continue to run outside of the terminal.

Note: The terminal displays a token. Copy it and save it to a text file on your local machine.

On Your Local Machine
1. In a terminal window, enter the following:
```
ssh -N -L localhost:8087:localhost:8086 cloud_user@<the public IP address of the Playground server>
```
It will ask you for your password; this is the password you used to log in to the Playground remote server. It will appear that nothing has happened, but leave this terminal open: it must remain open while you use the Jupyter Notebook server in this session.
1. In the browser of your choice, enter the following address:
http://localhost:8087

This will open a Jupyter Notebook site that asks for the token you copied from the remote server.
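If the browser cannot reach the page, it can help to confirm that something is listening on the local end of the tunnel. The following is a minimal sketch using only the standard library; the helper name `port_is_open` is hypothetical, and port 8087 is taken from the `ssh` command above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 8087 is the local end of the SSH tunnel opened in the previous step.
    if port_is_open("localhost", 8087):
        print("Tunnel is up: open http://localhost:8087")
    else:
        print("Nothing is listening on localhost:8087; check the ssh session.")
```

A successful check only shows that the port accepts connections; it does not verify that the Jupyter server behind it is healthy.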
Challenge

Import Required Packages and Create Dataframe From File
Titanic Data: Factors Affecting Survivability

This dataset was found through a web search and is available from many different organizations. It records details about each passenger on the Titanic and whether they survived the disaster.

The columns are defined as:
- PassengerId - Indexed starting at 1
- Survived - Survival (0 = No; 1 = Yes)
- Pclass - Passenger Class (1 = 1st; 2 = 2nd; 3 = 3rd)
- Name - Name
- Sex - Sex
- Age - Age
- SibSp - Number of Siblings/Spouses Aboard
- Parch - Number of Parents/Children Aboard
- Ticket - Ticket Number
- Fare - Passenger Fare
- Cabin - Cabin
- Embarked - Port of Embarkation (C = Cherbourg; Q = Queenstown; S = Southampton)

The questions we are asking:
1. What part did age play?
2. What part did gender play?
3. Did the passenger class make a difference?

Load the CSV Data Into a Dataframe
```
import matplotlib.pyplot as plt
import pandas as pd

%matplotlib inline

titanic_df = pd.read_csv('titanic.csv')

titanic_df.head()
```
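Before slicing the data, it may be worth checking for missing values, because `Series.count()` skips them. The sketch below uses a few hypothetical rows in the same column layout rather than the real `titanic.csv`, and assumes that some `Age` entries may be blank; check your own copy of the file to see whether that is the case.

```python
import io

import pandas as pd

# Hypothetical rows in the same column layout as titanic.csv (not real passenger data).
sample_csv = io.StringIO(
    "PassengerId,Survived,Pclass,Sex,Age\n"
    "1,0,3,male,22\n"
    "2,1,1,female,38\n"
    "3,1,3,female,\n"
    "4,0,2,male,\n"
)
sample_df = pd.read_csv(sample_csv)

# Count missing values per column; Age has two blanks in this sample.
missing_per_column = sample_df.isna().sum()
print(missing_per_column)

# Series.count() ignores NaN, so it can be smaller than the number of rows.
print(sample_df.Age.count(), len(sample_df))
```

On the real data, the same two calls would be `titanic_df.isna().sum()` and `titanic_df.Age.count()`. Any passenger without a recorded age drops out of every age-based slice.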

Challenge

Examine the Effect Age Had on Survivability

Examine The Effect of Age on Survivability

Under 13
13 - 24
25 - 49
50 - 74
75 and Older

#### Under 13 (Age < 13, so passengers aged 12 are not dropped)
passengers_under_13 = titanic_df[titanic_df.Age < 13]
passengers_under_13_survived = passengers_under_13[passengers_under_13.Survived == 1]
passengers_under_13_percent_survived = passengers_under_13_survived.Age.count() / passengers_under_13.Age.count()

# 13 to 24
passengers_13_to_24 = titanic_df[(titanic_df.Age >= 13) & (titanic_df.Age < 25)]
passengers_13_to_24_survived = passengers_13_to_24[passengers_13_to_24.Survived == 1]
passengers_13_to_24_percent_survived = passengers_13_to_24_survived.Age.count() / passengers_13_to_24.Age.count()

# 25 to 49
passengers_25_to_49 = titanic_df[(titanic_df.Age >= 25) & (titanic_df.Age < 50)]
passengers_25_to_49_survived = passengers_25_to_49[passengers_25_to_49.Survived == 1]
passengers_25_to_49_percent_survived = passengers_25_to_49_survived.Age.count() / passengers_25_to_49.Age.count()

# 50 to 74 (upper bound is < 75, so passengers aged 74 are not dropped)
passengers_50_to_74 = titanic_df[(titanic_df.Age >= 50) & (titanic_df.Age < 75)]
passengers_50_to_74_survived = passengers_50_to_74[passengers_50_to_74.Survived == 1]
passengers_50_to_74_percent_survived = passengers_50_to_74_survived.Age.count() / passengers_50_to_74.Age.count()

# 75 and over
passengers_75_over = titanic_df[titanic_df.Age >= 75]
passengers_75_over_survived = passengers_75_over[passengers_75_over.Survived == 1]
passengers_75_over_percent_survived = passengers_75_over_survived.Age.count() / passengers_75_over.Age.count()

print(f'Under 13:\t{passengers_under_13.Age.count()} - {passengers_under_13_percent_survived}')
print(f'13 - 24:\t{passengers_13_to_24.Age.count()} - {passengers_13_to_24_percent_survived}')
print(f'25 - 49:\t{passengers_25_to_49.Age.count()} - {passengers_25_to_49_percent_survived}')
print(f'50 - 74:\t{passengers_50_to_74.Age.count()} - {passengers_50_to_74_percent_survived}')
print(f'75 & Over:\t{passengers_75_over.Age.count()} - {passengers_75_over_percent_survived}')

# Show data as a bar chart, using the computed rates rather than hand-typed values
groups = ('Under 13', '13 - 24', '25 - 49', '50 - 74', '75 & Over')
percentages = [
    passengers_under_13_percent_survived,
    passengers_13_to_24_percent_survived,
    passengers_25_to_49_percent_survived,
    passengers_50_to_74_percent_survived,
    passengers_75_over_percent_survived,
]
plt.bar(groups, percentages, align='center', alpha=0.5)
plt.ylabel("Percent Survived")
plt.title("Titanic Survivability by Age Group")

This suggests that children under 13 may have been given some preferential treatment for lifeboats. However, it is not clear whether the data counts only those who died during the sinking itself. Some children may have been more susceptible to environmental factors, such as temperature, and died in a lifeboat.

Since there was only one passenger in the 75 & Over group, the survivability of that group is not useful and should not be considered.
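The same grouping can be expressed more compactly with `pd.cut` and `groupby`. This is a sketch under stated assumptions, not the lab's solution: it assumes the intended groups are under 13, 13 to 24, 25 to 49, 50 to 74, and 75 and over, and it runs on hypothetical rows rather than the real data. The `too_small` column and its threshold of 2 are illustrative only.

```python
import pandas as pd

# Hypothetical sample rows (not real passenger data), standing in for titanic_df.
sample_df = pd.DataFrame(
    {
        "Age": [4, 12, 19, 30, 45, 62, 74, 80, None],
        "Survived": [1, 1, 0, 1, 0, 0, 0, 1, 1],
    }
)

# Left-closed bins: [0, 13), [13, 25), [25, 50), [50, 75), [75, inf).
bin_edges = [0, 13, 25, 50, 75, float("inf")]
bin_labels = ["Under 13", "13 - 24", "25 - 49", "50 - 74", "75 & Over"]
age_group = pd.cut(sample_df["Age"], bins=bin_edges, labels=bin_labels, right=False)

# Rows with a missing Age get no group and drop out of the groupby.
summary = (
    sample_df.groupby(age_group, observed=False)["Survived"]
    .agg(passengers="count", survival_rate="mean")
)

# Flag groups too small to interpret; the threshold of 2 is arbitrary.
summary["too_small"] = summary["passengers"] < 2
print(summary)
```

Because every age falls into exactly one bin, this approach avoids gaps at the boundaries, and the passenger count sits next to each rate so that single-passenger groups are easy to spot.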

Challenge

Examine the Effect Gender Had on Survivability

Examine the Effect of Gender on Survivability

#### Male
passengers_male = titanic_df[titanic_df.Sex == "male"]
passengers_male_survived = passengers_male[passengers_male.Survived == 1]
passengers_male_percent_survived = passengers_male_survived.Sex.count() / passengers_male.Sex.count()

#### Female
passengers_female = titanic_df[titanic_df.Sex == "female"]
passengers_female_survived = passengers_female[passengers_female.Survived == 1]
passengers_female_percent_survived = passengers_female_survived.Sex.count() / passengers_female.Sex.count()

print(f'Male:\t{passengers_male.Sex.count()} - {passengers_male_percent_survived}')
print(f'Female:\t{passengers_female.Sex.count()} - {passengers_female_percent_survived}')

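# Show data as a bar chart, using the computed rates rather than hand-typed values
groups = ('Male', 'Female')
percentages = [passengers_male_percent_survived, passengers_female_percent_survived]
plt.bar(groups, percentages, align='center', alpha=0.5)
plt.ylabel("Percent Survived")
plt.title("Titanic Survivability by Gender")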

Female passengers survived at a much higher rate than male passengers, which is consistent with women being given preference for lifeboats. It would be interesting to break down the male survivors by age group. Hypothesis: Younger males survived at a higher rate.
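One way to start testing that hypothesis is to split the male passengers into age bands and compare survival rates. The sketch below runs on hypothetical rows, not the real data, and the cutoff of 13 is an assumption borrowed from the age groups used earlier.

```python
import pandas as pd

# Hypothetical sample rows (not real passenger data), standing in for titanic_df.
sample_df = pd.DataFrame(
    {
        "Sex": ["male", "male", "male", "male", "female", "female"],
        "Age": [5, 9, 30, 40, 8, 35],
        "Survived": [1, 0, 0, 0, 1, 1],
    }
)

# The mean of a 0/1 column is the fraction that survived.
rate_by_sex = sample_df.groupby("Sex")["Survived"].mean()
print(rate_by_sex)

# Keep only male passengers with a recorded age, then label each by age band.
males = sample_df[sample_df["Sex"] == "male"].dropna(subset=["Age"])
age_band = (males["Age"] < 13).map({True: "Under 13", False: "13 and over"})
male_rate_by_band = males.groupby(age_band)["Survived"].mean()
print(male_rate_by_band)
```

Dropping rows with a missing age first matters here: a comparison such as `NaN < 13` evaluates to `False`, which would otherwise place those passengers in the older band.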

Challenge

Examine the Effect Passenger Class Had on Survivability

Examine the Effect of Passenger Class on Survivability

#### Passenger Class 1
passengers_class_1 = titanic_df[titanic_df.Pclass == 1]
passengers_class_1_survived = passengers_class_1[passengers_class_1.Survived == 1]
passengers_class_1_percent_survived = passengers_class_1_survived.Pclass.count() / passengers_class_1.Pclass.count()

#### Passenger Class 2
passengers_class_2 = titanic_df[titanic_df.Pclass == 2]
passengers_class_2_survived = passengers_class_2[passengers_class_2.Survived == 1]
passengers_class_2_percent_survived = passengers_class_2_survived.Pclass.count() / passengers_class_2.Pclass.count()

#### Passenger Class 3
passengers_class_3 = titanic_df[titanic_df.Pclass == 3]
passengers_class_3_survived = passengers_class_3[passengers_class_3.Survived == 1]
passengers_class_3_percent_survived = passengers_class_3_survived.Pclass.count() / passengers_class_3.Pclass.count()


print(f'Class 1:\t{passengers_class_1.Pclass.count()} - {passengers_class_1_percent_survived}')
print(f'Class 2:\t{passengers_class_2.Pclass.count()} - {passengers_class_2_percent_survived}')
print(f'Class 3:\t{passengers_class_3.Pclass.count()} - {passengers_class_3_percent_survived}')

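# Show data as a bar chart, using the computed rates rather than hand-typed values
groups = ('Class 1', 'Class 2', 'Class 3')
percentages = [
    passengers_class_1_percent_survived,
    passengers_class_2_percent_survived,
    passengers_class_3_percent_survived,
]
plt.bar(groups, percentages, align='center', alpha=0.5)
plt.ylabel("Percent Survived")
plt.title("Titanic Survivability by Passenger Class")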

It is clear that Class 1 passengers were more likely to be saved. Whether this was because they were closer to the lifeboats or because of a genuine preference cannot be determined from this data. Once again, looking at this data by age and gender would be interesting for further study.
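As a starting point for that further study, a pivot table can show survival rates for two attributes at once. This is a minimal sketch on hypothetical rows, not the real data; it combines passenger class with gender, and `pd.crosstab` supplies the group sizes behind each rate.

```python
import pandas as pd

# Hypothetical sample rows (not real passenger data), standing in for titanic_df.
sample_df = pd.DataFrame(
    {
        "Pclass": [1, 1, 1, 2, 2, 3, 3, 3],
        "Sex": ["female", "male", "male", "female", "male", "female", "male", "male"],
        "Survived": [1, 1, 0, 1, 0, 0, 0, 0],
    }
)

# Rows are passenger classes, columns are sexes, cells are survival rates.
rate_by_class_and_sex = sample_df.pivot_table(
    values="Survived", index="Pclass", columns="Sex", aggfunc="mean"
)
print(rate_by_class_and_sex)

# Group sizes for the same breakdown, to judge how much weight each rate deserves.
counts = pd.crosstab(sample_df["Pclass"], sample_df["Sex"])
print(counts)
```

Reading the rates alongside the counts helps avoid over-interpreting cells that contain only a handful of passengers.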

This is not an exhaustive review of the data available, but a simple review based on three independent attributes. Much more data could be analyzed for deeper, more specific ideas of how the surviving passengers were selected.

Author

A Cloud Guru

The Cloud Content team comprises subject matter experts hyper focused on services offered by the leading cloud vendors (AWS, GCP, and Azure), as well as cloud-related technologies such as Linux and DevOps. The team is thrilled to share their knowledge to help you build modern tech solutions from the ground up, secure and optimize your environments, and so much more!
