An Analysis of Super Bowl TV Shows

1. TV, halftime shows, and the Big Game

# Import pandas
import pandas as pd

# Load the CSV data into DataFrames
super_bowls = pd.read_csv('datasets/super_bowls.csv')
tv = pd.read_csv('datasets/tv.csv')
halftime_musicians = pd.read_csv('datasets/halftime_musicians.csv')

# Display the first five rows of each DataFrame
display(super_bowls.head())
display(tv.head())
display(halftime_musicians.head())


%%nose
# %%nose needs to be included at the beginning of every @tests cell

# One or more tests of the student's code
# The @solution should pass the tests
# The purpose of the tests is to try to catch common errors and
# to give the student a hint on how to resolve these errors

def test_pandas_loaded():
    assert 'pd' in globals(), \
    'Did you import the pandas module under the alias pd?'
    
def test_super_bowls_correctly_loaded():
    correct_super_bowls = pd.read_csv('datasets/super_bowls.csv')
    assert correct_super_bowls.equals(super_bowls), "The variable super_bowls does not contain the data in super_bowls.csv."

def test_tv_correctly_loaded():
    correct_tv = pd.read_csv('datasets/tv.csv')
    assert correct_tv.equals(tv), "The variable tv does not contain the data in tv.csv."
    
def test_halftime_musicians_correctly_loaded():
    correct_halftime_musicians = pd.read_csv('datasets/halftime_musicians.csv')
    assert correct_halftime_musicians.equals(halftime_musicians), "The variable halftime_musicians does not contain the data in halftime_musicians.csv."

4/4 tests passed

2. Taking note of dataset issues

# Summary of the TV data to inspect
tv.info()

print('\n')

# Summary of the halftime musician data to inspect
halftime_musicians.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 53 entries, 0 to 52
Data columns (total 9 columns):
super_bowl          53 non-null int64
network             53 non-null object
avg_us_viewers      53 non-null int64
total_us_viewers    15 non-null float64
rating_household    53 non-null float64
share_household     53 non-null int64
rating_18_49        15 non-null float64
share_18_49         6 non-null float64
ad_cost             53 non-null int64
dtypes: float64(4), int64(4), object(1)
memory usage: 3.8+ KB


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 134 entries, 0 to 133
Data columns (total 3 columns):
super_bowl    134 non-null int64
musician      134 non-null object
num_songs     88 non-null float64
dtypes: float64(1), int64(1), object(1)
memory usage: 3.2+ KB
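The summaries above show several columns with far fewer non-null entries than rows (for example, total_us_viewers has 15 of 53 and num_songs has 88 of 134). One way to tally those gaps directly is sketched below. The toy_tv DataFrame and its values are made up for illustration and stand in for the real tv data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the tv DataFrame; all values are invented
toy_tv = pd.DataFrame({
    'super_bowl': [3, 2, 1],
    'avg_us_viewers': [100, 90, 80],
    'total_us_viewers': [150.0, np.nan, np.nan],
})

# Count the missing values in each column
missing_counts = toy_tv.isna().sum()
print(missing_counts)
```

Applied to the real DataFrames, the same call would be tv.isna().sum() or halftime_musicians.isna().sum().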

%%nose
# %%nose needs to be included at the beginning of every @tests cell

# One or more tests of the student's code
# The @solution should pass the tests
# The purpose of the tests is to try to catch common errors and
# to give the student a hint on how to resolve these errors

# def test_nothing_task_2():
#     assert True, "Nothing to test."

last_input = In[-2]

def test_not_empty_task_2():
    assert "# ... YOUR CODE FOR TASK" not in last_input, \
        "It appears that # ... YOUR CODE FOR TASK X ... is still in the code cell, which suggests that you might not have attempted the code for this task. If you have, please delete # ... YOUR CODE FOR TASK X ... from the cell and resubmit."

1/1 tests passed

3. Combined points distribution

# Import matplotlib and set plotting style
from matplotlib import pyplot as plt
%matplotlib inline
# Note: matplotlib 3.6+ renames this style to 'seaborn-v0_8'
plt.style.use('seaborn')

# Plot a histogram of combined points
plt.hist(super_bowls['combined_pts'])
plt.xlabel('Combined Points')
plt.ylabel('Number of Super Bowls')
plt.show()

# Display the Super Bowls with the highest and lowest combined scores
display(super_bowls[super_bowls['combined_pts'] > 70])
display(super_bowls[super_bowls['combined_pts'] < 25])

[Figure: histogram of combined points; x-axis: Combined Points, y-axis: Number of Super Bowls]
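The thresholds 70 and 25 used above are read off the histogram by eye. A possible alternative, sketched here on a made-up toy_super_bowls DataFrame (the values are invented, not taken from super_bowls.csv), is to ask pandas for the extreme rows directly with nlargest and nsmallest.

```python
import pandas as pd

# Hypothetical stand-in for the super_bowls DataFrame; all values are invented
toy_super_bowls = pd.DataFrame({
    'super_bowl': [1, 2, 3, 4, 5],
    'combined_pts': [45, 75, 21, 50, 61],
})

# Two highest-scoring and two lowest-scoring games, without a hand-picked threshold
highest = toy_super_bowls.nlargest(2, 'combined_pts')
lowest = toy_super_bowls.nsmallest(2, 'combined_pts')
print(highest)
print(lowest)
```

The trade-off is that a fixed row count replaces a fixed score cutoff, so the number of games shown no longer depends on the data.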

%%nose
# %%nose needs to be included at the beginning of every @tests cell

# One or more tests of the student's code
# The @solution should pass the tests
# The purpose of the tests is to try to catch common errors and
# to give the student a hint on how to resolve these errors

def test_matplotlib_loaded():
    assert 'plt' in globals(), \
    'Did you import the pyplot module from matplotlib under the alias plt?'

1/1 tests passed

4. Point difference distribution

# Plot a histogram of point differences
plt.hist(super_bowls.difference_pts)
plt.xlabel('Point Difference')
plt.ylabel('Number of Super Bowls')
plt.show()



# Display the closest game(s) and biggest blowouts
display(super_bowls[super_bowls.difference_pts == 1])
display(super_bowls[super_bowls.difference_pts >= 35])

[Figure: histogram of point differences; x-axis: Point Difference, y-axis: Number of Super Bowls]

%%nose
# %%nose needs to be included at the beginning of every @tests cell

# One or more tests of the student's code
# The @solution should pass the tests
# The purpose of the tests is to try to catch common errors and
# to give the student a hint on how to resolve these errors

last_input = In[-2]

def test_not_empty_task_4():
    assert "# ... YOUR CODE FOR TASK" not in last_input, \
        "It appears that # ... YOUR CODE FOR TASK X ... is still in the code cell, which suggests that you might not have attempted the code for this task. If you have, please delete # ... YOUR CODE FOR TASK X ... from the cell and resubmit."

1/1 tests passed

5. Do blowouts translate to lost viewers?

# Join game and TV data, filtering out SB I because it was split over two networks
games_tv = pd.merge(tv[tv['super_bowl'] > 1], super_bowls, on='super_bowl')

# Import seaborn
import seaborn as sns


# Create a scatter plot with a linear regression model fit
sns.regplot(x='difference_pts', y='share_household', data=games_tv)

<matplotlib.axes._subplots.AxesSubplot at 0x7f35cfaaa5c0>



[Figure: scatter plot of difference_pts against share_household with a fitted linear regression line]
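The regression line gives a visual impression of the relationship. A rough numeric companion is the Pearson correlation between the two columns. The sketch below repeats the merge-and-filter step on two small made-up tables (toy_tv and toy_super_bowls are hypothetical, with invented values) and then computes the coefficient with Series.corr.

```python
import pandas as pd

# Hypothetical stand-ins for tv and super_bowls; all values are invented
toy_tv = pd.DataFrame({
    'super_bowl': [5, 4, 3, 2, 1],
    'share_household': [63, 66, 68, 70, 79],
})
toy_super_bowls = pd.DataFrame({
    'super_bowl': [5, 4, 3, 2, 1],
    'difference_pts': [35, 20, 10, 1, 25],
})

# Same pattern as above: drop Super Bowl I from the TV data, then join on super_bowl
toy_games_tv = pd.merge(toy_tv[toy_tv['super_bowl'] > 1], toy_super_bowls, on='super_bowl')

# Pearson correlation between point difference and household share
r = toy_games_tv['difference_pts'].corr(toy_games_tv['share_household'])
print(len(toy_games_tv), round(r, 3))
```

A negative coefficient on the real games_tv data would be consistent with a downward-sloping regression line, though it says nothing about causation.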

%%nose
# %%nose needs to be included at the beginning of every @tests cell

# One or more tests of the student's code
# The @solution should pass the tests
# The purpose of the tests is to try to catch common errors and
# to give the student a hint on how to resolve these errors

last_value = _

def test_seaborn_loaded():
    assert 'sns' in globals(), \
    'Did you import the seaborn module under the alias sns?'

def test_plot_exists_5():
    try:
        assert type(last_value) == type(sns.regplot(x='difference_pts', y='share_household', data=games_tv))
    except AssertionError:
        assert False, 'A plot was not the last output of the code cell.'

2/2 tests passed





6. Viewership and the ad industry over time

# Create a figure with 3x1 subplot and activate the top subplot
plt.subplot(3, 1, 1)
plt.plot(tv.super_bowl, tv.avg_us_viewers, color='#648FFF')
plt.title('Average Number of US Viewers')

# Activate the middle subplot
plt.subplot(3, 1, 2)
plt.plot(tv.super_bowl, tv.rating_household, color='#DC267F')
plt.title('Household Rating')

# Activate the bottom subplot
plt.subplot(3, 1, 3)
plt.plot(tv.super_bowl, tv.ad_cost, color='#FFB000')
plt.title('Ad Cost')
plt.xlabel('SUPER BOWL')

# Improve the spacing between subplots
plt.tight_layout()

[Figure: three stacked line plots against Super Bowl number, titled Average Number of US Viewers, Household Rating, and Ad Cost]

%%nose
# %%nose needs to be included at the beginning of every @tests cell

# One or more tests of the student's code
# The @solution should pass the tests
# The purpose of the tests is to try to catch common errors and
# to give the student a hint on how to resolve these errors

last_input = In[-2]

def test_not_empty_task_6():
    assert "# ... YOUR CODE FOR TASK" not in last_input, \
        "It appears that # ... YOUR CODE FOR TASK X ... is still in the code cell, which suggests that you might not have attempted the code for this task. If you have, please delete # ... YOUR CODE FOR TASK X ... from the cell and resubmit."

1/1 tests passed

7. Halftime shows weren’t always this great

# Display all halftime musicians for Super Bowls up to and including Super Bowl XXVII
halftime_musicians[halftime_musicians["super_bowl"] <= 27]


%%nose
# %%nose needs to be included at the beginning of every @tests cell

# One or more tests of the student's code
# The @solution should pass the tests
# The purpose of the tests is to try to catch common errors and
# to give the student a hint on how to resolve these errors

last_value = _
        
def test_head_output():
    try:
        assert "Wynonna Judd" not in last_value.to_string()
    except AttributeError:
        assert False, "Please do not use the display() or print() functions to display the filtered DataFrame. Write your line of code as the last line in the cell instead."
    except AssertionError:
        assert False, "Hmm, it seems halftime_musicians wasn't filtered correctly and/or displayed as the last output of the cell. Michael Jackson's performance should be the first row displayed."

1/1 tests passed

8. Who has the most halftime show appearances?

# Count halftime show appearances for each musician and sort them from most to least
halftime_appearances = halftime_musicians.groupby('musician').count()['super_bowl'].reset_index()
halftime_appearances = halftime_appearances.sort_values('super_bowl', ascending=False)

# Display musicians with more than one halftime show appearance
halftime_appearances[halftime_appearances["super_bowl"] >= 2]

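The groupby, count, and sort steps above can also be written with value_counts, which counts occurrences and sorts them in descending order in one call. The sketch below uses a made-up toy_musicians DataFrame with placeholder artist names, not the real halftime data.

```python
import pandas as pd

# Hypothetical stand-in for halftime_musicians; names and numbers are invented
toy_musicians = pd.DataFrame({
    'super_bowl': [52, 38, 35, 35, 50],
    'musician': ['Artist A', 'Artist A', 'Artist B', 'Artist C', 'Artist D'],
})

# Count appearances per musician, sorted from most to least
appearances = toy_musicians['musician'].value_counts()

# Keep only musicians with more than one appearance
repeat_acts = appearances[appearances >= 2]
print(repeat_acts)
```

The result here is a Series indexed by musician rather than a DataFrame, so it is a different shape from halftime_appearances even though the counts agree.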

%%nose
# %%nose needs to be included at the beginning of every @tests cell

# One or more tests of the student's code
# The @solution should pass the tests
# The purpose of the tests is to try to catch common errors and
# to give the student a hint on how to resolve these errors

last_value = _

def test_filter_correct_8():
    try:
        assert len(last_value) == 14
    except TypeError:
        assert False, "Hmm, it seems halftime_appearances wasn't filtered correctly and/or displayed as the last line of code in the cell (i.e., displayed without the display() or print() functions). There should be 14 repeat halftime show acts."
    except AssertionError:
        assert False, "Hmm, it seems halftime_appearances wasn't filtered correctly. There should be 14 repeat halftime show acts."

1/1 tests passed

9. Who performed the most songs in a halftime show?

# Filter out most marching bands
no_bands = halftime_musicians[~halftime_musicians.musician.str.contains('Marching')]
no_bands = no_bands[~no_bands.musician.str.contains('Spirit')]

# Plot a histogram of number of songs per performance
most_songs = int(no_bands['num_songs'].max())  # Series.max() skips missing values
plt.hist(no_bands.num_songs.dropna(), bins=most_songs)
plt.xlabel('Number of Songs Per Halftime Show Performance')
plt.ylabel('Number of Musicians')
plt.show()

# Sort the non-band musicians by number of songs per appearance...
no_bands = no_bands.sort_values('num_songs', ascending=False)
# ...and display the top 15
display(no_bands.head(15))

[Figure: histogram of songs per performance; x-axis: Number of Songs Per Halftime Show Performance, y-axis: Number of Musicians]
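The two str.contains filters above could be merged into one, since str.contains treats its pattern as a regular expression by default. The sketch below shows that variant on a made-up toy_musicians DataFrame whose names and song counts are placeholders.

```python
import pandas as pd

# Hypothetical stand-in for halftime_musicians; names and numbers are invented
toy_musicians = pd.DataFrame({
    'musician': ['Artist A', 'State University Marching Band',
                 'Spirit of Somewhere', 'Artist B'],
    'num_songs': [5.0, None, None, 3.0],
})

# One regex pattern matches either keyword
is_band = toy_musicians['musician'].str.contains('Marching|Spirit')

# Keep only the rows that did not match
toy_no_bands = toy_musicians[~is_band]
print(toy_no_bands)
```

Like the original filter, this only removes bands whose names contain one of the two keywords, so some bands may still slip through.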

%%nose
# %%nose needs to be included at the beginning of every @tests cell

# One or more tests of the student's code
# The @solution should pass the tests
# The purpose of the tests is to try to catch common errors and
# to give the student a hint on how to resolve these errors

last_input = In[-2]

def test_not_empty_task_9():
    assert "# ... YOUR CODE FOR TASK" not in last_input, \
        "It appears that # ... YOUR CODE FOR TASK X ... is still in the code cell, which suggests that you might not have attempted the code for this task. If you have, please delete # ... YOUR CODE FOR TASK X ... from the cell and resubmit."

1/1 tests passed

10. Conclusion

# 2018-2019 conference champions
patriots = 'New England Patriots'
rams = 'Los Angeles Rams'

# Who will win Super Bowl LIII?
super_bowl_LIII_winner = patriots
print('The winner of Super Bowl LIII will be the', super_bowl_LIII_winner)

The winner of Super Bowl LIII will be the New England Patriots

%%nose
# %%nose needs to be included at the beginning of every @tests cell

# One or more tests of the student's code
# The @solution should pass the tests
# The purpose of the tests is to try to catch common errors and
# to give the student a hint on how to resolve these errors

def test_valid_winner_chosen():
    assert super_bowl_LIII_winner == 'New England Patriots' or super_bowl_LIII_winner == 'Los Angeles Rams', \
    "It appears a valid potential winner was not selected. Please assign the patriots variable or the rams variable to super_bowl_LIII_winner."

1/1 tests passed
