Week 7 HW

Author: Ben Akyrueklier
Affiliation: George Washington University
Published: October 13, 2025

Code
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import altair as alt
from sklearn.datasets import load_iris
import plotly.express as px
import plotly.io as pio
pio.renderers.default='plotly_mimetype+notebook_connected'
import warnings
warnings.filterwarnings('ignore')
Code
wb = pd.read_csv("Data/WBnew.csv")
new_column_names = {'2015 [YR2015]': '2015', '2016 [YR2016]': '2016', '2017 [YR2017]': '2017', '2018 [YR2018]': '2018', '2019 [YR2019]': '2019'}
wb1519 = wb.rename(columns=new_column_names)
wb1519 = wb1519.drop(columns=['2005 [YR2005]', '2006 [YR2006]', '2007 [YR2007]', '2008 [YR2008]', '2009 [YR2009]', '2010 [YR2010]', '2011 [YR2011]', '2012 [YR2012]', '2013 [YR2013]', '2014 [YR2014]', '2020 [YR2020]', '2021 [YR2021]', '2022 [YR2022]', '2023 [YR2023]', '2024 [YR2024]'])
wbmelt = pd.melt(wb1519, id_vars=['Country Name','Series Name'], value_vars=['2015', '2016', '2017', '2018', '2019'], var_name='Year', value_name='Value')
wbmelt = wbmelt[wbmelt['Country Name'].isin(['Japan', 'France', 'Brazil', 'United States', 'Canada', 'China'])]
wbmelt = wbmelt.dropna()
wbpivot = wbmelt.pivot(index=['Country Name', 'Year'], columns='Series Name', values='Value').reset_index()
wbpivot = wbpivot.dropna(axis=1, how='all')
wbpivot.head()
Series Name | Country Name | Year | GDP per capita (current US$) | Hospital beds (per 1,000 people) | Income share held by highest 10% | Life expectancy at birth, total (years) | Net migration | Real interest rate (%) | Researchers in R&D (per million people) | Secure Internet servers (per 1 million people)
0 | Brazil | 2015 | 8936.19661712113 | 2.35 | 40.9 | 75.106 | -173611 | 33.8323439727973 | .. | 161.164815967859
1 | Brazil | 2016 | 8836.28652735657 | 2.32 | 42.1 | 75.081 | -92989 | 40.6983614262467 | .. | 415.986539467638
2 | Brazil | 2017 | 10080.5092819305 | 2.3 | 42 | 75.383 | -156296 | 41.7138078856955 | .. | 1605.82544177505
3 | Brazil | 2018 | 9300.66164923219 | 2.26 | 42.5 | 75.633 | -230334 | 33.1023342519639 | .. | 2069.60200203718
4 | Brazil | 2019 | 9029.83326681073 | 2.24 | 42 | 75.809 | -129216 | 31.9030727578921 | .. | 2788.39613470957
Code
# Take a copy so the numeric conversions below modify this frame, not a view of wbmelt
hos = wbmelt[wbmelt['Series Name'] == 'Hospital beds (per 1,000 people)'].copy()
hos['Value'] = pd.to_numeric(hos['Value'])
hos['Year'] = pd.to_numeric(hos['Year'])
hos = hos[hos['Country Name'].isin(['France', 'Brazil', 'United States', 'Canada', 'China'])]
brazil = hos[hos['Country Name'] == 'Brazil']
china = hos[hos['Country Name'] == 'China']
canada = hos[hos['Country Name'] == 'Canada']
usa = hos[hos['Country Name'] == 'United States']
france = hos[hos['Country Name'] == 'France']
fig, ax = plt.subplots()
ax.plot(brazil['Year'], brazil['Value'], marker='o', label='Brazil')
ax.plot(china['Year'], china['Value'], marker='o', label='China')
ax.plot(canada['Year'], canada['Value'], marker='o', label='Canada')
ax.plot(usa['Year'], usa['Value'], marker='o', label='United States')
ax.plot(france['Year'], france['Value'], marker='o', label='France')
ax.set_xlabel('Year')
ax.set_ylabel('Hospital beds (per 1,000 people)')
ax.set_title('Hospital Beds per 1,000 People (2015–2019)')
ax.set_xticks(sorted(hos['Year'].unique()))  # one tick per year, without duplicates
def annotate_changes(ax, df, y_offset, color):
    """Label each year-over-year change midway between two consecutive points."""
    for i in range(len(df) - 1):
        delta = df['Value'].iloc[i + 1] - df['Value'].iloc[i]
        ax.text(df['Year'].iloc[i] + 0.5,
                df['Value'].iloc[i] + y_offset,
                f"{delta:+.2f}",  # explicit +/- sign, two decimal places
                ha='center', color=color)

annotate_changes(ax, china, y_offset=0.3, color='green')   # labels above China's line
annotate_changes(ax, france, y_offset=-0.3, color='red')   # labels below France's line
ax.legend(loc='center left', bbox_to_anchor=(1, 0.5))
plt.show()

Code
# Copy the 2019 rows so the numeric conversion does not act on a view of wbpivot
wb19 = wbpivot[wbpivot['Year'] == '2019'].copy()
wb19["Life expectancy at birth, total (years)"] = pd.to_numeric(wb19["Life expectancy at birth, total (years)"])
life_sort = wb19.sort_values("Life expectancy at birth, total (years)")
x = life_sort["Country Name"]
y = life_sort["Life expectancy at birth, total (years)"]

plt.plot(x, y, 'o')
plt.ylim(y.min()*0.9, y.max()*1.1)
plt.ylabel("Life Expectancy at Birth (Years)") 
plt.xlabel("Country")
plt.title("Life Expectancy (2019)")
for xi, yi in zip(x, y):
    plt.text(xi, yi+0.6, str(round(yi, 2)), ha='center')
plt.show()

Code
interest=wb1519[wb1519['Country Name'].isin(['Australia', 'United States', 'China'])]
interest=interest[interest['Series Name'].isin(['Real interest rate (%)'])]
interest=interest.drop(columns=['Country Code', 'Series Code', 'Series Name'])
interest
  | Country Name | 2015 | 2016 | 2017 | 2018 | 2019
87 | Australia | 6.23777334791279 | 6.12460674928936 | 1.53619552626779 | 3.32958805124833 | 1.57806683471944
335 | China | 4.20696426973205 | 2.83073632160298 | 0.19585733630837 | 0.852537111572747 | 2.99509285161556
1655 | United States | 2.31051464203483 | 2.53723230004362 | 2.26529637832404 | 2.55474987327647 | 3.57306216647089
Code
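year_cols = ['2015', '2016', '2017', '2018', '2019']
interest_num = interest.copy()
# The year columns hold strings, and '{:.4}' applied to a string keeps its first
# four characters instead of rounding, so convert to numbers before formatting.
interest_num[year_cols] = interest_num[year_cols].apply(pd.to_numeric)
styled = (
    interest_num.style
    .format({col: '{:.2f}%' for col in year_cols})
    .set_caption('Real Interest Rate (%)')
    .set_table_styles([{"selector": "th", "props": [("text-align", "center")]},
                       {"selector": "td", "props": [("text-align", "left")]}]))
styled
Table 1: Real Interest Rate (%)
  | Country Name | 2015 | 2016 | 2017 | 2018 | 2019
87 | Australia | 6.24% | 6.12% | 1.54% | 3.33% | 1.58%
335 | China | 4.21% | 2.83% | 0.20% | 0.85% | 3.00%
1655 | United States | 2.31% | 2.54% | 2.27% | 2.55% | 3.57%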

Final Reflection:

The first graph shows hospital beds per 1,000 people over a five-year period (2015–2019). Three countries stayed roughly flat, while China increased and France declined. The annotations show the year-over-year change for China and France. I used +/- signs and color coding (green for China, red for France) to indicate the direction of each change, and I placed each value midway between two data points to show which two years are being compared.
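
The labelling logic described above can be sketched on its own, without any plotting. This is a minimal illustration, not the code used for the figure: `signed_changes` is a hypothetical helper name, and the input values are Brazil's hospital-bed figures from the table earlier in this document, used only as sample data. It relies on Python's `+` format flag to always print the sign.

```python
def signed_changes(years, values):
    """Return (midpoint_x, label) pairs for each pair of consecutive points."""
    labels = []
    for i in range(len(values) - 1):
        delta = values[i + 1] - values[i]
        mid_x = (years[i] + years[i + 1]) / 2   # halfway between the two years
        labels.append((mid_x, f"{delta:+.2f}"))  # '+' flag forces an explicit sign
    return labels


# Sample input: Brazil, hospital beds per 1,000 people, 2015-2019
years = [2015, 2016, 2017, 2018, 2019]
beds = [2.35, 2.32, 2.30, 2.26, 2.24]
changes = signed_changes(years, beds)

for mid_x, label in changes:
    print(mid_x, label)
```

Each pair could then be handed to a text-drawing call at `x = mid_x`, which is what places the label between the two years it compares.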

The next graph is a simple sorted dot plot of life expectancy in 2019. I added a text label to each data point showing the exact value for each country. This is a small change, but it helps when comparing countries that sit close together on the graph, since viewers can read the decimal values directly.
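
As a rough sketch of the sort-then-label step, the snippet below orders a set of values and computes where each label would sit and what it would say. `labelled_points` is a hypothetical helper, and the input is a stand-in: Brazil's life expectancy by year from the table earlier in this document, not the per-country 2019 values plotted in the figure.

```python
def labelled_points(values_by_name, offset=0.6, ndigits=2):
    """Sort (name, value) pairs by value and attach a label height and label text."""
    ordered = sorted(values_by_name.items(), key=lambda item: item[1])
    return [(name, value + offset, str(round(value, ndigits)))
            for name, value in ordered]


# Stand-in input: Brazil, life expectancy at birth, keyed by year
brazil_life = {"2015": 75.106, "2016": 75.081, "2017": 75.383,
               "2018": 75.633, "2019": 75.809}
points = labelled_points(brazil_life)

for name, label_y, text in points:
    print(name, label_y, text)
```

The `offset` lifts each label slightly above its marker so the text does not overlap the dot.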

The styled table has a few key components that make it a proper visualization. First, the values are shown to two decimal places with a percent sign appended, which makes the units clear. Second, header cells are centered and data cells are left-aligned, which makes the table look more organized and easier to read. Lastly, I added a caption to show what the table represents. These three small details are what make the table adequate compared to the raw, unedited version.
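
One detail worth checking when shortening values for display is whether they are stored as text or as numbers, because the same kind of format spec behaves differently in each case. The sketch below assumes the value arrives as a string (which can happen when a CSV column mixes numbers with placeholders such as `..`); the sample value is Australia's 2017 figure from the raw table above.

```python
raw = "1.53619552626779"  # Australia, 2017, stored as text

# On a string, a precision of 4 keeps the first four characters (truncation)
truncated = "{:.4}%".format(raw)

# On a float, '.2f' rounds to two decimal places
rounded = "{:.2f}%".format(float(raw))

print(truncated)  # 1.53%
print(rounded)    # 1.54%
```

The two results differ in the last digit, so converting to a numeric type first is the safer choice whenever the intent is to round.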