Week 8 HW

Author: Ben Akyrueklier
Affiliation: George Washington University
Published: October 27, 2025

Code
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import altair as alt
import plotly.express as px
import plotly.io as pio
import geopandas as gpd
import folium
pio.renderers.default='plotly_mimetype+notebook_connected'
import warnings
warnings.filterwarnings('ignore')
Code
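# Load the state polygons and draw a plain outline map.
gdf = gpd.read_file("data/mapUSmain.geojson")
print(gdf.crs)  # a bare `gdf.crs` mid-cell displays nothing, so print it explicitly
ax = gdf.plot(color='lightgray', edgecolor='black')
ax.set_axis_off()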

Code
att = pd.read_csv("data/crime.csv")
att.head()
Geo_FIPS Geo_Name Geo_QName SE_T003_001 SE_T003_002 SE_T003_003 SE_T005_001 SE_T005_002 SE_T005_003 SE_T005_004 ... SE_NV008_034 SE_NV008_035 SE_NV008_036 SE_NV008_037 SE_NV008_038 SE_NV008_039 SE_NV008_040 SE_NV008_041 SE_NV008_042 SE_NV008_043
0 1 Alabama Alabama 3485.891074 414.486232 3071.404842 414.486232 5.567726 39.902033 94.527606 ... 0.123727 0.309318 2.392060 0.680500 3.175666 0.000000 10.558057 0.0 0.000000 0.000000
1 4 Arizona Arizona 3504.724961 384.729430 3119.995531 384.729430 4.530947 47.181275 91.747971 ... 3.624758 3.416780 41.536161 0.000000 24.704805 0.668500 54.549636 0.0 22.209070 38.594759
2 5 Arkansas Arkansas 3579.965945 452.876901 3127.089044 452.876901 5.596067 56.837164 63.950237 ... 0.101134 2.292365 9.742551 3.236280 20.833551 0.000000 62.972611 0.0 7.821009 10.383064
3 6 California California 2831.258295 390.603698 2440.654597 390.603698 4.381161 24.222666 125.440371 ... 0.012886 1.360737 5.662006 3.999742 7.679918 0.445848 40.961278 0.0 8.342246 7.945364
4 8 Colorado Colorado 2810.805946 304.712627 2506.093319 304.712627 2.763325 56.946906 56.480129 ... 1.064254 4.742464 37.734327 0.970898 42.850213 0.000000 140.780221 0.0 30.433920 60.289036

5 rows × 154 columns

Code
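# Join the crime attributes to the state polygons by state name.
polys = gdf.copy()
g = pd.merge(polys, att, left_on='NAME', right_on='Geo_Name', how='left')
# Drop D.C. from the mapped data so its extreme rates do not dominate the color scale.
g = g[g["NAME"] != "District of Columbia"]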
Code
# Reproject to USA Contiguous Albers Equal Area Conic (ESRI:102003).
g = g.to_crs("ESRI:102003")
g["Violent Crimes (per 100k people)"] = g["SE_T005_001"]

# Quantile classification with k=5 puts roughly a fifth of the states in each color class.
ax = g.plot(column="Violent Crimes (per 100k people)", cmap="OrRd", scheme="Quantiles", k=5,
            legend=True, legend_kwds=dict(loc='upper left', bbox_to_anchor=(1.05, 1)),
            edgecolor="white", linewidth=0.2)
ax.set_title("Violent Crimes (per 100k people - 2014)")
ax.set_axis_off()

Code
g["Murder Rate (per 100k people)"] = g["SE_T005_002"]

b=g.explore(
    column="Murder Rate (per 100k people)",
    tooltip=["NAME", "Murder Rate (per 100k people)"],
    cmap="YlOrBr",
    style_kwds=dict(color='black')
)
folium.Marker(
    location=[38.90003175872419, -77.0486635749773],
    tooltip="D.C. Statistics",
    popup="D.C. Murder Rate: " + str(round(att.loc[att["Geo_Name"] == "District of Columbia", "SE_T005_002"].values[0], 2)),
    icon=folium.Icon(icon="home", color="red"),
).add_to(b)
b

Reflection:

I chose to show only the contiguous mainland U.S. to keep the visualization clean and simple. Alaska and Hawaii differ from the other states in many ways (such as size and population density), so their crime rates are likely to be less comparable, and leaving them out reduces that source of variation. I also removed D.C. from the mapped data because it distorted the quantile ranges of the color scheme, and because D.C. is not visible without zooming in a great deal. D.C.'s high per-capita crime rate is likely because it has none of the rural areas that most states have, so its entirely urban landscape produces disproportionately high rates. I still wanted to include D.C. somehow, so I added a marker for it in the second plot, which lets us view its murder rate without it influencing the overall color scheme. I tried to choose red/orange color schemes because the data I am representing is negative (crime), so a darker shade of red signals a worse value for that state. Overall, I was able to address many of the limitations by removing the less relevant pieces of information, striking a balance between information and aesthetics.
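
The effect of a single extreme value on the color classes, as described in the reflection above, can be checked with a small sketch. This is only an illustration under stated assumptions: the rates below are hypothetical numbers, not values from crime.csv, and the helper functions `quantile_breaks` and `equal_interval_breaks` are made up for this example. They approximate the idea of quantile and equal-width classes with NumPy and are not the classifier that `scheme="Quantiles"` calls internally.

```python
import numpy as np

def quantile_breaks(values, k=5):
    """Upper edges of k quantile classes (the last edge is the maximum)."""
    probs = np.linspace(0, 1, k + 1)[1:]
    return np.quantile(np.asarray(values, dtype=float), probs)

def equal_interval_breaks(values, k=5):
    """Upper edges of k equal-width classes between the minimum and maximum."""
    values = np.asarray(values, dtype=float)
    return np.linspace(values.min(), values.max(), k + 1)[1:]

# Hypothetical rates per 100k people, NOT taken from crime.csv.
states = np.array([120.0, 200.0, 250.0, 300.0, 350.0,
                   400.0, 450.0, 500.0, 550.0, 600.0])
# Append one extreme, D.C.-like value (also hypothetical).
with_outlier = np.append(states, 1200.0)

for name, fn in [("quantile", quantile_breaks),
                 ("equal interval", equal_interval_breaks)]:
    print(name, "without outlier:", np.round(fn(states), 1))
    print(name, "with outlier:   ", np.round(fn(with_outlier), 1))
```

Comparing the printed edges shows how far each kind of class boundary moves when the extreme value is added. The interactive map passes no `scheme`, so, as far as I can tell, its colors follow a continuous scale from the smallest to the largest value, which behaves more like the equal-interval case than the quantile case.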