1st colunn : Weapon
2nd column : Pepetrator_Age
What i am trying to find is which weapon is popular in which age.
For example i am trying to draw a similar graph like this:
For example y axis should be number of cases x axis age of Perpetrator
and lines are weapon type that Perpetrator used
You can copy paste this to jupyter to initialize dataset
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
data = pd.read_csv("hdb.csv", low_memory=False)
cols = data.columns
cols = cols.map(lambda x: x.replace(' ', '_'))
data.columns = cols
#clear the unnecessary data here
data = data.drop(['Agency_Code', 'Victim_Ethnicity', 'Agency_Name','Agency_Type', 'Perpetrator_Ethnicity', 'Victim_Count', 'Perpetrator_Count'], axis=1)
data = data[data.Perpetrator_Age != "0"]
data = data[data.Perpetrator_Age != ""]
data = data[data.Perpetrator_Age != " "]
data = data[data.Victim_Sex != "Unknown"]
data = data[data.Victim_Race != "Unknown"]
data = data[data.Perpetrator_Sex != "Unknown"]
data = data[data.Perpetrator_Race != "Unknown"]
data = data[data.Relationship != "Unknown"]
data = data[data.Weapon != "Unknown"]
data
Data set here: https://www.kaggle.com/jyzaguirre/us-homicide-reports