Given the following data frame:
State,City,Population,Poverty_Rate,Median_Age,
VA,XYZ,.,10.5%,42,
MD,ABC,"12,345",8.9%,.,
NY,.,"987,654",.,41,
...
import pandas as pd
df = pd.read_csv("/path... /sample_data")
df.dtypes
returns:
State object
City object
Population object
Poverty_Rate object
Median_Age object
I attempted to convert the appropriate columns to int or float:
df = df.astype({"Population": int, "Poverty_Rate": float, "Median_Age": int})
I received the following error:
ValueError: invalid literal for int() with base 10: '12,345'
I suspect the comma used as a thousands separator is causing this problem. How can I remove the commas from my dataset?
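For illustration, here is a minimal sketch of one possible cleanup. It rests on several assumptions: the "." entries mark missing values, the comma-containing populations are quoted in the file, and the trailing commas are dropped from the sample. The inline `sample` buffer is a hypothetical stand-in for the real file. Because a plain int column cannot hold missing values, the sketch uses pandas' nullable "Int64" dtype instead.

```python
import io
import pandas as pd

# Hypothetical inline copy of the sample rows (assumes quoted populations,
# no trailing commas); replace with the real file path in practice.
sample = io.StringIO(
    'State,City,Population,Poverty_Rate,Median_Age\n'
    'VA,XYZ,.,10.5%,42\n'
    'MD,ABC,"12,345",8.9%,.\n'
    'NY,.,"987,654",.,41\n'
)

# Read everything as text and treat "." as a missing value (assumption).
df = pd.read_csv(sample, dtype=str, na_values=".")

# Remove the thousands separators, then convert to a nullable integer column.
population = df["Population"].str.replace(",", "", regex=False)
df["Population"] = pd.to_numeric(population).astype("Int64")

# Strip the trailing percent sign; values stay in percent units (e.g. 10.5).
df["Poverty_Rate"] = pd.to_numeric(df["Poverty_Rate"].str.rstrip("%"))

# Median_Age has no separators, but it still contains missing values.
df["Median_Age"] = pd.to_numeric(df["Median_Age"]).astype("Int64")

print(df.dtypes)
```

The explicit `str.replace` step is what removes the commas. `read_csv` also accepts a `thousands=","` argument, which may make that step unnecessary when the separators are consistent throughout the file.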