I'm reading in a CSV file with pandas. It has a column titled 'Funding' that shows the total amount of funding each company received during its start-up phase. The column is stored as strings and looks similar to the following:
import pandas as pd
# create ID column ranging from 1 to 10
id_col = list(range(1, 11))
# create Funding column
funding_col = ['$5M', '$20M', '$5M', 'Unknown', '$20M', 'Unknown', 'Unknown', '$5M', '$20M', 'Unknown']
# create dictionary with column names as keys and column data as values
data = {'ID': id_col, 'Funding': funding_col}
# create DataFrame from dictionary
df = pd.DataFrame(data)
# print DataFrame
print(df)
My question is: How can I convert the 'Funding' column to integers, and how should I deal with the 'Unknown' values? Note that the funding amounts are expressed in $M (millions) or $B (billions).
I tried stripping the $ and the B or M suffix using .strip. I also tried to coerce the column to a numeric type using something similar to this:
df['Funding'] = pd.to_numeric(df['Funding'], errors='coerce')
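To make the goal concrete, here is a minimal sketch of one way the conversion could look. It assumes every known value has the form `$<number>M` or `$<number>B`, and that anything else (such as 'Unknown') should become a missing value. The `Funding_usd` column name and the `multipliers` table are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': list(range(1, 11)),
    'Funding': ['$5M', '$20M', '$5M', 'Unknown', '$20M',
                'Unknown', 'Unknown', '$5M', '$20M', 'Unknown'],
})

# Split each value into a numeric part and an M/B suffix. Rows that do not
# match the pattern (e.g. 'Unknown') come back as NaN in both columns.
parts = df['Funding'].str.extract(r'^\$(?P<amount>\d+(?:\.\d+)?)(?P<unit>[MB])$')

# Hypothetical multiplier table (assumption): M = million, B = billion.
multipliers = {'M': 1_000_000, 'B': 1_000_000_000}

amount = pd.to_numeric(parts['amount'], errors='coerce')
scale = parts['unit'].map(multipliers)

# The nullable Int64 dtype stores integers and keeps unmatched rows as <NA>.
df['Funding_usd'] = (amount * scale).round().astype('Int64')

print(df)
```

Using the nullable `Int64` dtype (capital I) is a design choice here: a plain `int64` column cannot hold missing values, so the 'Unknown' rows would otherwise force the column to float or require a sentinel such as 0.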