Converting String formatted Column into Integer

Question

I'm reading a CSV file with pandas. It has a column titled 'Funding' showing the total amount of funding each company received during its start-up phase. The column is stored as strings and looks similar to the following:

import pandas as pd

# create ID column ranging from 1 to 10
id_col = list(range(1, 11))

# create Funding column
funding_col = ['$5M', '$20M', '$5M', 'Unknown', '$20M', 'Unknown', 'Unknown', '$5M', '$20M', 'Unknown']

# create dictionary with column names as keys and column data as values
data = {'ID': id_col, 'Funding': funding_col}

# create DataFrame from dictionary
df = pd.DataFrame(data)

# print DataFrame
print(df)

My question is: How can I convert the 'Funding' column into an integer? How do I deal with the 'Unknown' values? Note that the funding amounts are given in millions ($M) or billions ($B).

I tried stripping the $ and the B or M using the .strip function. I also tried to coerce the column into a numeric format using something similar to this:

df['A'] = pd.to_numeric(df['A'], errors='coerce')

coercing generated NaN values for the entire 'Funding' column — zacramer, May 12 '23 at 00:21
Welcome to Stack Overflow! Check out the [tour] and [How to ask a good question](/help/how-to-ask) for tips. Have you done any research? I did a vague google, `pandas convert number suffix`, and found this, which looks helpful: [Convert the string 2.90K to 2900 or 5.2M to 5200000 in pandas dataframe](https://stackoverflow.com/q/39684548/4518341) — wjandrea, May 12 '23 at 00:27
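
For what it's worth, here is a minimal sketch of one way the conversion could go. It is not taken from the linked question and is not a definitive answer. The likely reason `pd.to_numeric(..., errors='coerce')` produced NaN everywhere is that strings such as '$5M' cannot be parsed as numbers, so every value gets coerced. The sketch instead splits each value into a number and a suffix, multiplies by the matching factor, and stores the result in a nullable integer column so that 'Unknown' becomes a missing value. The names `multipliers`, `parts`, `amount`, and `Funding_int` are made up for illustration, and the regex assumes every known value looks like `$<number>M` or `$<number>B`.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'ID': list(range(1, 11)),
    'Funding': ['$5M', '$20M', '$5M', 'Unknown', '$20M', 'Unknown',
                'Unknown', '$5M', '$20M', 'Unknown'],
})

# Assumed suffix meanings: M = million, B = billion.
multipliers = {'M': 1_000_000, 'B': 1_000_000_000}

# Capture the numeric part and the suffix. Rows that do not match the
# pattern (such as 'Unknown') come back as missing in both columns.
parts = df['Funding'].str.extract(r'^\$(\d+(?:\.\d+)?)([MB])$')

# Numeric part times the factor for its suffix; missing stays missing.
amount = pd.to_numeric(parts[0], errors='coerce') * parts[1].map(multipliers)

# The nullable 'Int64' dtype holds integers alongside missing values (<NA>).
df['Funding_int'] = amount.round().astype('Int64')

print(df)
```

If a plain `int64` column is required, the missing values have to be filled first (for example with `fillna(0)`), since that dtype cannot represent a missing entry. Whether 0 is a sensible stand-in for 'Unknown' depends on how the column is used later.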
