Context / Question

Say I have a DataFrame like below where col2 is a string.

df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': ['7.7/10', '8.2/10', '5.8/10', '9.2/10', '8.9/10']})

What's the best way to convert the string values of col2 to numeric values?

E.g. '7.7/10' should become 0.77.

Tried

I have tried the pd.to_numeric() function, but since the column values contain a /, I don't think it works:

df.col2 = pd.to_numeric(df.col2, downcast='float')
Curious

1 Answer

If only divisions like the ones in the example are involved, you could do:

import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': ['7.7/10', '8.2/10', '5.8/10', '9.2/10', '8.9/10']})


def fun(x):
    # Split "numerator/denominator" and divide the two parts as floats
    a, b = x.split("/")
    return float(a) / float(b)


res = df["col2"].apply(fun)
print(res)

Output

0    0.77
1    0.82
2    0.58
3    0.92
4    0.89
Name: col2, dtype: float64
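
The same split-and-divide idea can also be written with pandas string methods instead of a Python-level function. This is only a sketch, assuming every value has exactly the form "number/number"; the intermediate name parts is just illustrative:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4, 5],
                   'col2': ['7.7/10', '8.2/10', '5.8/10', '9.2/10', '8.9/10']})

# Split each string on "/" into two columns (0 and 1), then convert both to float.
# Assumption: every row contains exactly one "/" with a number on each side.
parts = df["col2"].str.split("/", expand=True).astype(float)

# Divide the numerator column by the denominator column element-wise.
res = parts[0] / parts[1]
print(res)
```

If some rows might be malformed, the astype(float) call will raise, so this version is best suited to data that is known to be clean.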

As an alternative, if more complex operations are involved you could use numexpr.evaluate:

import numexpr
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': ['7.7/10', '8.2/10', '5.8/10', '9.2/10', '8.9/10']})
res = df["col2"].apply(numexpr.evaluate)
print(res)

Output

0    0.77
1    0.82
2    0.58
3    0.92
4    0.89
Name: col2, dtype: float64

Note that numexpr is a third-party module that needs to be installed. Finally, as a last resort, and only if you trust the source of the data, you could use eval:

import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': ['7.7/10', '8.2/10', '5.8/10', '9.2/10', '8.9/10']})
res = df["col2"].apply(eval)
print(res)

A safer alternative to the evil eval can be found here.
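
To give an idea of what a restricted evaluator might look like, here is a minimal sketch built on the standard-library ast module. It is not necessarily the approach from the link above; the function name safe_arith and the choice of allowed operators are my own assumptions:

```python
import ast
import operator

# Whitelist of binary operators the evaluator accepts (an assumption;
# extend or shrink it to match your data).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def safe_arith(expr):
    """Evaluate a string made only of numbers and + - * / (hypothetical helper)."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Plain numeric literal (bool is excluded even though it subclasses int)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        # Binary operation with a whitelisted operator
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        # Unary minus, e.g. "-1/2"
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        # Anything else (names, calls, attribute access, ...) is rejected
        raise ValueError(f"unsupported expression: {expr!r}")

    return _eval(ast.parse(expr, mode="eval"))
```

It could then be used the same way as the examples above, e.g. df["col2"].apply(safe_arith). Anything other than numeric literals and the listed operators, such as a function call, raises ValueError instead of being executed.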

Dani Mesejo