3

I am working with an excel file which I read into python as a pandas dataframe. One of the columns contains responses of how many hours a person slept. A sample column is as follows:

df['Sleep'] = [1, 2, 3, 'Blank', 4, 'Blank', '5`1/2', '`3/4']

My objective is to clean this data and get it all into a single datatype with NaN for Blanks. The blanks were taken care of using:

df['Sleep'] = df.['Sleep'].replace('Blank',np.nan)

My question is how can I convert something like 5`1/2 to 5.5? All fractions in the dataset start with the backtick symbol.

aalhendi
  • 41
  • 5
  • Similar, but not the same question: https://stackoverflow.com/questions/1806278/convert-fraction-to-float – Stefan Oct 13 '20 at 07:59

3 Answers3

2

We have to use loc with fillna (because you have mixed types) and pd.eval

m = df['Sleep'].str.contains('`', na=False)
df.loc[m, 'Sleep'] = df.loc[m, 'Sleep'].str.replace('`', '+').apply(pd.eval)
df['Sleep'] = pd.to_numeric(df['Sleep'], errors='coerce')

   Sleep
0      1
1      2
2      3
3  Blank
4      4
5  Blank
6    5.5
7   0.75
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
Erfan
  • 40,971
  • 8
  • 66
  • 78
  • This works however the resulting value is still a string. Is there a way to have it be returned and saved as float? Or do I have to loop through and convert? – aalhendi Oct 13 '20 at 08:19
  • @aalhendi - add `df['Sleep'] = pd.to_numeric(df['Sleep'], errors='coerce')` – jezrael Oct 13 '20 at 08:23
1

Like this:

df[df['Sleep'].str.contains("`")] = eval(df['Sleep'].str.replace("`","+"))
Wasif
  • 14,755
  • 3
  • 14
  • 34
1
def convert_to_float(frac_str):
    try:
        return float(frac_str)
    except ValueError:
        num, denom = frac_str.split('/')
        try:
            leading, num = num.split('`')
            whole = float(leading)
        except ValueError:
            whole = 0
        frac = float(num) / float(denom)
        return whole - frac if whole < 0 else whole + frac

 df["Sleep"] = df["Sleep"].apply(lambda x: convert_to_float(x))
jgrt
  • 220
  • 2
  • 6