A pandas column contains the values 0.0(nan) and 0(nan). I want to get 0 in both cases. The code follows.

import pandas as pd
import re

df = pd.DataFrame.from_dict({'col1': ['0.0(nan)','0(nan)']})
# Raw strings avoid invalid-escape warnings for \( and \. in the patterns
df['col2'] = df['col1'].astype(str).apply(lambda x: re.sub(r'(.*?)\(nan\)', r'\1', re.sub(r'(.*?)\.0*\(nan\)', r'\1', x)))
print(df)

The output is below. I didn't know how to write one regex that handles both .0 and 0 before the (, which is why I nested one re.sub inside another. How can I do this with a single re.sub? Or is there another method? Thank you.

       col1 col2
0  0.0(nan)    0
1    0(nan)    0

Edit: following the comment from @mozway, a single re.sub works:

df['col2'] = df['col1'].astype(str).apply(lambda x: re.sub(r'(.*?)(?:\.0)?\(nan\)', r'\1', x))
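
As for "any other methods", one possibility is to skip apply and the capture group and just delete the unwanted suffix with pandas' vectorized Series.str.replace. This is only a sketch: the pattern below (an optional ".0", ".00", ... followed by "(nan)" at the end of the string) is my assumption about what the data looks like, and it has only been checked against the two sample values above.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['0.0(nan)', '0(nan)']})

# Assumed pattern: an optional run of ".0", ".00", ... then a literal "(nan)"
# at the end of the string. Everything matched is removed.
pattern = r'(?:\.0+)?\(nan\)$'

# regex=True is passed explicitly; newer pandas versions treat the
# pattern as a literal string by default.
df['col2'] = df['col1'].astype(str).str.replace(pattern, '', regex=True)
print(df)
```

Because the match is replaced with an empty string, no backreference is needed. A value with a non-zero fraction, such as 1.5(nan), would keep its fraction and lose only the (nan) part.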
warem
