Quick Pandas question:
I'm cleaning up the values in individual columns of a DataFrame by using apply on a Series:
# For all values in column 'rate' over 1, divide by 100
df['rate'][df['rate']>1] = df['rate'][df['rate']>1].apply(lambda x: x/100)
This is fine when the selection criterion is simple, such as df['rate']>1. However, it gets very long when you start adding multiple selection criteria:
df['rate'][(df['rate']>1) & (~df['rate'].isnull()) & (df['rate_type']=='fixed') & (df['something']<= 'nothing')] = df['rate'][(df['rate']>1) & (df['rate_type']=='fixed') & (df['something']<= 'nothing')].apply(lambda x: x/100)
What's the most concise way to:
1. Split a column off (as a Series) from a DataFrame
2. Apply a function to the items of the Series
3. Update the DataFrame with the modified Series
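For illustration, here is a minimal sketch of one way the three steps above might be written: build the boolean mask once, then reuse it with .loc for both the selection and the assignment. The sample data is made up, and the column names are taken from the snippets above; this is one possible approach, not necessarily the most concise one.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data using the column names from the question.
df = pd.DataFrame({
    "rate": [5.0, 0.5, 250.0, np.nan, 12.0],
    "rate_type": ["fixed", "fixed", "variable", "fixed", "fixed"],
    "something": ["a", "b", "c", "d", "z"],
})

# Build the selection criteria once so they are not repeated on both sides.
mask = (
    (df["rate"] > 1)
    & df["rate"].notna()
    & (df["rate_type"] == "fixed")
    & (df["something"] <= "nothing")
)

# Select, transform, and write back in a single .loc assignment.
df.loc[mask, "rate"] = df.loc[mask, "rate"] / 100
```

For a transformation that cannot be written as a vectorized expression, the right-hand side could instead be df.loc[mask, "rate"].apply(some_function), where some_function is a placeholder for whatever needs to run on each item.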
I've tried using df.update(), but that didn't seem to work. I've also tried using the Series as a selector, e.g. isin(Series), but I wasn't able to get that to work either.
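For reference, a small sketch of how df.update() behaves with a modified Series. The attempted code isn't shown here, so this is only a guess at what might have gone wrong: update() changes the DataFrame in place and returns None, so assigning its result back to df would discard the data. The sample data below is made up.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "rate": [5.0, 0.5, 250.0],
    "rate_type": ["fixed", "fixed", "variable"],
})

mask = (df["rate"] > 1) & (df["rate_type"] == "fixed")

# The Series keeps the name "rate" and the index labels of the selected rows,
# which is what update() uses to align it with the DataFrame.
scaled = df.loc[mask, "rate"] / 100

# update() modifies df in place and returns None.
result = df.update(scaled)
```

After this runs, result is None and only the first row of df["rate"] has changed.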
Thank you!