2

I have a dataframe. For some rows, I want to replace the values in several columns with a default value. Is there a way to do this with the pandas apply function?

Here is the dataframe:

import pandas as pd
temp=pd.DataFrame({'a':[1,2,3,4,5,6],'b':[2,3,4,5,6,7],'c':['p','q','r','s','t','u']})
mylist=['p','t']

How can I replace the values in columns a and b with a default value of 0 wherever the value in column c is in mylist?

Is there a way to do this using pandas functionality, avoiding for loops?

NG_21

2 Answers

3

Use isin to create a boolean mask and use loc to set the rows that meet the condition to the desired new value:

In [37]:
temp.loc[temp['c'].isin(mylist),['a','b']] = 0
temp

Out[37]:
   a  b  c
0  0  0  p
1  2  3  q
2  3  4  r
3  4  5  s
4  0  0  t
5  6  7  u

The result of the inner isin call is the boolean mask:

In [38]:
temp['c'].isin(mylist)

Out[38]:
0     True
1    False
2    False
3    False
4     True
5    False
Name: c, dtype: bool
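
For reuse, the same isin + loc pattern can be wrapped in a small helper. This is only a sketch: the function name `set_default_where` and its parameters are made up for illustration, and it works on a copy so the original frame is left untouched.

```python
import pandas as pd

def set_default_where(df, key_col, keys, target_cols, default=0):
    """Return a copy of df where target_cols are set to default
    on every row whose key_col value appears in keys.
    (Hypothetical helper, not part of pandas.)"""
    out = df.copy()
    mask = out[key_col].isin(keys)       # boolean Series, one entry per row
    out.loc[mask, target_cols] = default  # assign only on the masked rows
    return out

temp = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6],
                     'b': [2, 3, 4, 5, 6, 7],
                     'c': ['p', 'q', 'r', 's', 't', 'u']})
mylist = ['p', 't']

result = set_default_where(temp, 'c', mylist, ['a', 'b'])
print(result)
```

Because the helper copies the frame first, `temp` keeps its original values; drop the copy if in-place modification is what you want.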
EdChum
  • Thanks @EdChum. But does this work efficiently with around 500 columns and 1 million rows? – NG_21 Jul 21 '16 at 09:20
  • 1
    It should. Essentially, you're using the boolean mask to select the rows of interest and passing a list of the columns whose values you want to replace. This will be much faster than using `apply`. – EdChum Jul 21 '16 at 09:21
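
To make the comparison with `apply` concrete, here is a rough sketch that does the same replacement both ways and checks that the results agree. The names `reset_row`, `fast` and `slow` are illustrative only. The row-wise version calls a Python function once per row, which is why it tends to scale poorly on something like a million rows, while the mask version does a single vectorized assignment.

```python
import pandas as pd

temp = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6],
                     'b': [2, 3, 4, 5, 6, 7],
                     'c': ['p', 'q', 'r', 's', 't', 'u']})
mylist = ['p', 't']

# Vectorized: one boolean mask, one assignment.
fast = temp.copy()
fast.loc[fast['c'].isin(mylist), ['a', 'b']] = 0

# Row-wise apply: a Python-level function call for every row.
def reset_row(row):
    row = row.copy()              # avoid mutating the object apply passes in
    if row['c'] in mylist:
        row.loc[['a', 'b']] = 0
    return row

slow = temp.apply(reset_row, axis=1)

# Compare plain values so dtype differences do not matter.
same = slow.values.tolist() == fast.values.tolist()
print(same)
```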
1

A NumPy-based method is to use np.isin (np.in1d in older NumPy releases) to build the mask and use it like so:

import numpy as np
mask = np.isin(temp['c'], mylist)            # boolean array, True where c is in mylist
temp.loc[mask, temp.columns != 'c'] = 0      # every column except 'c'

This replaces the values in all columns except 'c'. To replace values only in specific columns, say 'a' and 'b', change the last line to:

temp.loc[mask, ['a', 'b']] = 0
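
A self-contained version of the NumPy mask approach might look like the following. It is a sketch for current library versions and makes two substitutions that are my assumptions rather than part of the answer as posted: `np.isin` to build the mask and `.loc` for the assignment. The name `other_cols` is also just illustrative.

```python
import numpy as np
import pandas as pd

temp = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6],
                     'b': [2, 3, 4, 5, 6, 7],
                     'c': ['p', 'q', 'r', 's', 't', 'u']})
mylist = ['p', 't']

# Row mask: True where the value in column 'c' appears in mylist.
mask = np.isin(temp['c'].to_numpy(), mylist)

# Column mask: True for every column except 'c'.
other_cols = temp.columns != 'c'

# Both axes are selected with boolean arrays.
temp.loc[mask, other_cols] = 0
print(temp)
```

With 500 columns, the column mask saves listing every column name by hand.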
Divakar