16

I have the following dataframe:

     col
0    pre
1    post
2    a
3    b
4    post
5    pre
6    pre

I want every row in the dataframe that does not contain 'pre' to become 'nonpre', so the dataframe looks like:

     col
0    pre
1    nonpre
2    nonpre
3    nonpre
4    nonpre
5    pre
6    pre

I can do this using a dictionary and pandas replace; however, I want to just select the elements which are not 'pre' and replace them with 'nonpre'. Is there a better way to do that without listing all possible col values in a dictionary?

user308827

2 Answers

27

As long as you're comfortable with the df.loc[condition, column] syntax that pandas allows, this is very easy. Use df['col'] != 'pre' as the condition to find all rows that should be changed:

df['col2'] = df['col']
df.loc[df['col'] != 'pre', 'col2'] = 'nonpre'

df
Out[7]: 
    col    col2
0   pre     pre
1  post  nonpre
2     a  nonpre
3     b  nonpre
4  post  nonpre
5   pre     pre
6   pre     pre
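
If the goal is to overwrite the original column rather than add col2, the same mask can be applied to 'col' directly. The following is a minimal, self-contained sketch under that assumption; the names mask and alt are only illustrative, and Series.where is shown as one possible alternative that returns a new Series instead of modifying the dataframe.

```python
import pandas as pd

# Rebuild the dataframe from the question
df = pd.DataFrame({'col': ['pre', 'post', 'a', 'b', 'post', 'pre', 'pre']})

# Alternative without in-place assignment: keep values where the
# condition holds, and use 'nonpre' everywhere else
alt = df['col'].where(df['col'] == 'pre', 'nonpre')

# Boolean mask: True for every row whose value is not exactly 'pre'
mask = df['col'] != 'pre'

# Overwrite only the matching rows of 'col' in place
df.loc[mask, 'col'] = 'nonpre'

print(df)
```

Both approaches avoid listing every possible value of col in a dictionary, since the condition only names the one value to keep.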
Marius
  • Thanks! Is there any issue with using .loc that I should be wary of? – user308827 Nov 25 '14 at 02:49
  • No, `.loc` is basically what you should be trying first when you want to get at a particular set of rows and columns in your dataframe. Not sure if you have experience with R, but it works very similarly to the subsetting syntax for R dataframes. – Marius Nov 25 '14 at 02:51
6
df[df['col'].apply(lambda x: 'pre' not in x)] = 'nonpre'
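
Two things about this one-liner are worth keeping in mind. First, 'pre' not in x is a substring test, so a value such as 'prefix' is treated as containing 'pre' and is left alone, unlike with the exact comparison df['col'] != 'pre'. Second, indexing the dataframe with a boolean mask and assigning a scalar writes to every column of the selected rows, which is harmless for the single-column frame in the question but probably not wanted otherwise. The sketch below uses a made-up two-column frame (the 'prefix' value and the 'other' column are assumptions for illustration) and restricts the assignment to 'col' with .loc.

```python
import pandas as pd

# Hypothetical data: 'prefix' and the 'other' column are not in the question
df = pd.DataFrame({'col': ['pre', 'post', 'prefix', 'b'],
                   'other': [1, 2, 3, 4]})

# Substring test, as in the one-liner: True where 'pre' does not occur
# anywhere in the string, so 'prefix' yields False and is kept
mask = df['col'].apply(lambda x: 'pre' not in x)

# Limit the assignment to 'col' so that 'other' is left untouched
df.loc[mask, 'col'] = 'nonpre'

print(df)
```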
Mike