I am trying to add a new column to a dataframe that holds the word "Foo" when "foo" is found in "column1", but I want the value left empty when the word "bar" is also found. I tried adding & to the statement below, but it does not work.

import pandas as pd
import numpy as np

df = pd.read_csv('newdoc.csv')

df['new_column'] = np.where(df['column1'].str.contains("foo", case=False, na=False), 'Foo', '')
pyproper
  • Kindly provide a sample dataframe with your expected output. Use this as a guide: https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – sammywemmy Feb 02 '20 at 22:28
  • Your code worked on a simple dataframe I tried. – DarrylG Feb 02 '20 at 22:35

2 Answers

2

Have you tried writing a helper function and then using apply()?

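def check_str(mystr):
    # Empty cells are read as NaN (a float), so `in` raises TypeError for them.
    try:
        if 'foo' in mystr and 'bar' not in mystr:
            return 'match'
        else:
            return 'no match'
    except TypeError:
        return 'no match'

# 'column1' is the column name used in the question.
df['new_column'] = df['column1'].apply(check_str)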
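
A possible vectorized alternative to apply(), closer to the np.where statement in the question: build one boolean mask per word, then combine them with & and negate the "bar" mask with ~. This is only a sketch; the sample data is made up for illustration, with None standing in for an empty CSV cell.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; None stands in for an empty CSV cell.
df = pd.DataFrame({"column1": ["foo", "Foo and bar", "bar", None, "some FOO text"]})

# One boolean mask per condition; na=False treats empty cells as "not found".
has_foo = df["column1"].str.contains("foo", case=False, na=False)
has_bar = df["column1"].str.contains("bar", case=False, na=False)

# & is the element-wise AND and ~ the element-wise NOT for boolean Series.
df["new_column"] = np.where(has_foo & ~has_bar, "Foo", "")
```

Because na=False is passed to both calls, empty cells simply end up with an empty string, so no try/except is needed here.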
Matt L.
  • Thanks @Matt, this works for a CSV where all cells are filled, but I am getting the following error with a CSV that has some empty cells: "TypeError: argument of type 'float' is not iterable" – pyproper Feb 07 '20 at 16:27
  • Yeah, that's a good point. You can handle that error in the check_str function with a try/except statement. I'll edit. – Matt L. Feb 07 '20 at 20:47
0

It's just a matter of getting the regex right:

df["col1"] = df["x"].str.contains(r"^(?:(?<!bar).)*foo(?:.(?!bar))*$", regex=True)

For dummy data:

import pandas as pd

df = pd.DataFrame({"x": ["foo", "asdghbat", "cjjfoo hjgbar5", "fooba", "bar jjkdfhb foojgf"], "y": [2, 7, 4, 6, 3]})

# Non-capturing groups (?:...) match the same strings and avoid pandas' match-group warning.
df["col1"] = df["x"].str.contains(r"^(?:(?<!bar).)*foo(?:.(?!bar))*$", regex=True)

>>> df

                    x  y   col1
0                 foo  2   True
1            asdghbat  7  False
2      cjjfoo hjgbar5  4  False
3               fooba  6   True
4  bar jjkdfhb foojgf  3  False
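
One caveat worth checking before relying on this pattern: when "bar" sits directly next to "foo" (as in "foobar" or "barfoo"), the lookarounds never get to inspect it, so the string still matches. The sketch below compares the pattern above (written with non-capturing groups, which match the same strings) against a simpler negative lookahead that first rejects any string containing "bar". The test strings and variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical test strings, including "bar" directly adjacent to "foo".
s = pd.Series(["foobar", "barfoo", "foo bar", "fooba", "foo"])

# The pattern from this answer, with non-capturing groups (same matches).
lookaround = r"^(?:(?<!bar).)*foo(?:.(?!bar))*$"

# Alternative: fail if "bar" occurs anywhere, then require "foo" somewhere.
lookahead = r"^(?!.*bar).*foo"

by_lookaround = s.str.contains(lookaround, regex=True)
by_lookahead = s.str.contains(lookahead, regex=True)

print(pd.DataFrame({"x": s, "lookaround": by_lookaround, "lookahead": by_lookahead}))
```

In this sketch the first pattern still reports True for "foobar" and "barfoo", while the negative lookahead reports False for both; the two agree on the other three strings.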

Credits - adapted from: https://social.msdn.microsoft.com/Forums/en-US/19ee0964-06b4-4b00-808a-c5be756e0459/regex-that-includes-quotword-aquot-but-does-not-contain-quotword-bquot

Grzegorz Skibinski