Why does this conditional lambda function not return the expected result?

Question

I am still new to Python and pandas, and I am working on improving a keyword assessment. My DataFrame looks like this:

Name  Description 
Dog   Dogs are in the house
Cat   Cats are in the shed
Cat   Categories of cats are concatenated

I am using a keyword list like this ['house', 'shed', 'in']

My lambda function looks like this

keyword_agg = lambda x: ' ,'.join x if x is not 'skip me' else None

I am using a function to identify and score each row for keyword matches

def foo (df, words):
    col_list = []
    key_list= []
    for w in words:
        pattern = w
        df[w] = np.where(df.Description.str.contains(pattern), 1, 0)
        df[w +'keyword'] = np.where(df.Description.str.contains(pattern), w, 
                          'skip me')
        col_list.append(w)
        key_list.append(w + 'keyword')
    df['score'] = df[col_list].sum(axis=1)
    df['keywords'] = df[key_list].apply(keyword_agg, axis=1)

The function adds a column named after each keyword and sets it to 1 or 0 depending on whether the description matches. It also adds a column named word + 'keyword' that holds the word itself on a match, or 'skip me' otherwise.

I am expecting the apply to work like this

df['keywords'] = df[key_list].apply(keyword_agg, axis=1)

Returns

Keywords
in, house
in, shed
None

Instead I am getting

Keywords
in, 'skip me' , house
in, 'skip me', shed
'skip me', 'skip me' , 'skip me'

Can someone help me understand why the 'skip me' strings are showing up when I am trying to exclude them?

`is not` is identity. you want `x != "skip me"` See [Why does comparing strings in Python using either '==' or 'is' sometimes produce a different result?](https://stackoverflow.com/questions/1504717/why-does-comparing-strings-in-python-using-either-or-is-sometimes-produce) — TemporalWolf, Jul 17 '17 at 21:07
First of all, why are you using `lambda` at all? You are assigning it to a name, thereby removing the *only advantage that `lambda` has*: that it is anonymous. Second, I'm pretty sure `keyword_agg = lambda x: ' ,'.join x if x is not 'skip me' else None` is a SyntaxError. — juanpa.arrivillaga, Jul 17 '17 at 21:20

Willem Van Onsem · Answer 1 · 2017-07-17T21:24:41.127

6

The `is` operator (and `is not`) checks reference equality, that is, whether both operands are the very same object.

You should use the equality operator, which for most built-in types checks value equality:

lambda x: ' ,'.join(x) if x != 'skip me' else None
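To make the identity-versus-equality point concrete, here is a minimal sketch. The names `a`, `b`, `same_value`, and `same_object` are made up for illustration, and the identity result relies on CPython creating a fresh string object for the `join` call; other cases (such as two identical literals) may well share one object, which is exactly why `is` is unreliable for comparing strings.

```python
# Two strings with the same contents, built in different ways.
a = 'skip me'
b = ''.join(['skip', ' ', 'me'])  # assembled at runtime, so a separate object in CPython

same_value = a == b    # compares contents
same_object = a is b   # compares object identity

print(same_value)   # True
print(same_object)  # False in CPython
```

Because a string read from a DataFrame cell is generally not the same object as a literal in your source code, `x is not 'skip me'` tends to be true even when the text matches.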

edited Jul 17 '17 at 21:24

answered Jul 17 '17 at 21:07

Pretty sure that is a SyntaxError. – juanpa.arrivillaga Jul 17 '17 at 21:19
@MSeifert actually, I don't think that is what the op wants. I think they actually want `' ,'.join(s for s in x if s != 'skip me')` – juanpa.arrivillaga Jul 17 '17 at 21:26
Hard to tell, though. Their example isn't really an MCVE. – juanpa.arrivillaga Jul 17 '17 at 21:26
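
If per-element filtering is what the question is after, the suggestion in the comments could be sketched roughly as below. This is an assumption about the intended behaviour, not a confirmed fix: the `SKIP` constant, the `', '` separator, and the plain lists standing in for DataFrame rows are all illustrative, and the keywords come out in the order of the question's keyword list rather than the order shown in the expected output.

```python
SKIP = 'skip me'  # placeholder value used in the question


def keyword_agg(row):
    """Join the real keywords in one row, dropping the placeholder."""
    kept = [s for s in row if s != SKIP]
    # Return None when nothing matched, as the question expects.
    return ', '.join(kept) if kept else None


# Plain lists standing in for the three rows of the *keyword columns
# (house, shed, in) from the question's example data.
rows = [
    ['house', SKIP, 'in'],
    [SKIP, 'shed', 'in'],
    [SKIP, SKIP, SKIP],
]
results = [keyword_agg(r) for r in rows]
print(results)  # ['house, in', 'shed, in', None]
```

Since iterating over a pandas Series yields its values, the same function should also work when passed to `df[key_list].apply(keyword_agg, axis=1)` as in the question.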