Get list operations as new columns in pandas when the columns contain lists or strings

Question

Assume this df

import pandas as pd

df = pd.DataFrame({'a':['texttext',[1,2,3,4,5],[2,3,4,5]],
                   'b':['texttext',[1,2,5,8,9,10],[2,3,5]]})

I would like to add three extra columns containing: a) the values common to both lists (their intersection), b) the values in the list in column a that are not in column b, and c) the values in the list in column b that are not in column a.

Note that the df might contain non-list values (as in the first row), which complicates things.

I know how to do these operations on plain lists:

common = [x for x in lst1 if x in lst2]
minus = [x for x in lst1 if x not in lst2]
plus = [x for x in lst2 if x not in lst1]

But I am not able to figure out how to implement this in pandas. Even with a function for .apply, I have to pass two values, and in a one-liner I have to check the type.

Any ideas?

Thanks a lot

EDIT: Expected output:

expected = pd.DataFrame({'a':['texttext',[1,2,3,4,5],[2,3,4,5]],
                         'b':['texttext',[1,2,5,8,9,10],[2,3,5]],
                         'common':['',[1,2,5],[2,3,5]],
                         'minus':['',[3,4],[4]],
                         'plus':['',[8,9,10],[]]})

Your table is weird because first line is a string, and the others are list. — igorkf, Mar 04 '21 at 17:02

score 0 · Answer 1 · answered Mar 04 '21 at 16:59

To use two columns as input to the apply function, you can check this question. To check whether an object is a list, use the builtin

isinstance(your_variable,list)

These should be the right building blocks for your problem.
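As a minimal sketch of how those two pieces might fit together (the helper name common_values is hypothetical, and only the 'common' column is shown), a row-wise apply with an isinstance guard could look like this:

```python
import pandas as pd

df = pd.DataFrame({'a': ['texttext', [1, 2, 3, 4, 5], [2, 3, 4, 5]],
                   'b': ['texttext', [1, 2, 5, 8, 9, 10], [2, 3, 5]]})

def common_values(row):
    """Hypothetical helper: values of row['a'] also found in row['b'], or '' for non-list rows."""
    a, b = row['a'], row['b']
    if isinstance(a, list) and isinstance(b, list):
        return [x for x in a if x in b]
    return ''

# axis=1 hands each row to the function as a Series, so both columns are available at once
df['common'] = df.apply(common_values, axis=1)
```

The same pattern would apply to the 'minus' and 'plus' columns by swapping the list comprehension.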

Arthur Tondereau


score 0 · Accepted Answer · answered Mar 04 '21 at 17:45

Let us define a generator function that compares each list in column a with the corresponding list in column b:

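def test_membership():
    # Walk both columns in parallel, one pair of cells per row
    for a, b in zip(df['a'], df['b']):
        if isinstance(a, list) and isinstance(b, list):
            # Sets drop duplicates and do not guarantee element order
            a, b = set(a), set(b)
            # Intersection, values only in a, values only in b
            yield list(a & b), list(a - b), list(b - a)
        else:
            # Non-list cells (e.g. strings) get empty placeholders
            yield '', '', ''

# Each yielded tuple fills one row of the three new columns
df[['common', 'minus', 'plus']] = list(test_membership())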

                 a                    b     common   minus        plus
0         texttext             texttext                               
1  [1, 2, 3, 4, 5]  [1, 2, 5, 8, 9, 10]  [1, 2, 5]  [3, 4]  [8, 9, 10]
2     [2, 3, 4, 5]            [2, 3, 5]  [2, 3, 5]     [4]          []
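If the original element order or duplicates matter, a variation that keeps the question's list comprehensions may be preferable. The sketch below is one possible way to do that, not part of the answer above: the helper name list_ops is hypothetical, the list elements are assumed to be hashable, and the columns are assigned one at a time.

```python
import pandas as pd

def list_ops(a, b):
    """Hypothetical helper: return (common, minus, plus), or three empty strings for non-lists."""
    if not (isinstance(a, list) and isinstance(b, list)):
        return '', '', ''
    set_a, set_b = set(a), set(b)  # sets are used only for fast membership tests
    common = [x for x in a if x in set_b]      # order of a is preserved
    minus = [x for x in a if x not in set_b]   # in a but not in b
    plus = [x for x in b if x not in set_a]    # in b but not in a
    return common, minus, plus

df = pd.DataFrame({'a': ['texttext', [1, 2, 3, 4, 5], [2, 3, 4, 5]],
                   'b': ['texttext', [1, 2, 5, 8, 9, 10], [2, 3, 5]]})

rows = [list_ops(a, b) for a, b in zip(df['a'], df['b'])]

# Assign one column at a time so each cell receives a whole list
for i, name in enumerate(['common', 'minus', 'plus']):
    df[name] = [row[i] for row in rows]
```

Because the helper takes two plain values instead of reading a global df, it can also be checked on its own, for example list_ops([1, 2, 3], [2, 3, 4]).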
