4

I have a DataFrame like this:

import pandas as pd

df = pd.DataFrame({
    'x': range(0, 5),
    'y': [[0, 2], [3, 4], [2, 3], [3, 4], [7, 9]]
})

I would like to test, for each row, whether the value in x is in the list held in column y on that same row:

df[df.x.isin(df.y)]

so I would end up with:

   x       y
0  0  [0, 2]
2  2  [2, 3]
3  3  [3, 4]

I am not sure why isin() does not work in this case.

Quang Hoang
PingPong

3 Answers

6

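df.x.isin(df.y) checks whether each element of x, e.g. 0, is equal to one of the values of df.y, and those values are whole lists: is 0 equal to [0,2]? No. And so on. It never compares an x with the elements of the list on its own row.

Knowing this, you can build the row-wise mask with a list comprehension: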

df[ [x in y for x,y in zip(df['x'], df['y'])] ]
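To make the row-wise pairing explicit, here is a minimal, self-contained sketch of the same idea using the question's data. The names mask and result are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'x': range(0, 5),
    'y': [[0, 2], [3, 4], [2, 3], [3, 4], [7, 9]],
})

# zip pairs each x with the list on the same row;
# `x in y` is then an ordinary Python membership test.
mask = [x in y for x, y in zip(df['x'], df['y'])]
print(mask)

# A boolean list of the same length as df selects the rows marked True.
result = df[mask]
print(result)
```

The mask has one entry per row, so it lines up with df positionally and can be passed straight into the indexer.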
Quang Hoang
4

Let us try explode, then select the matching rows of the original frame by index with loc:

out = df.loc[df.explode('y').query('x==y').index.unique()]
print(out)
   x       y
0  0  [0, 2]
2  2  [2, 3]
3  3  [3, 4]
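The one-liner above packs several steps together. A hedged sketch that unpacks them, using the question's data and a plain boolean comparison in place of query (the intermediate names exploded, matches and keep are illustrative only):

```python
import pandas as pd

df = pd.DataFrame({
    'x': range(0, 5),
    'y': [[0, 2], [3, 4], [2, 3], [3, 4], [7, 9]],
})

# explode gives one row per list element; each new row keeps
# the index label of the row it came from, so labels repeat.
exploded = df.explode('y')

# Keep the exploded rows where the scalar y equals x.
matches = exploded[exploded['x'] == exploded['y']]

# Deduplicate the surviving labels and use them to pick rows
# from the original frame, where y is still the full list.
keep = matches.index.unique()
out = df.loc[keep]
print(out)
```

Because the selection goes back to df through the index labels, this relies on df having a unique index.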
BENY
0

Just another solution:

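result = (
    df
    .assign(origin_y=df.y)               # keep a copy of the original lists
    .explode('y')                        # one row per list element
    .query('x==y')                       # keep rows where the element equals x
    .drop(columns=['y'])                 # drop the exploded scalar column
    .rename(columns={'origin_y': 'y'})   # restore the list column under its old name
)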

   x       y
0  0  [0, 2]
2  2  [2, 3]
3  3  [3, 4]
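A note on the rename step: in pandas, DataFrame.rename called with a bare dict maps index labels by default, so renaming a column needs columns= (or axis=1). A minimal sketch with a hypothetical two-row frame named small:

```python
import pandas as pd

# Hypothetical frame used only to illustrate rename's default axis.
small = pd.DataFrame({'x': [0, 1], 'origin_y': [[0, 2], [3, 4]]})

# A bare dict targets the index; no index label is 'origin_y',
# so nothing is renamed and no error is raised.
by_index = small.rename({'origin_y': 'y'})
print(list(by_index.columns))

# columns= targets the column labels.
by_columns = small.rename(columns={'origin_y': 'y'})
print(list(by_columns.columns))
```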
elouassif