Can anyone help me understand why a list comprehension gives different results when I change a Series to a list?

import pandas as pd

ser1 = pd.Series([1, 2, 3, 4, 5])
ser2 = pd.Series([4, 5, 6, 7, 8])
[i for i in ser1 if i not in ser2]
# the output is [5]

But if I loop over lists inside the list comprehension instead, I get the result I want:

l1 = [1, 2, 3, 4, 5]
l2 = [4, 5, 6, 7, 8]
[i for i in l1 if i not in l2]
# the output is [1, 2, 3]

Why does the Series version give the wrong answer?

Thanks in advance.

Fananan

1 Answer

For a pandas Series, the `in` operator checks the keys (the index labels), not the values...
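To make that concrete, here is a minimal sketch using the two Series from the question. The variable names `index_labels`, `by_index`, `by_values` and `by_isin` are just illustrative; the `.values` and `isin` variants are two common ways to test against the contents rather than the index, not the only ones.

```python
import pandas as pd

ser1 = pd.Series([1, 2, 3, 4, 5])
ser2 = pd.Series([4, 5, 6, 7, 8])

# No index was given, so ser2 gets the default labels 0, 1, 2, 3, 4.
index_labels = list(ser2.index)

# `i not in ser2` asks "is i an index label of ser2?".
# 1, 2, 3 and 4 are labels; 5 is not, so only 5 survives.
by_index = [i for i in ser1 if i not in ser2]

# Testing against the underlying values gives the list-like behaviour.
by_values = [i for i in ser1 if i not in ser2.values]

# A vectorised alternative: keep the rows of ser1 whose value is not in ser2.
by_isin = ser1[~ser1.isin(ser2)].tolist()

print(index_labels, by_index, by_values, by_isin)
```

For anything beyond a toy example the `isin` form is probably the better fit, since it avoids a Python-level loop.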

Ah, someone just posted a link to an extensive answer; I won't recreate it here

However, one further note: depending on the situation, another way to get a similar result is with sets:

s1 = {1, 2, 3, 4, 5}
s2 = {4, 5, 6, 7, 8}
s1 - s2
# answer is {1, 2, 3} in arbitrary order; may be shuffled
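
If the original order matters, one possible variation (a sketch, not part of the answer above) is to keep the list comprehension and use a set only for the membership test. The names `exclude` and `result` are illustrative.

```python
l1 = [1, 2, 3, 4, 5]
l2 = [4, 5, 6, 7, 8]

# Build the lookup set once; membership tests on a set are fast on average.
exclude = set(l2)

# Iterating over l1 keeps its order (and any duplicates it contains).
result = [i for i in l1 if i not in exclude]
print(result)
```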
Jiří Baum
  • Thanks a lot for your response. But the result is [5] when using Series. If the `in` operator checks the index, shouldn't `[i for i in ser1 if i not in ser2]` be empty? – Fananan Dec 18 '21 at 13:04
  • The default index is numbers from zero; in this case, `0, 1, 2, 3, 4`, so `5` is the only value of `ser1` that is not in it – Jiří Baum Dec 19 '21 at 04:17