
I'm trying to subset a datatable Frame to the rows whose values are not in a list:

import functools
import operator
from datatable import dt, f
DT1 = dt.Frame(A = ['a', 'b', 'c', 'd'])
sel_rows = functools.reduce(operator.or_, (f.A != obs for obs in ['a', 'b']))
DT1[sel_rows, :]

However, this returns all the rows.

I'd expect only the rows with 'c' and 'd' in column A to be returned.

Why is everything returned? How do I edit this to get that behavior?

Rafael

1 Answer


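Solved it by changing operator.or_ to operator.and_. With or_, every row passes because any value differs from at least one of 'a' and 'b'. The rows to keep must differ from every value in the list, which is an AND of the inequalities: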

import functools
import operator
from datatable import dt, f
DT1 = dt.Frame(A = ['a', 'b', 'c', 'd'])
sel_rows = functools.reduce(operator.and_, (f.A != obs for obs in ['a', 'b']))
DT1[sel_rows, :]
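To see why the choice of operator matters, the same reduction can be reproduced on plain Python booleans, without datatable. This is only a sketch of the logic: the list `values` stands in for column A, and `keep_mask` is a hypothetical helper, not part of any library.

```python
import functools
import operator

values = ['a', 'b', 'c', 'd']   # stand-in for column A (assumption for illustration)
excluded = ['a', 'b']

def keep_mask(combine):
    """For each value, combine the tests `value != obs` over all excluded items."""
    return [
        functools.reduce(combine, (v != obs for obs in excluded))
        for v in values
    ]

or_mask = keep_mask(operator.or_)    # True if the value differs from ANY excluded item
and_mask = keep_mask(operator.and_)  # True if the value differs from ALL excluded items

print(or_mask)   # [True, True, True, True]
print(and_mask)  # [False, False, True, True]
```

'a' differs from 'b' and 'b' differs from 'a', so the OR is true for every row. Only the AND excludes both.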
Rafael
  • In my case the list of items is too large, so this approach crashes the Python kernel. My workaround is to create a temporary Frame with the list as a key and a column holding any value for all rows, then left join my Frame with the temporary Frame and filter out all rows where that column is not NA: https://stackoverflow.com/a/76980383/2404234 – Kanarsky Aug 26 '23 at 12:08