
I have a large DataFrame from which I need to drop rows that don't satisfy a boolean criterion, and the criterion may involve several dozen columns.

The following works, but it relies on copying and pasting every column name:

df = df[~(  (df['FirstCol'] > df['SecondCol']) |
            (df['ThirdCol'] > df['FifthCol']) |
             ...
            (df['FiftiethCol'] > df['TweniethCol']) |
            (df['ThisCouldBeHundredsCol'] > df['LastOne'])
        )]

However, I want to do this with less code. Suppose I have the names of the columns to compare in a list, ordered in pairs, like so:

list_of_comparison_cols = ['FirstCol', 'SecondCol', 'ThirdCol', 'FifthCol', ..., 'FiftiethCol', 'TweniethCol', 'ThisCouldBeHundredsCol', 'LastOne']

How can I do this dynamically and with as little code as possible?

Many thanks.

  • Does this answer your question? [Pythonic way to check if a list is sorted or not](https://stackoverflow.com/questions/3755136/pythonic-way-to-check-if-a-list-is-sorted-or-not) – Acccumulation Jun 18 '20 at 19:21

1 Answer


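You can do it by slicing your list: [::2] takes every other element starting from the first, giving ['FirstCol', 'ThirdCol', ...], and [1::2] does the same starting from the second, giving ['SecondCol', 'FifthCol', ...]. Use each slice to select columns, convert both selections with to_numpy, and compare the two arrays elementwise. Comparing arrays rather than DataFrames matters here, because pandas refuses to compare two DataFrames whose column labels differ. Then use any over axis=1, which corresponds to the | used in your condition.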

# example
import numpy as np
import pandas as pd

list_of_comparison_cols = ['FirstCol', 'SecondCol', 'ThirdCol', 'FifthCol', 
                           'FiftiethCol', 'TweniethCol', 'ThisCouldBeHundredsCol', 
                           'LastOne']
np.random.seed(0)
df = pd.DataFrame(np.random.randint(0, 50, 8*10).reshape(10, 8), 
                  columns=list_of_comparison_cols)
# create the mask: True for rows where any left-hand column
# exceeds its paired right-hand column
mask = (df[list_of_comparison_cols[::2]].to_numpy()
        > df[list_of_comparison_cols[1::2]].to_numpy()
       ).any(axis=1)
# keep only the rows where no comparison is True
print(df[~mask])
   FirstCol  SecondCol  ThirdCol  FifthCol  FiftiethCol  TweniethCol  \
0        44         47         0         3            3           39   

   ThisCouldBeHundredsCol  LastOne  
0                       9       19  
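
If you need this in several places, the same idea can be wrapped in a small helper. The following is only a sketch: the function name drop_rows_any_greater and the toy frame with columns a, b, c, d are made up for illustration, and it assumes the list is laid out as [left0, right0, left1, right1, ...], as in the question. It also checks that the list has an even length, since an odd one would leave a column without a partner.

```python
import numpy as np
import pandas as pd


def drop_rows_any_greater(df, cols):
    """Drop rows where any left column exceeds its paired right column.

    `cols` is assumed to be a flat list [left0, right0, left1, right1, ...].
    (Hypothetical helper, not part of pandas.)
    """
    if len(cols) % 2 != 0:
        raise ValueError("cols must contain an even number of column names")
    left = df[cols[::2]].to_numpy()    # left-hand side of each comparison
    right = df[cols[1::2]].to_numpy()  # right-hand side of each comparison
    # True for rows where at least one left value is greater than its partner
    mask = (left > right).any(axis=1)
    return df[~mask]


# toy data, invented for this example
toy = pd.DataFrame({
    "a": [1, 5, 2],
    "b": [2, 1, 3],
    "c": [0, 0, 9],
    "d": [1, 1, 4],
})
kept = drop_rows_any_greater(toy, ["a", "b", "c", "d"])
print(kept.index.tolist())  # [0]
```

Only row 0 survives: row 1 fails a > b and row 2 fails c > d. Note that a comparison involving NaN evaluates to False, so a missing value never causes a row to be dropped by this mask.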
Ben.T