I'm trying to filter a dataframe based on the values in multiple columns, using a single condition, while keeping the other columns that the filter should not apply to.
I've reviewed these answers, with the third being the closest, but still no luck:
- how do you filter pandas dataframes by multiple columns
- Filtering multiple columns Pandas
- Python Pandas - How to filter multiple columns by one value
Setup:
import pandas as pd
df = pd.DataFrame({
    'month': [1, 1, 1, 2, 2],
    'a': ['A', 'A', 'A', 'A', 'NONE'],
    'b': ['B', 'B', 'B', 'B', 'B'],
    'c': ['C', 'C', 'C', 'NONE', 'NONE']
}, columns = ['month', 'a', 'b', 'c'])
l = ['month','a','c']
df = df.loc[df['month'] == df['month'].max(), df.columns.isin(l)].reset_index(drop = True)
Current Output:
   month     a     c
0      2     A  NONE
1      2  NONE  NONE
Desired Output:
   month     a
0      2     A
1      2  NONE
I've tried:
sub = l[1:]
df = df[(df.loc[:, sub] != 'NONE').any(axis = 1)]
and many other variations (.all(), [sub, :], ~df.loc[...], axis = 0), but all with no luck.
Basically I want to drop any column (within the sub list) that has all 'NONE' values in it.
Any help is much appreciated.
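For reference, here is a minimal sketch of the behaviour described above. It assumes the row filter on the latest month is applied first, and that only the columns in sub should be checked for being entirely 'NONE'. The names out and all_none are just illustrative. Note that the attempt with .any(axis = 1) builds a mask over rows, whereas the goal here is a decision per column.

```python
import pandas as pd

df = pd.DataFrame({
    'month': [1, 1, 1, 2, 2],
    'a': ['A', 'A', 'A', 'A', 'NONE'],
    'b': ['B', 'B', 'B', 'B', 'B'],
    'c': ['C', 'C', 'C', 'NONE', 'NONE'],
}, columns=['month', 'a', 'b', 'c'])

l = ['month', 'a', 'c']
sub = l[1:]  # columns that may be dropped; 'month' is always kept

# Keep only the rows for the latest month and the columns listed in l.
out = df.loc[df['month'] == df['month'].max(), l].reset_index(drop=True)

# Collect the columns in sub whose remaining values are all 'NONE'.
all_none = [col for col in sub if (out[col] == 'NONE').all()]

# Drop those columns; columns outside sub are never examined.
out = out.drop(columns=all_none)
print(out)
```

With the sample data this drops c and keeps month and a, which matches the desired output.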