Is there a more sophisticated way to check whether a dataframe df contains two columns named Column 1 and Column 2 than the following?
if numpy.all(map(lambda c: c in df.columns, ['Column 1', 'Column 2'])):
    do_something()
I know it's an old post...
From this answer:
if set(['Column 1', 'Column 2']).issubset(df.columns):
    do_something()
or, a little more elegantly, with a set literal:
if {'Column 1', 'Column 2'}.issubset(df.columns):
    do_something()
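Building on the set approach, a set difference can also report which columns are missing, not just whether the check passes. This is a minimal sketch: the helper name missing_columns and the toy DataFrame are assumptions for illustration and are not part of the original answer.

```python
import pandas as pd

def missing_columns(df, required):
    """Return the required column names that are absent from df, sorted.

    Hypothetical helper for illustration only.
    """
    # Set difference keeps only the names not found among df.columns.
    return sorted(set(required) - set(df.columns))

# Toy data, assumed for the example.
df = pd.DataFrame({'Column 1': [1, 2], 'Column 2': [3, 4]})

print(missing_columns(df, ['Column 1', 'Column 2']))  # []
print(missing_columns(df, ['Column 1', 'Column 3']))  # ['Column 3']
```

An empty list means every required column is present, so the result can be used directly in an if statement.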
You can use Index.isin:
import pandas as pd

df = pd.DataFrame({'A':[1,2,3],
                   'B':[4,5,6],
                   'C':[7,8,9],
                   'D':[1,3,5],
                   'E':[5,3,6],
                   'F':[7,4,3]})
print (df)
   A  B  C  D  E  F
0  1  4  7  1  5  7
1  2  5  8  3  3  4
2  3  6  9  5  6  3
If you need to check that at least one value is present, use any:
cols = ['A', 'B']
print (df.columns.isin(cols).any())
True
cols = ['W', 'B']
print (df.columns.isin(cols).any())
True
cols = ['W', 'Z']
print (df.columns.isin(cols).any())
False
If you need to check all values, use all:
cols = ['A', 'B', 'C','D','E','F']
print (df.columns.isin(cols).all())
True
cols = ['W', 'Z']
print (df.columns.isin(cols).all())
False
The one issue with the given answer (and maybe it works for the OP) is that it tests to see if all of the dataframe's columns are in a given list - but not that all of the given list's items are in the dataframe columns.
My solution was:
test = all([ i in df.columns for i in ['A', 'B'] ])
where test is a simple True or False.
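To make the difference in direction concrete, here is a small sketch comparing the two checks on the same data. The DataFrame and the variable names are assumptions chosen for illustration.

```python
import pandas as pd

# Toy data, assumed for the example: df has one more column than the list.
df = pd.DataFrame({'A': [1], 'B': [2], 'C': [3]})
cols = ['A', 'B']

# Are all of df's columns in cols? No, because 'C' is not in cols.
all_df_cols_in_list = bool(df.columns.isin(cols).all())

# Are all items of cols among df's columns? Yes.
all_list_items_in_df = all(c in df.columns for c in cols)

print(all_df_cols_in_list)   # False
print(all_list_items_in_df)  # True
```

The two checks only agree when the list and the columns contain exactly the same names.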
To check that the items of a list exist among a dataframe's columns while still using isin, you can do the following:
col_list = ['A', 'B']
pd.Index(col_list).isin(df.columns).all()
As explained in the accepted answer, .all() checks whether all items in col_list are present in the columns, while .any() tests for the presence of any of them.
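The boolean mask returned by isin can also be inspected directly to see which items are absent. A minimal sketch, assuming a toy DataFrame and a col_list with one name that does not exist:

```python
import pandas as pd

# Toy data, assumed for the example.
df = pd.DataFrame({'A': [1], 'B': [2], 'C': [3]})
col_list = ['A', 'W']

# One boolean per item of col_list: True where the item is a column of df.
mask = pd.Index(col_list).isin(df.columns)

print(mask.all())  # False: 'W' is not a column
print(mask.any())  # True: 'A' is a column

# Names from col_list that are not columns of df.
absent = [c for c, present in zip(col_list, mask) if not present]
print(absent)      # ['W']
```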