5

I have a dataframe that holds a number of NoneType values, and I would like to drop all columns where all the row values AND the header are None. I am struggling to find a way to do this. In the MWE below I have managed to either drop all columns where all the rows are None OR drop all columns where the header is None.

from __future__ import annotations

import pandas as pd

d = [[1, 2, None, None, None], [4, 5, None, None, 7]]
cols = ['a', 'b', 'c', None, None]
df = pd.DataFrame(data=d, columns=cols)

print("Original: \n", df)
#Original: 
#    a  b     c   NaN  NaN
#0  1  2  None  None  NaN
#1  4  5  None  None  7.0

print("\nDropped how = all: \n", df.dropna(axis=1, how="all"))    # Drops column 'c'
#Dropped how = all: 
#    a  b  NaN
#0  1  2  NaN
#1  4  5  7.0

print("\nDropped None columns: \n", df[df.columns.dropna()])
#Dropped None columns: 
#    a  b     c
#0  1  2  None
#1  4  5  None

How can I drop only the columns where both the values and the header are None, and get this?

#Wanted: 
#    a  b     c  NaN
#0  1  2  None   NaN
#1  4  5  None   7.0
Kajsa
  • Does this answer your question? [Pandas: drop columns with all NaN's](https://stackoverflow.com/questions/45147100/pandas-drop-columns-with-all-nans) – meph Jan 13 '20 at 16:32

3 Answers

8

You can combine the two conditions with &, invert the result with ~, and select with .loc[]:

df.loc[:,~(df.columns.isna() & df.isna().all())]

   a  b     c  NaN
0  1  2  None  NaN
1  4  5  None  7.0
anky
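To see why this keeps the right columns, here is a minimal sketch of the two intermediate masks, rebuilding the question's frame so the snippet runs on its own (the mask variable names are illustrative only):

import pandas as pd

df = pd.DataFrame(
    data=[[1, 2, None, None, None], [4, 5, None, None, 7]],
    columns=['a', 'b', 'c', None, None],
)

header_is_na = df.columns.isna()   # True where the header is None: the last two columns
all_values_na = df.isna().all()    # True where every cell is None: 'c' and the first None-named column

# A column should go only when BOTH are True, so invert the AND to keep the rest
print(df.loc[:, ~(header_is_na & all_values_na)])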
5

Here's one way: check notna on both the values and the column labels, reduce with any, and boolean-index the dataframe's columns on the combined condition:

df.loc[:, (df.notna() | df.columns.notna()).any(axis=0)]

   a  b     c  NaN
0  1  2  None  NaN
1  4  5  None  7.0
yatu
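A short sketch of how this condition is built, again on a rebuilt copy of the question's frame (variable names are illustrative only):

import pandas as pd

df = pd.DataFrame(
    data=[[1, 2, None, None, None], [4, 5, None, None, 7]],
    columns=['a', 'b', 'c', None, None],
)

cell_has_value = df.notna()           # 2x5 frame: True where a cell holds a real value
header_has_name = df.columns.notna()  # length-5 array: True where the header is not None

# The array broadcasts across each row, so a column counts as non-empty if any cell
# OR its header is non-null; .any(axis=0) collapses this to one keep-flag per column
keep = (cell_has_value | header_has_name).any(axis=0)
print(df.loc[:, keep])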
4

IIUC, transpose so the column labels become part of the data, then keep every column that has a non-null header or at least one non-null value:

df.loc[:, df.T.reset_index().notna().any(axis=1).values]
   a  b     c  NaN
0  1  2  None  NaN
1  4  5  None  7.0
BENY
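To unpack the transpose trick, a rough sketch on a rebuilt copy of the frame (the name t is only for illustration):

import pandas as pd

df = pd.DataFrame(
    data=[[1, 2, None, None, None], [4, 5, None, None, 7]],
    columns=['a', 'b', 'c', None, None],
)

t = df.T.reset_index()   # each original column becomes a row; its header moves into the 'index' column

# A row of t is all-NaN only when the header AND every value of that column were None,
# so .any(axis=1) keeps exactly the wanted columns; .values makes the mask positional
keep = t.notna().any(axis=1).values
print(df.loc[:, keep])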