5

I have a simple question which relates to similar questions here, and here.

I am trying to drop all columns of a pandas DataFrame that contain only zeros (that is, dropping along axis=1). Here is an example:

df = pd.DataFrame({'a':[0,0,0,0], 'b':[0,-1,0,1]})

    a   b
0   0   0
1   0  -1
2   0   0
3   0   1

I'd like to drop column a, since it contains only zeros.

However, I'd like to do it in a nice and vectorized fashion if possible. My data set is huge - so I don't want to loop. Hence I tried

df = df.loc[df.any(axis=1), (df != 0).any(axis=0)]

    b
1  -1
3   1

This lets me drop both columns and rows. But if I try to drop only the columns, loc seems to fail. Any ideas?

cs95
Rachel

3 Answers

15

You are really close; use any. Zeros are cast to False:

df = df.loc[:, df.any()]
print(df)

   b
0  0
1 -1
2  0
3  1
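
For reference, here is a minimal self-contained sketch of the same idea on the question's frame. The intermediate name `keep` is only illustrative; the `axis` argument is spelled out as a keyword because recent pandas versions may reject it positionally.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 0, 0, 0], 'b': [0, -1, 0, 1]})

# Boolean Series indexed by column name:
# True where the column holds at least one non-zero (truthy) value.
keep = df.any(axis=0)

# Keep every row, and only the columns flagged True.
result = df.loc[:, keep]
print(result)
```

Because the mask is built in one vectorized pass over the columns, no explicit Python loop is involved.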
jezrael
  • Great answer! But @coldspeed was the first to answer it. Though I timed it, and your solution seems to be the fastest! Thank you! – Rachel Aug 17 '17 at 10:45
  • You are right. This time, it is really difficult. Coldspeed helped me set the question right and answered it. So I think Coldspeed deserves the checked answer this time. What is the policy here? Don't find anything in Beta... – Rachel Aug 17 '17 at 10:49
  • @Rachel - if you need a faster solution, use MaxU's answer or mine, because the double transpose is slow. As for who gets the accepted answer, that is up to you. – jezrael Aug 17 '17 at 10:57
6

If it's a matter of 0s and not sum, use df.any:

In [291]: df.T[df.any()].T
Out[291]: 
   b
0  0
1 -1
2  0
3  1

Alternatively:

In [296]: df.T[(df != 0).any()].T # or df.loc[:, (df != 0).any()]
Out[296]: 
   b
0  0
1 -1
2  0
3  1
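
One thing to watch with the double transpose, besides speed: on a frame that mixes dtypes, transposing can upcast the values, so the result may not keep the original column dtypes. The sketch below uses a hypothetical frame (not the one from the question) with integer and float columns to compare the two approaches.

```python
import pandas as pd

# Hypothetical frame mixing integer and float columns (an assumption for illustration).
df = pd.DataFrame({'a': [0, 0], 'b': [1, 2], 'c': [0.5, 0.0]})

# True for columns with at least one non-zero value.
mask = df.any(axis=0)

# Column selection with .loc leaves each remaining column's dtype untouched.
via_loc = df.loc[:, mask]

# Transposing puts ints and floats in the same columns, so values are upcast to float.
via_transpose = df.T[mask].T

print(via_loc.dtypes)
print(via_transpose.dtypes)
```

Both results contain the same columns and values; only the dtype of the integer column differs.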
cs95
5

In [73]: df.loc[:, df.ne(0).any()]
Out[73]:
   b
0  0
1 -1
2  0
3  1

or:

In [71]: df.loc[:, ~df.eq(0).all()]
Out[71]:
   b
0  0
1 -1
2  0
3  1

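If we instead want to keep only the columns that do NOT sum to 0 (a different test: column b of the question's frame sums to 0, so it is dropped as well):

In [78]: df.loc[:, df.sum().astype(bool)]
Out[78]:
Empty DataFrame
Columns: []
Index: [0, 1, 2, 3]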
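
To make the difference between the tests concrete, the three masks can be compared side by side on the question's frame. This is only a sketch; the names `mask_any`, `mask_all` and `mask_sum` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 0, 0, 0], 'b': [0, -1, 0, 1]})

# "At least one value differs from 0".
mask_any = df.ne(0).any()

# "Not every value equals 0": the same condition, by De Morgan's law.
mask_all = ~df.eq(0).all()

# "The column total is non-zero": a different condition, since
# positive and negative values can cancel out (here -1 + 1 == 0).
mask_sum = df.sum().astype(bool)

print(mask_any)
print(mask_all)
print(mask_sum)
```

The first two masks always agree, while the sum-based mask only matches them when values in a column cannot cancel, for example when all entries are non-negative.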
MaxU - stand with Ukraine