6

I have a df that looks something like:

a  b  c  d  e
0  1  2  3  5
1  4  0  5  2
5  8  9  6  0
4  5  0  0  0

I would like to output the count of values in column c that are not zero.

Sunderam Dubey
user5826447
  • I saw that, thanks. Unfortunately their question is different because they want it for each row, and to be divided by the sum, so none of the code that was used there is applicable to my question. – user5826447 Jan 23 '16 at 20:17
  • The first line of the top answer there reads "To count nonzero values, just do `(column!=0).sum()`, where `column` is the data you want to do it for." That seems to be exactly what you're asking ;-) – Alex Riley Jan 23 '16 at 20:22

2 Answers

13

Use a double sum to count the nonzero values in the whole DataFrame:

print(df)
   a  b  c  d  e
0  0  1  2  3  5
1  1  4  0  5  2
2  5  8  9  6  0
3  4  5  0  0  0

print((df != 0).sum(axis=1))
0    4
1    4
2    4
3    2
dtype: int64

print((df != 0).sum(axis=1).sum())
14

If you need the count for only column c or d:

print((df['c'] != 0).sum())
2

print((df['d'] != 0).sum())
3

EDIT: Solution with numpy.sum:

print((df != 0).values.sum())
14
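
For anyone who wants to run the above end to end, here is a minimal Python 3 sketch. It assumes the frame is rebuilt by hand from the question's data; the variable names (`nonzero`, `per_row`, `per_column`, `total`, `count_c`) are only illustrative.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "a": [0, 1, 5, 4],
    "b": [1, 4, 8, 5],
    "c": [2, 0, 9, 0],
    "d": [3, 5, 6, 0],
    "e": [5, 2, 0, 0],
})

nonzero = df != 0                       # boolean frame, True where the value is not 0
per_row = nonzero.sum(axis=1)           # nonzero count for each row
per_column = nonzero.sum(axis=0)        # nonzero count for each column
total = int(nonzero.to_numpy().sum())   # nonzero count over the whole frame
count_c = int(nonzero["c"].sum())       # nonzero count in column c only

print(per_row.tolist())      # [4, 4, 4, 2]
print(per_column.tolist())   # [3, 4, 2, 3, 2]
print(total)                 # 14
print(count_c)               # 2
```

Summing a boolean frame works because each True counts as 1, so the sum is the number of nonzero entries.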
jezrael
3

NumPy's count_nonzero function is an efficient way to do this.

import numpy as np
np.count_nonzero(df["c"])
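
A self-contained sketch of the same idea, assuming the frame is rebuilt from the question's data. The per-column variant with `axis=0` is an addition for illustration and needs a NumPy version where `count_nonzero` accepts `axis`.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "a": [0, 1, 5, 4],
    "b": [1, 4, 8, 5],
    "c": [2, 0, 9, 0],
    "d": [3, 5, 6, 0],
    "e": [5, 2, 0, 0],
})

# Count the nonzero entries in column c only.
count_c = int(np.count_nonzero(df["c"]))

# Count the nonzero entries per column by passing the underlying array.
per_column = np.count_nonzero(df.to_numpy(), axis=0)

print(count_c)              # 2
print(per_column.tolist())  # [3, 4, 2, 3, 2]
```

Both results should agree with `(df["c"] != 0).sum()` and `(df != 0).sum()` respectively.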

Stig Johan B.