
I have a DataFrame

Date      Open   Close
20190101  23.00  0
20190102  0      0
20190103  19     18
20190104  21     19

I first turn all the NaN values into zeros, and then I plan to replace all the zeros with values interpolated from the non-zero entries using `interpolate(limit_direction='both')`. Before doing this, I would like to count how many zeros there are in the whole DataFrame, to check how much of the data is corrupted.

I cannot seem to find a way to do this. I believe it's something along the lines of converting the values to booleans and counting those, but I have been unsuccessful so far.

EDIT: `(df == 0).sum().sum()` worked perfectly, thanks.
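
For reference, a minimal sketch of the workflow described above. It is only illustrative: the restriction to the `Open`/`Close` columns follows the suggestion in the comments, and the zeros are converted back to NaN so that `interpolate()` has something to fill:

import numpy as np
import pandas as pd

# Sample data from the question (NaN already shown as zeros)
df = pd.DataFrame({'Date': [20190101, 20190102, 20190103, 20190104],
                   'Open': [23.00, 0, 19, 21],
                   'Close': [0, 0, 18, 19]})

df = df.fillna(0)                        # step 1: turn any remaining NaN into zeros
num_zeros = (df == 0).sum().sum()        # step 2: count zeros in the whole DataFrame
print(num_zeros)                         # 3 for this sample

# step 3: turn the zeros back into NaN, then interpolate from the non-zero values
cols = ['Open', 'Close']
df[cols] = df[cols].replace(0, np.nan)
df[cols] = df[cols].interpolate(limit_direction='both')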

  • `(df == 0).sum()` will summarize the number of zeroes in each column. `(df == 0).sum().sum()` will show the total number of zeroes in the whole dataframe – Stuart Jan 23 '20 at 21:02
  • Do you mean `df[['Open', 'Close']].eq(0).sum().sum()` ? – Jon Clements Jan 23 '20 at 21:02
  • 2
    `print(np.count_nonzero(df==0))` – Andrej Kesely Jan 23 '20 at 21:05
  • 2
    Does this answer your question? [Python Pandas Counting the Occurrences of a Specific value](https://stackoverflow.com/questions/35277075/python-pandas-counting-the-occurrences-of-a-specific-value) – AMC Jan 23 '20 at 21:23

1 Answer


Assuming a pandas DataFrame.

To count the zeros in a single column, filtering that column may be useful:

df.Open[df.Open == 0].count()

If you want to count every 0 in the DataFrame, you can loop over the columns, as shown below.

import pandas as pd

# Rebuild the sample data from the question
dct = {'Date': [20190101, 20190102, 20190103, 20190104],
       'Open': [23, 0, 19, 21],
       'Close': [0, 0, 18, 19]}
df = pd.DataFrame(dct)

# Count the zeros in the Open column
df.Open[df.Open == 0].count()

Looping over the columns:

num = 0
for col in df.columns:
    # Add the number of zeros in this column to the running total
    num += df[col][df[col] == 0].count()
num
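
As the comments point out, the same total can also be computed without an explicit loop; these are just vectorized equivalents of the loop above:

import numpy as np

(df == 0).sum().sum()        # boolean DataFrame -> per-column totals -> grand total
np.count_nonzero(df == 0)    # equivalent, via NumPy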