I'm writing a program that uses pandas and openpyxl to manipulate Excel files. The series of data is:

l=[466629703, NA, 527821349, NA,734823364, NA,1667241489, NA,502673377, NA,491316417, NA,505520276, NA,2840580259, NA,1399526794, NA,468709318, NA,425220764, NA,409771252, NA,643692418, NA,1193809483, NA,353829950, NA,424820400, NA,406999623, NA,389293014, NA,1168972722, NA,420654309, NA,390431735, NA,356588382, NA]

excel data

deposit_sum = sep_df[sep_kward][deposit].dropna().astype(int).sum()

The result should be 16188926398.

But the code above returns 11200862491. The error occurs with only one of the files. What could the problem be?

Charlie Clark

1 Answer

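Don't cast the values to int after dropping the NaNs; cast them to int64 instead. Where int resolves to a 32-bit integer (for example, older NumPy versions on Windows), the value 2840580259.0 is out of range, because the largest 32-bit signed integer is 2147483647: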

deposit_sum = df[0].dropna().astype('int64').sum()
# With the question's own data:
# deposit_sum = sep_df[sep_kward][deposit].dropna().astype('int64').sum()

output of deposit_sum:

16188926398
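
As a sanity check on the overflow explanation, the reported wrong total can be reproduced with plain Python arithmetic. This is only a sketch: it assumes the out-of-range cast turned 2840580259 into the smallest 32-bit integer, which is a common outcome but depends on the platform. The names INT32_MAX, INT32_MIN, too_big, and wrapped are illustrative and do not come from pandas.

```python
# Limits of a signed 32-bit integer.
INT32_MAX = 2**31 - 1   # 2147483647
INT32_MIN = -2**31      # -2147483648

# The non-NaN values from the question.
values = [466629703, 527821349, 734823364, 1667241489, 502673377,
          491316417, 505520276, 2840580259, 1399526794, 468709318,
          425220764, 409771252, 643692418, 1193809483, 353829950,
          424820400, 406999623, 389293014, 1168972722, 420654309,
          390431735, 356588382]

# Correct total, using Python's arbitrary-precision integers.
expected = sum(values)

# Values that do not fit into 32 bits.
too_big = [v for v in values if v > INT32_MAX]

# Assumption: each out-of-range value became INT32_MIN after the cast.
wrapped = [INT32_MIN if v > INT32_MAX else v for v in values]
wrong_total = sum(wrapped)

print(expected)     # 16188926398
print(too_big)      # [2840580259]
print(wrong_total)  # 11200862491
```

Under that assumption the single oversized value accounts for the whole discrepancy, which is also consistent with only one file being affected.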

Sample dataframe used:

import pandas as pd

NA = float('NaN')  # placeholder for the empty cells
l = [466629703, NA, 527821349, NA, 734823364, NA, 1667241489, NA, 502673377, NA, 491316417, NA, 505520276, NA, 2840580259, NA, 1399526794, NA, 468709318, NA, 425220764, NA, 409771252, NA, 643692418, NA, 1193809483, NA, 353829950, NA, 424820400, NA, 406999623, NA, 389293014, NA, 1168972722, NA, 420654309, NA, 390431735, NA, 356588382, NA]
df = pd.DataFrame(l)
Anurag Dabas