Why does numpy.sum on a DataFrame column return inf?

Question

I have a pandas DataFrame with columns of type float64.

I am trying to sum some of the columns with numpy.sum.

When I call np.sum(x[col_name]), the result is inf.

But when I look for the inf value with np.where(np.isinf(x[col_name])), I get an empty result.

So, what am I doing wrong?

Thanks.

No, unfortunately, the data is restricted. I looked at the data, and it does not contain any inf values — MAK, Apr 20 '20 at 20:30
Then I don't know what more to say. Have a look at np.nansum: https://docs.scipy.org/doc/numpy/reference/generated/numpy.nansum.html It treats NaNs as 0 — Ralvi Isufaj, Apr 20 '20 at 20:37
OK, after digging into the data I found a number like 1.79600000007e+308. NumPy does not recognize this number as either NaN or inf. — MAK, Apr 20 '20 at 20:48
What dtype is your data? The largest value np.float64 can hold is 1.7976931348623157e+308 according to np.finfo() — Ralvi Isufaj, Apr 20 '20 at 20:57

score 2 · Answer 1 · answered Apr 20 '20 at 22:04

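The problem appears to be that the sum exceeds the largest value np.float64 can represent. The number you found, 1.79600000007e+308, is itself just below that limit, so it is finite and np.isinf does not flag it, but adding other large positive values to it overflows to inf. If you run np.finfo(np.float64), you'll see the largest number this dtype can hold: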

Machine parameters for float64
---------------------------------------------------------------
precision =  15   resolution = 1.0000000000000001e-15
machep =    -52   eps =        2.2204460492503131e-16
negep =     -53   epsneg =     1.1102230246251565e-16
minexp =  -1022   tiny =       2.2250738585072014e-308
maxexp =   1024   max =        1.7976931348623157e+308
nexp =       11   min =        -max
--------------------------------------------------------------
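
A minimal sketch of this behaviour, assuming a hypothetical column named "col" that holds the value from the comments plus two made-up finite values (the real data is not available):

```python
import numpy as np
import pandas as pd

# Hypothetical data: every element is finite and below the float64 maximum.
x = pd.DataFrame({"col": [1.79600000007e308, 1.0e307, 2.5]})

float64_max = np.finfo(np.float64).max

# No single element is inf, so np.where(np.isinf(...)) would come back empty.
has_inf = bool(np.isinf(x["col"]).any())
all_below_max = bool((x["col"] <= float64_max).all())

# The running total passes float64_max, so the sum overflows to inf.
with np.errstate(over="ignore"):
    total = np.sum(x["col"])

print(has_inf, all_below_max, total)
```

The overflow happens during the addition, not in the stored data, which is why inspecting the column for inf finds nothing.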

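According to this answer: https://stackoverflow.com/a/37272717/4014051 Python integers use an arbitrary-precision implementation, so one solution is to make the dtype of your array object. This only helps if the elements themselves are arbitrary-precision objects, such as Python int or decimal.Decimal; a plain Python float is still a 64-bit double and overflows in the same way. Your code will also be slower overall, because the elements are generic Python objects rather than native NumPy floats, but it should then output the correct sum.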
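
The object-dtype idea can be tried with a small sketch. The values below are hypothetical, and decimal.Decimal is an assumption of this example rather than something the linked answer prescribes; it is used because an object array of plain Python floats would still overflow.

```python
from decimal import Decimal

import numpy as np

# Hypothetical values: each is finite, but their sum exceeds the float64 maximum.
values = [1.79600000007e308, 1.0e307, 2.5]

# Native float64 array: the sum overflows to inf.
with np.errstate(over="ignore"):
    float_total = np.sum(np.array(values, dtype=np.float64))

# Object array of plain Python floats: each element is still a 64-bit double,
# so the sum overflows as well.
object_float_total = np.sum(np.array(values, dtype=object))

# Object array of Decimal values: additions are done by Decimal, whose
# exponent range is far wider, so the total stays finite.
decimal_values = np.array([Decimal(v) for v in values], dtype=object)
decimal_total = np.sum(decimal_values)

print(float_total, object_float_total, decimal_total)
```

Note that Decimal arithmetic rounds to the context precision (28 significant digits by default), and the result is a Decimal object rather than a float64, so it cannot be cast back to float64 without overflowing again.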
