I am new to NumPy and I am vectorizing a homemade function that uses NumPy's cumsum() in its calculation. However, I found that the result has some rounding error, and the culprit seems to be cumsum(). This can be shown as follows:
import numpy as np

X = np.repeat(30000.1, 114569)
X1 = X**2
Y = np.cumsum(X1)
# mathematically, each difference of consecutive cumulative sums
# should equal the corresponding element of X1, i.e. 30000.1**2
Z = Y[1:] - Y[:-1]
print('Theoretical value: 900006000.01')
print(f'first member of X1: {X1[0]}')
print(f'first member of Z: {Z[0]}')
print(f'last member of Z: {Z[-1]}')
print('They are mathematically the same, but in reality they are different')
The result is:
Theoretical value: 900006000.01
first member of X1: 900006000.0099999
first member of Z: 900006000.0099999
last member of Z: 900006000.015625
They are mathematically the same, but in reality they are different
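For what it's worth, the first mismatch is just binary representation error: 30000.1 cannot be stored exactly in a float64. A quick check with the standard-library decimal module (a minimal sketch) shows the exact value the double actually holds:

from decimal import Decimal

import numpy as np

x = np.float64(30000.1)
# Decimal(float) prints the exact decimal expansion of the stored binary value
print(Decimal(float(x)))       # slightly off from 30000.1
print(Decimal(float(x * x)))   # exact value of the squared double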
Here are my questions:
1) Are there any ways to improve the precision of cumsum? (I have tried one thing; see the first sketch after this list.)
2a) If there are, can you show me a simple example?
2b) If not, what is the known maximum size of a float64, or the maximum length of the argument vector, before cumsum runs into rounding errors?
3) Are there any packages in Python for handling calculations with high-precision floating-point numbers? (See the decimal sketch after this list.)
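For question 1, the only thing I have tried so far is accumulating in extended precision through cumsum's dtype argument (a rough sketch; np.longdouble is 80-bit extended precision on most x86 Linux/macOS builds but just an alias for float64 on Windows, so the gain is platform-dependent):

import numpy as np

X = np.repeat(30000.1, 114569)
# carry the running sum in extended precision, then drop back to float64
Y = np.cumsum(X**2, dtype=np.longdouble)
Z = (Y[1:] - Y[:-1]).astype(np.float64)
print(Z[0], Z[-1])   # on 80-bit platforms the drift all but disappears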
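For question 3, the standard-library decimal module already does arbitrary-precision decimal arithmetic (mpmath is a third-party alternative for arbitrary-precision floats). A minimal sketch:

from decimal import Decimal, getcontext

getcontext().prec = 50        # work with 50 significant digits
x = Decimal('30000.1')        # built from a string, so the value is exact
print(x * x)                  # 900006000.01, exactly

The catch is that Decimal arithmetic does not vectorize, so it is far slower than NumPy on long arrays.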
Thank you in advance
EDIT: I changed the number to fewer decimal places to emphasize the issue: the rounding error appears with as few as 2 decimal places, which I think makes it a really big problem.
EDIT2: Some people pointed out that the subtraction between big numbers also contributes to the error. So another question: is there any way in Python to handle the numerical errors that stem from subtracting two big numbers?
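Answering part of EDIT2 myself after some digging: the .015625 in the last output line is exactly the spacing of representable float64 values at the magnitude the cumulative sum reaches, which can be checked with np.spacing (a sketch; the total below is only an order-of-magnitude estimate):

import numpy as np

# the cumulative sum ends up around 900006000.01 * 114569, i.e. about 1.03e14
approx_total = 900006000.01 * 114569
# spacing (ULP) of float64 near that magnitude is 2**-6 = 0.015625, so any
# difference of two neighbouring cumulative sums is quantised to that step
print(np.spacing(approx_total))

So one way to sidestep the cancellation is to not form the big intermediate sums at all; here Z is mathematically just X1[1:], which can be used directly.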