2

Consider the following Python code which does some simple arithmetic operations over complex numbers in Python:

import numpy as np

s = 2
l = 5
v = np.array([np.exp(1j*2*np.pi/l)])
A = pow(s*v, l) + s*v

#Print the precision of np.complex128
print np.finfo(np.complex128).precision

#Export using 20 decimal places for real and imaginary parts
np.savetxt('A.csv', A, fmt=['%.20e%+.20ej'], delimiter=',')

I saw this answer in order to print the precision of a np.complex128 variable in Python and what I get is 15. When I check the value of A in Spyder I see (32.61803398874989+1.9021130325903035j) whose imaginary part has more than 15 decimal places. Also, the exported CSV file has the value 3.26180339887498931262e+01+1.90211303259030350965e+00j whose both real and imaginary parts have 20 useful decimal places.

I'm confused here: if the precision is 15 what are those extra decimal places? According to here, np.complex128 consists of two 64-bit floats, one for the real and one for the imaginary part.

But my main issue is that I am doing many of these operations over complex numbers (like additions, multiplications and matrix inversions) in a program and each time I run I get different result, sometimes correct, sometimes wrong. It seems that my program is very sensitive to the accuracy of these operations. How is the precision over complex numbers quantified in Python and what is the maximum I can have?

I am using Python 2.7.14 and Numpy 1.14.3.

mgus
  • 808
  • 4
  • 17
  • 39

1 Answers1

1

A complete answer would take much space. For starters, you would need to read a book on numerical analysis that explains just how the binary floating point types work and are stored. But here are some partial explanations.

The 64-bit float values are stored in binary, not in decimal. So most of what is said about decimal digits must be an approximation, not the full story.

When np.finfo(np.complex128).precision (or anything else) says that there are 15 significant decimal digits, it means that you cannot count on more than those 15 digits. Here is an example, taken from my IPython console.

In [4]: 9.000000000000001
Out[4]: 9.000000000000002

In [5]: 9.000000000000001 == 9.000000000000002
Out[5]: True

Python, which by default uses the 64-bit floating point type for any number with a decimal point, treats those two 16-decimal-digit numbers as identical. When you use the number 9.000000000000001 only the first 15 decimal digits are guaranteed to be stored correctly. Anything after that is not guaranteed, and you see in this case the 1 is basically changed to a 2.

Sometimes you can get more than those 15 decimal digits. For example, since the numbers are stored in binary, a number slightly greater than 1.0 will have more binary digits after the radix point than a number like 9.0 will. This is because 9 uses 4 binary digits while 1 uses only one, so 3 more binary digits can be used after the radix point. So lets look at the imaginary part of your number, 1.9021130325903035:

In [17]: 1.902113032590303 == 1.9021130325903035
Out[17]: False

In [18]: 1.9021130325903036 == 1.9021130325903035
Out[18]: True

In [19]: 1.902113032590304 == 1.9021130325903035
Out[19]: False

We see that, although the number shows 17 decimal digits, when we change the final digit from 5 to 6 Python sees no change. But there is a change if we round that number to 16 decimal digits, either up or down. So we can say that the number is stored to 16 decimal digits and a little more. That is why when I explain precision I say that a floating point number is guaranteed to have 15 significant decimal digits but it may have a 16th, and the actual precision is slightly more than that. More briefly, there are 15 or 16 significant decimal digits.

Python has two basic ways to print a floating point number: the str function and the repr function. The str function prints things simply, so the user can understand the result without going into too much detail. The repr function gives more detail, and tries to print so much detail that the stored data can be completely determined by what is printed. Note that repr is used invisibly in some cases, such as typing a numeric value in the console or, as you saw, using the variable explorer in Spyder.

When Spyder or Python does a repr on your number 1.9021130325903035 it gives enough digits to completely define the number. As we saw above, if it showed only 16 decimal digits, dropping the final 5, the result would be slightly different from what is stored. Therefore Python prints an additional digit so you can know just what the value is. If Python printed a final 6 rather than a 5 the value would have been the same, but if Python left out that digit completely the value would be changed. So Python via repr errs on the side of giving too many digits. Despite the 17 digits printed, only 16 of them are certain.

Finally, your csv file shows 20 decimal digits because you told Python to show 20 decimal digits, due to the %.20e specifiers in your np.savetxt command. Those 20 decimal digits are not all "useful decimal places," despite what you wrote. Only the first 16 or 17 are useful in the imaginary part, depending on how you define "useful." Python basically added binary bits, all zeros, to the stored value, and printed that to 20 decimal digits. But those zero binary bits were not stored in the value.

Rory Daulton
  • 21,934
  • 6
  • 42
  • 50
  • I see your point but I have some doubts. When I say "useful" decimal places I mean that the extra digits that are printed on the CSV are non-zero so based on your answer, since the actual value remains the same one can assume they are random numbers. But why are you claiming they are zeros? – mgus May 26 '18 at 19:24
  • I did not claim those extra decimal digits that were printed are zeros; I claimed that the extra *binary* digits are that were appended to the stored value were zeros. In other words, Python pretended that the stored float value was *exact* and printed the first 20 decimal digits. The exact value of the imaginary part, printed with all decimal digits, is `1.9021130325903035096501980660832487046718597412109375`, which has 53 decimal digits. (I used the `decimal` module to get that.) The CSV got the first 20 decimal digits of that. Those added digits were not "random numbers." – Rory Daulton May 26 '18 at 19:45
  • Alright, I understood. However, my algorithm seems to work sometimes. To be precise, there are 24 permutations of the rows of a 4-by-4 complex Vandermonde matrix which I invert during the algorithm. All permutations should give the same final result in this algorithm. I won't mess with details but the thing is that for some permutations it works and for some it doesn't and I am almost sure that it's a precision issue, perhaps when calculating the inverse in those problematic cases. Do you know if I can increase the precision of complex numbers and/or complex operations in Python? – mgus May 26 '18 at 23:09
  • @mgus: The usual way to do extended floating-point precision in Python is the `decimal` module, but that does real numbers only, not complex numbers. The [mpmath module](http://mpmath.org/) does "real and complex floating-point arithmetic with arbitrary precision." It is not included in standard Python but it is included and installed by default in the Anaconda installation of Python (which is what I use). You may want to examine your matrix-inversion routine--there are many, and some are more careful with precision than others. mpmath does do matrix inversion, according to its feature list. – Rory Daulton May 26 '18 at 23:17
  • I have one more question. Where do you start counting the number of those 15 guaranteed radix points in general? Here are 2 examples `1000.123456789123456`, `1.000123456789123456e3`. These number are exactly the same but even if I change the second to last digit to the first of them my Python prints that they are equal. I am pretty confused here about those 15 radix points. It seems they are not guaranteed here. – mgus Jun 05 '18 at 06:47
  • @mgus: To count (decimal) digits in this context, count all the digits appearing in the number and ignore the decimal point and anything at or after the `e` if one is present. So both of the numbers in your last comment have 19 decimal digits. The first 16 are significant, since your first digit is one, and the 17th digit has influence on the stored number without being fully significant, but the 18th and 19th digits are definitely not significant. That's why you could change the "second to last digit" without changing the value--that is the 18th digit. – Rory Daulton Jun 05 '18 at 11:46
  • Alright, can you suggest me a book that discusses binary representations of floating numbers? I want to familiarize myself with these ideas. – mgus Jun 05 '18 at 16:50