
I'm trying to use numpy to element-wise square an array. I've noticed that some of the values appear as negative numbers. The squared value isn't near the max int limit. Does anyone know why this is happening and how I could fix it? I'd rather avoid using a for loop to square an array element-wise, since my data set is quite large.

Here's an example of what is happening:

import numpy as np

test = [1, 2, 47852]
sq = np.array(test)**2
print(sq)
print(47852*47852)

Output:

[1,4, -2005153392]
2289813904
suzep136
  • That's very strange; I get the correct answer when I enter the code. Have you tried using numpy's `square` function? `sq = np.square(np.array(test))` Does the issue still happen? – nrlakin Jan 20 '17 at 23:39
  • This is very rare, I have had no problems: [ 1 4 2289813904] 2289813904 – eyllanesc Jan 20 '17 at 23:41
  • Even weirder; both methods work on my laptop, but both methods fail on my Raspberry Pi (I get the same result as you on the Pi). – nrlakin Jan 20 '17 at 23:41
  • What architecture is your PC: 32-bit or 64-bit? – eyllanesc Jan 20 '17 at 23:43
  • @Mitch's answer below is correct. Oddly, both my laptop (which works) and the Pi (which doesn't) use 64bit processors; laptop is x86, and the pi uses an ARM Cortex-A53. Wouldn't have occurred to me to worry about overflow errors on a 64bit core--glad I saw this post. – nrlakin Jan 20 '17 at 23:48

1 Answer


This is because NumPy doesn't check for integer overflow, likely because that would slow down every integer operation, and NumPy is designed with efficiency in mind. So when you have an array of 32-bit integers and a result does not fit in 32 bits, only the low 32 bits are kept and they are still interpreted as a signed 32-bit integer, which gives you the strange negative result. Here 47852**2 = 2289813904, which is larger than the largest 32-bit signed integer, 2147483647.
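To make the wrap-around concrete, here is a minimal sketch that forces `int32` explicitly so it behaves the same on every platform. The variable names (`a32`, `sq32`, `exact`, `wrapped`) are only illustrative; the point is that the negative value is the exact square minus 2**32.

```python
import numpy as np

test = [1, 2, 47852]

# Force 32-bit integers so the behaviour does not depend on the platform default.
a32 = np.array(test, dtype=np.int32)
sq32 = a32 ** 2            # result stays int32; array overflow is silent

exact = 47852 ** 2         # Python int, arbitrary precision
wrapped = exact - 2**32    # what remains once the value wraps past 32 bits

print(sq32)                        # last element equals `wrapped`
print(exact, wrapped)
print(np.iinfo(np.int32).max)      # largest value an int32 can hold
```

The last element of `sq32` matches the negative number from the question, which suggests the original array was `int32` on that machine.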

To avoid this, choose a dtype that is wide enough to hold the result of the operation; in this case 'int64' suffices.

>>> np.array(test, dtype='int64')**2
array([         1,          4, 2289813904])
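If the data already lives in a 32-bit array, it can be widened before squaring rather than rebuilt from a list. The sketch below shows two ways that should be equivalent here; `a32` is a stand-in for your existing array, not something from the question.

```python
import numpy as np

# Stand-in for existing 32-bit data.
a32 = np.array([1, 2, 47852], dtype=np.int32)

# Option 1: convert to int64 first, then square.
sq_a = a32.astype(np.int64) ** 2

# Option 2: ask the ufunc to carry out the computation in int64.
sq_b = np.square(a32, dtype=np.int64)

print(sq_a)
print(sq_b)
```

Both stay vectorized, so no Python-level loop is needed for a large data set. Note that `astype` makes a widened copy of the input, which costs extra memory for very large arrays.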

You aren't seeing the same issue with Python ints because Python checks for overflow and switches to a larger representation when necessary. If I recall correctly, there was a question about this on the mailing list, and the response was that doing the same in NumPy would have a large performance impact on basic array operations.
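As an aside that is not part of the original answer: an array with `dtype=object` stores ordinary Python ints, so it inherits their arbitrary precision. This is a sketch only, and it gives up NumPy's speed, so it is rarely the right choice for large data.

```python
import numpy as np

big = 2**40

# Python ints grow as needed, so this is exact.
print(big ** 2)

# An object array holds Python ints and uses Python arithmetic per element.
obj_sq = np.array([1, 2, big], dtype=object) ** 2
print(obj_sq)
print(type(obj_sq[2]))   # a plain Python int, not a fixed-width NumPy integer
```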

As for why your default integer type may be 32-bit on a 64-bit system: as Goyo answered on a related question, the default integer type np.int_ is the same as C long, which is platform dependent and can be 32 bits.
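A quick way to see what your own platform does is to inspect the dtype NumPy picks by default. The sketch below also includes `squares_fit`, a hypothetical helper (not a NumPy function) that checks whether every element of an integer array can be squared without leaving its dtype.

```python
import math
import numpy as np

# The dtype NumPy chooses for a plain list of small Python ints.
default_int = np.array([1, 2, 3]).dtype
print(default_int, default_int.itemsize * 8, "bits")

def squares_fit(arr):
    """Hypothetical helper: True if squaring each element stays within arr.dtype."""
    if arr.size == 0:
        return True
    # Largest magnitude whose square still fits in this integer dtype.
    limit = math.isqrt(np.iinfo(arr.dtype).max)
    # Convert to Python ints before abs() to avoid overflow on the dtype minimum.
    largest = max(abs(int(arr.min())), abs(int(arr.max())))
    return largest <= limit

print(squares_fit(np.array([1, 2, 47852], dtype=np.int32)))
print(squares_fit(np.array([1, 2, 47852], dtype=np.int64)))
```

For `int32` the limit works out to 46340, so 47852 is just over it, while `int64` has plenty of room.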

miradulo
  • How does your answer extend to floats? If I do `-6.19318182**2` I get `-38.35550`. – n1k31t4 Mar 26 '18 at 17:28
  • @DexterMorgan What do you expect? If you are doing so literally as you wrote, this makes sense - BEDMAS :) – miradulo Mar 26 '18 at 18:28
  • Haha - I was obviously in a maths mood and not a programming one... that was clearly just a negative 6 to me ;) Thanks for pointing it out! – n1k31t4 Mar 26 '18 at 19:39