NumPy array dtype defaults to int32 on a 64-bit Windows 10 machine

Question

I have installed Anaconda 3 64 bit on my laptop and written the following code in Spyder:

import numpy.distutils.system_info as sysinfo
import numpy as np
import platform

sysinfo.platform_bits 
platform.architecture()

my_array = np.array([0,1,2,3])
my_array.dtype

The output of these commands is:

sysinfo.platform_bits 
Out[31]: 64

platform.architecture()
Out[32]: ('64bit', 'WindowsPE')

my_array = np.array([0,1,2,3])
my_array.dtype
Out[33]: dtype('int32')

My question: even though my system is 64-bit, why is the default array dtype int32 instead of int64?

Any help is appreciated.

Might want to check[\[SO\]: \_csv.Error: field larger than field limit (131072) (@CristiFati's answer)](https://stackoverflow.com/a/54517228/4788546) out (although very different, same cause). — CristiFati, Oct 08 '22 at 08:21

score 25 · Answer 1 · answered Mar 29 '16 at 08:50

The default integer type, np.int_, is C long:

http://docs.scipy.org/doc/numpy-1.10.1/user/basics.types.html

But C long is 32 bits wide on 64-bit Windows:

https://msdn.microsoft.com/en-us/library/9c3yd98k.aspx

This is a quirk of the win64 platform.
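
To check this on a particular machine, you can compare the size of C long with the dtype NumPy picks by default. This is only a minimal sketch using `ctypes` and NumPy; the variable names are mine. It reflects the NumPy 1.x behaviour described here. As far as I know, NumPy 2.0 and later use a 64-bit default integer on 64-bit Windows, so there the default would match the pointer size rather than C long.

```python
import ctypes
import numpy as np

# Width of the platform's C long, in bits.
c_long_bits = ctypes.sizeof(ctypes.c_long) * 8

# Width of the dtype NumPy infers for a list of small Python ints.
default_int_bits = np.array([0, 1, 2, 3]).dtype.itemsize * 8

# Width of a pointer, which is what "64-bit machine" usually refers to.
pointer_bits = ctypes.sizeof(ctypes.c_void_p) * 8

print(f"C long:                {c_long_bits} bits")
print(f"NumPy default integer: {default_int_bits} bits")
print(f"pointer:               {pointer_bits} bits")
```

On the setup in the question, the first two lines would be expected to report 32 bits and the last one 64 bits.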

answered Mar 29 '16 at 08:50

Stop harming Monica


Warren Weckesser · Answer 2 · 2016-03-31T03:54:50.830

In Microsoft C, even on a 64 bit system, the size of the long int data type is 32 bits. (See, for example, https://msdn.microsoft.com/en-us/library/9c3yd98k.aspx.) Numpy inherits the default size of an integer from the C compiler's long int.

edited Mar 31 '16 at 03:54

answered Mar 29 '16 at 08:37

Warren Weckesser


The size of `int` is 32bit as well on Linux – Severin Pappadeux Mar 31 '16 at 03:36

@SeverinPappadeux Thanks--I should have been more explicit about the C data type used as the default numpy integer. I've updated my answer. On a 64 bit Linux system, a `long int` is 64 bits. – Warren Weckesser Mar 31 '16 at 03:51
is there some reason not to use np.int64 then? – endolith Apr 06 '16 at 00:04

score 7 · Accepted Answer · answered Jan 09 '18 at 15:28

The original poster, Prana, asked a very good question: "Why is the integer default set to 32-bit on a 64-bit machine?"

As near as I can tell, the short answer is "because it was designed wrong". It seems obvious that a 64-bit machine should define the default integer in any associated interpreter as 64-bit, but the two other answers explain why this is not the case. Things are now different, so I offer this update.

What I notice is that on both CentOS 7.4 Linux and macOS 10.10.5 (one new, one old), running Python 2.7.14 with NumPy 1.14.0 (as of January 2018), the default integer is now 64-bit. (The "my_array.dtype" in the initial example would now report "dtype('int64')" on both platforms.)

Using 32-bit integers as the default in any interpreter can give very squirrelly results if you are doing integer math, as this question pointed out:

Using numpy to square value gives negative number

It appears that Python and NumPy have since been updated and revised (corrected, one might argue), so that to replicate the problem described in the above question, you have to explicitly define the NumPy array as int32.

In Python, on both platforms, the default integer now looks to be int64. This code runs the same on both platforms (CentOS 7.4 and macOS 10.10.5):

>>> import numpy as np
>>> tlist = [1, 2, 47852]
>>> t_array = np.asarray(tlist)
>>> t_array.dtype

dtype('int64')

>>> print t_array ** 2

[ 1 4 2289813904]

But if we make t_array a 32-bit integer array, we get the following, because the integer calculation rolls over into the sign bit of the 32-bit word.

>>> t_array32 = np.asarray(tlist, dtype=np.int32)
>>> t_array32.dtype

dtype('int32')

>>> print t_array32 ** 2

[ 1 4 -2005153392]
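
If you want to know ahead of time which elements would roll over, `np.iinfo` gives the limits of an integer dtype. The following is just a sketch in Python 3 syntax; `square_would_overflow` is a hypothetical helper I made up for illustration, not part of NumPy.

```python
import numpy as np

def square_would_overflow(arr):
    """Return a boolean mask of elements whose square exceeds arr's integer range."""
    limit = int(np.iinfo(arr.dtype).max)
    # Do the comparison with Python ints, which have arbitrary precision.
    return np.array([int(v) ** 2 > limit for v in arr])

t_array32 = np.asarray([1, 2, 47852], dtype=np.int32)
mask = square_would_overflow(t_array32)

print(np.iinfo(np.int32).max)  # 2147483647, i.e. 2**31 - 1
print(mask.tolist())           # [False, False, True]
```

Only 47852 is flagged, since 47852 ** 2 = 2289813904 is larger than 2147483647.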

The reason for using int32 is, of course, efficiency. In some situations (such as using TensorFlow or other neural-network machine learning tools) you want 32-bit representations (mostly floats, of course), because the speed gains over 64-bit floats can be quite significant.
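
Part of that difference is simply memory: a 32-bit element takes half the space of a 64-bit one. A small sketch showing only the storage side (it does not measure speed, and the array size is arbitrary):

```python
import numpy as np

n = 1_000_000
as_int32 = np.zeros(n, dtype=np.int32)
as_int64 = np.zeros(n, dtype=np.int64)

print(as_int32.nbytes)  # 4000000 bytes: 4 per element
print(as_int64.nbytes)  # 8000000 bytes: 8 per element
```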

Numpy always used 64-bit integers by default on 64-bit Linux and OSX, the problem is only Windows `:)` — user7138814, Jan 09 '18 at 19:23
Thanks, @gemesyscanada. My versions of Python and Numpy are 3.7.3 and 1.16.5 respectively. I am still experiencing the same problem. Any suggestions or comments to fix the problem? — Erkan Hatipoglu, Sep 20 '19 at 11:00

score 1 · Answer 4 · edited Jan 24 '23 at 16:16

You can explicitly cast the array to the needed data type, like so:

int64_array = int32_array.astype(np.int64)
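
One thing to watch out for: the cast has to happen before the arithmetic that might overflow. Casting an already wrapped result does not bring the lost bits back. A short sketch reusing the numbers from the accepted answer (the variable names `too_late` and `widened_first` are mine):

```python
import numpy as np

int32_array = np.array([1, 2, 47852], dtype=np.int32)

# Casting after the arithmetic: the square already wrapped around in int32.
too_late = (int32_array ** 2).astype(np.int64)

# Casting first: the square is computed in 64-bit arithmetic.
widened_first = int32_array.astype(np.int64) ** 2

print(too_late.tolist())       # [1, 4, -2005153392]
print(widened_first.tolist())  # [1, 4, 2289813904]
```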

edited Jan 24 '23 at 16:16

MisterMiyagi


answered Dec 08 '20 at 16:11

Misa


score 1 · Answer 5 · answered Oct 07 '21 at 00:42

You can create the array with the data type set to int64. E.g.,

import numpy as np
# Windows uses int32 by default, but if we want int64 we can ask for it explicitly
x = np.array([1, 2, 3, 4, 5], dtype=np.int64)
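
The same `dtype` argument is accepted by most other array constructors, and by reductions such as `sum`, where it sets the type used for accumulating. A sketch with made-up values, assuming you want 64-bit results regardless of platform:

```python
import numpy as np

# Constructors take dtype directly.
a = np.arange(5, dtype=np.int64)
z = np.zeros(3, dtype=np.int64)

# For reductions, dtype controls the accumulator, so an int32 input
# can be summed without wrapping around.
big32 = np.array([2_000_000_000, 2_000_000_000], dtype=np.int32)
total = big32.sum(dtype=np.int64)

print(a.dtype, z.dtype)
print(int(total))  # 4000000000
```

Passing `dtype` at creation time avoids a separate copy that a later `astype` would make.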

answered Oct 07 '21 at 00:42

Ginzorf

