Why does numpy array's astype method not modify the input when casting type?

Question

I am using the astype method of numpy's array object to convert an array from character to integer. For the sake of efficiency I am using copy=False, but I noticed that the original array is not actually modified. How is that possible without declaring b as a new array?

import numpy

a=numpy.array(['0','1']) 
b=a.astype(numpy.int32,copy=False)
print(a[0], b[0])

Compare `a.nbytes` and `b.nbytes`. One is 2, the other 8. You, and numpy, can't squeeze 2 4byte integers into 2 1byte character slots. — hpaulj, Feb 04 '16 at 18:29
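The size comparison in this comment can be checked directly. This is a minimal sketch that assumes a byte-string array (dtype 'S1', one byte per item), which is what the question's code produces under Python 2. Under Python 3 the same literal gives a unicode array with 4 bytes per character, so the 2-versus-8 figures would not be reproduced with plain str input.

```python
import numpy as np

# Byte strings give an 'S1' array: one byte per item.
a = np.array([b'0', b'1'])
b = a.astype(np.int32, copy=False)

# Each int32 needs 4 bytes, so b cannot live in a's 2-byte buffer.
print(a.dtype, a.itemsize, a.nbytes)
print(b.dtype, b.itemsize, b.nbytes)
```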

score 2 · Accepted Answer · answered Feb 04 '16 at 17:38

There are two obvious problems here. First, the

numpy.array(a)

in

b=numpy.array(a).astype(numpy.int32,copy=False)

already makes a copy. Second, from the docs:

If [the copy parameter] is set to false, and the dtype, order, and subok requirements are satisfied, the input array is returned instead of a copy.

NumPy is simply ignoring copy=False, since the dtype mismatch means it needs to copy.
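A quick way to see that a copy was made despite copy=False is to check object identity and memory sharing. This is only an illustrative sketch using the question's values; np.shares_memory is used here as a convenience check and is not part of the original answer.

```python
import numpy as np

a = np.array(['0', '1'])
b = a.astype(np.int32, copy=False)

# The dtype differs, so astype has to allocate a new array.
print(b is a)                   # False
print(np.shares_memory(a, b))   # False

# The input is left untouched.
print(a.dtype.kind, b.dtype)
```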

user2357112

right, consider the numpy.array(a) as a huge typo :) – titus Feb 04 '16 at 17:58

score 1 · Answer 2 · edited Jun 20 '20 at 09:12

From the docs, a copy is only avoided under the following conditions:

copy : bool, optional

By default, astype always returns a newly allocated array. If this is set to false, and the dtype, order, and subok requirements are satisfied, the input array is returned instead of a copy.

Because you have a character array that needs to be converted to int32, I assume this violates the dtype requirements.
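For contrast, here is a small sketch of the case where the requirements are satisfied, so copy=False really does hand back the input array. The variable names same and fresh are just illustrative.

```python
import numpy as np

a = np.array([0, 1], dtype=np.int32)

# Requested dtype already matches: the input array itself is returned.
same = a.astype(np.int32, copy=False)

# Default copy=True: a newly allocated array, even for the same dtype.
fresh = a.astype(np.int32)

print(same is a)                    # True
print(fresh is a)                   # False
print(np.shares_memory(a, fresh))   # False
```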

answered Feb 04 '16 at 17:38

Ed Smith

score 1 · Answer 3 · edited May 23 '17 at 11:45

I found a convoluted way to do an in-place type conversion:

https://stackoverflow.com/a/4396247/901925

In that example the conversion was from 'int32' to 'float32'. Adapted to our case:

Initial x with a large enough string size:

In [128]: x=np.array(['0','1'],dtype='S4')
In [129]: x.__array_interface__['data']
Out[129]: (173756800, False)    # data buffer location

Now make a view, and copy values from x to the view:

In [130]: y=x.view(np.int32)    # 4-byte int to match 'S4'; plain int is 8 bytes on most 64-bit platforms
In [131]: y[:]=x

Same data buffer location (same for y)

In [132]: x.__array_interface__['data']
Out[132]: (173756800, False)

Now y is two ints:

In [133]: y
Out[133]: array([0, 1])

x, which is still S4, looks at these bytes in a different way:

In [134]: x
Out[134]: 
array([b'', b'\x01'], 
      dtype='|S4')

So it is possible to perform data type conversions in-place, if byte sizes match, but it is an advanced operation. Both the person asking that question and the one answering it are numpy experts.

astype(..., copy=False) is also mentioned in another answer there, but it fails for the same reason as here: it can't perform the conversion without changing the original array.
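The view trick above can be condensed into a short script. This is a sketch under two assumptions: the item sizes match (4-byte 'S4' strings and np.int32), and the parsed values are first materialized with an explicit astype call. That temporary means the conversion still allocates scratch memory; only the final result lands in the original buffer.

```python
import numpy as np

# 'S4' items are 4 bytes wide, the same width as int32.
x = np.array([b'0', b'1'], dtype='S4')
y = x.view(np.int32)          # same buffer, reinterpreted as int32

# Parse the strings into a temporary int32 array, then write the
# results back into the shared buffer through the view.
y[:] = x.astype(np.int32)

print(np.shares_memory(x, y))   # True
print(y)
print(x)   # the S4 view now shows the raw integer bytes
```

After this, only y is meaningful. x still points at the same bytes but interprets them as strings, so it should not be used as text any more.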

score 0 · Answer 4 · answered Feb 04 '16 at 17:38

You should read the docs:

If copy is set to false, and the dtype, order, and subok requirements are satisfied, the input array is returned instead of a copy.

So your dtype requirements are not satisfied, and the array must be copied.

Daniel
