
I am using the astype method of a numpy array to convert an array from character to integer. For the sake of efficiency I am passing copy=False, but I noticed that the original array is not actually modified. How is that possible if b is not a new array?

import numpy

a=numpy.array(['0','1']) 
b=a.astype(numpy.int32,copy=False)
print(a[0], b[0])
titus
Compare `a.nbytes` and `b.nbytes`. One is 2, the other 8. Neither you nor numpy can squeeze two 4-byte integers into two 1-byte character slots. – hpaulj Feb 04 '16 at 18:29

4 Answers


There are two obvious problems here. First, the

numpy.array(a)

in

b=numpy.array(a).astype(numpy.int32,copy=False)

already makes a copy. Second, from the docs:

If [the copy parameter] is set to false, and the dtype, order, and subok requirements are satisfied, the input array is returned instead of a copy.

NumPy is simply ignoring copy=False, since the dtype mismatch means it needs to copy.
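A minimal sketch of the behaviour described above, assuming a recent NumPy on Python 3 (the names c and d are only for illustration). It contrasts a conversion that needs a new dtype with one where the dtype already matches:

```python
import numpy as np

# Case 1: the dtype has to change, so astype allocates a new array
# even though copy=False was requested.
a = np.array(['0', '1'])
b = a.astype(np.int32, copy=False)
print(b is a)                  # False
print(np.shares_memory(a, b))  # False

# Case 2: the dtype already matches, so the input array itself is returned.
c = np.array([0, 1], dtype=np.int32)
d = c.astype(np.int32, copy=False)
print(d is c)                  # True
```

In other words, copy=False only means "do not copy if nothing needs to change"; it is not a request for in-place conversion.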

user2357112

From the docs, a copy is avoided only if the dtype, order, and subok requirements are satisfied:

copy : bool, optional

By default, astype always returns a newly allocated array. If this is set to false, and the dtype, order, and subok requirements are satisfied, the input array is returned instead of a copy.

Because you have a character array that must be converted to int32, I assume the dtype requirement is not satisfied, so a copy is made.

Ed Smith

I found a convoluted way to do an in-place type conversion

https://stackoverflow.com/a/4396247/901925

In that example the conversion was from 'int32' to 'float32'. Adapted to our case:

Initial x with a large enough string size:

In [128]: x=np.array(['0','1'],dtype='S4')
In [129]: x.__array_interface__['data']
Out[129]: (173756800, False)    # data buffer location

Now make a view, and copy values from x to the view (np.int32 is used so that the item size matches the 4-byte strings on any platform):

In [130]: y=x.view(np.int32)
In [131]: y[:]=x

The data buffer location is unchanged (and is the same for y):

In [132]: x.__array_interface__['data']
Out[132]: (173756800, False)

Now y is two ints:

In [133]: y
Out[133]: array([0, 1])

x, which is still S4, looks at these bytes in a different way:

In [134]: x
Out[134]: 
array([b'', b'\x01'], 
      dtype='|S4')

So it is possible to perform data type conversions in-place, if byte sizes match, but it is an advanced operation. Both the person asking that question and the one answering it are numpy experts.

astype with copy=False is mentioned in another answer there, but it fails for the same reason as here: it can't perform the conversion without changing the original array.
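The view-based steps above can be collected into a short script. This is only a sketch, assuming Python 3 and a 4-byte item size on both sides (S4 and np.int32); the names buffer_address and same_buffer are illustrative. Note that it converts through a temporary array before writing back, so it reuses the original buffer but is not copy-free:

```python
import numpy as np

# Two 4-byte byte strings: 8 bytes in total.
x = np.array([b'0', b'1'], dtype='S4')
buffer_address = x.__array_interface__['data'][0]

# Reinterpret the same 8 bytes as two 4-byte integers (no data is moved).
y = x.view(np.int32)

# Parse the strings into a temporary int32 array, then write the result
# back into the shared buffer through the view.
y[:] = x.astype(np.int32)

same_buffer = y.__array_interface__['data'][0] == buffer_address
print(same_buffer)  # True: y still uses x's original buffer
print(y)            # [0 1]
print(x)            # x now shows the raw integer bytes, not '0' and '1'
```

After the assignment x is no longer meaningful as text, which is why this only makes sense when the old interpretation of the buffer is being discarded.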

hpaulj

You should read the docs:

If copy is set to false, and the dtype, order, and subok requirements are satisfied, the input array is returned instead of a copy.

So your dtype requirements are not satisfied, and the array must be copied.

Daniel