Repeating numpy values and specifying dtype

Question

I want to generate a numpy array of the form:

0.5*[[0, 0], [1, 1], [2, 2], ...]

I want the final array to have a dtype of numpy.float32.

Here is my attempt:

>>> import numpy as np
>>> N = 5
>>> x = np.array(np.repeat(0.5*np.arange(N), 2), np.float32)
>>> x
array([ 0. ,  0. ,  0.5,  0.5,  1. ,  1. ,  1.5,  1.5,  2. ,  2. ], dtype=float32)

Is this a good way? Can I avoid the copy (if it is indeed copying) just for type conversion?

Saullo G. P. Castro · Accepted Answer · 2020-03-27T21:06:24.683

4

You only have to reshape your final result to obtain what you want:

x = x.reshape(-1, 2)

You could also run arange passing the dtype:

x = np.repeat(0.5*np.arange(N, dtype=np.float32), 2).reshape(-1, 2)
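As a quick sanity check of the one-liner above, here is a minimal sketch (N = 5 is taken from the question; nothing else is assumed) that confirms the resulting dtype and shape:

```python
import numpy as np

N = 5  # same value as in the question

# Create the float32 values first, then repeat each one twice and
# fold the flat result into rows of two.
x = np.repeat(0.5 * np.arange(N, dtype=np.float32), 2).reshape(-1, 2)

print(x.dtype)  # float32
print(x.shape)  # (5, 2)
```

Because the values are already float32 when they come out of arange, no separate conversion step is needed afterwards.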

You can easily cast the array as another type using the astype method, which accepts an argument copy:

x.astype(np.int8, copy=False)

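But, as explained in the documentation, with copy=False numpy returns the input array itself only when the dtype, memory order, and subclass requirements are already satisfied. If those requirements are not satisfied, a copy is returned, so a real type conversion (for example float64 to float32) always allocates new memory.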
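A small sketch of how copy=False behaves in practice may help here. The array names are only illustrative, and the int8 target mirrors the line above:

```python
import numpy as np

a = np.arange(5, dtype=np.float32)

# The dtype already matches, so no conversion is required.
same = a.astype(np.float32, copy=False)

# The dtype differs, so the data has to be converted into a new buffer.
converted = a.astype(np.int8, copy=False)

print(same is a)                       # True
print(np.shares_memory(a, converted))  # False
```

In other words, copy=False avoids a redundant copy when nothing has to change, but it cannot avoid the allocation needed for an actual conversion.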

You can check if a given array is a copy or a view from another by checking the OWNDATA attribute, accessible through the flags property of the ndarray.
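The check described above might look like the following sketch. It also uses the base attribute and np.shares_memory, which are not mentioned in this answer but are part of NumPy; the variable names are hypothetical:

```python
import numpy as np

flat = np.repeat(0.5 * np.arange(5, dtype=np.float32), 2)
pairs = flat.reshape(-1, 2)       # reshaping a contiguous array gives a view
copied = flat.astype(np.float64)  # a real conversion allocates new memory

print(pairs.flags.owndata)             # False
print(pairs.base is flat)              # True
print(np.shares_memory(flat, pairs))   # True
print(np.shares_memory(flat, copied))  # False
```

Note that OWNDATA being False only says that the array borrows its memory from some other object, not which one, so np.shares_memory is the more direct test when comparing two specific arrays.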

EDIT: for more on checking whether a given array is a copy, see the question "Is there a way to check if numpy arrays share the same data?"

edited Mar 27 '20 at 21:06

answered Nov 06 '13 at 13:39


1

checking for view/copy: http://stackoverflow.com/questions/11286864/is-there-a-way-to-check-if-numpy-arrays-share-the-same-data – ev-br Nov 06 '13 at 13:50
2

You could also change `0.5*np.arange(N, dtype=np.float32)` to `np.arange(0, 0.5*N, 0.5, dtype=np.float32)` to avoid a temporary array. – Warren Weckesser Nov 06 '13 at 14:24
2

Using arange with a non-exact data type is generally not good (it is just prone to floating point inaccuracies). It is probably better to multiply by 0.5 afterwards. There are probably very few cases where this kind of thing is speed relevant anyway. – seberg Nov 06 '13 at 14:28
1

@seberg: That's generally good advice. In most cases I use `np.linspace` for exactly that reason. In this case, however, `N` is an integer, so `0.5*N` is exact (unless `N` is huge), and `arange` is fine. (I also agree that avoiding the temporary is a micro-optimization and not too important.) – Warren Weckesser Nov 06 '13 at 14:33
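To make the comment discussion concrete, here is a minimal sketch comparing the three ways of generating the values (scaling an integer arange, using a fractional step, and np.linspace). N = 5 comes from the question; the variable names are illustrative only:

```python
import numpy as np

N = 5

# Integer steps first, then scale by 0.5.
scaled = 0.5 * np.arange(N, dtype=np.float32)

# Fractional step passed to arange directly (no temporary array).
stepped = np.arange(0, 0.5 * N, 0.5, dtype=np.float32)

# Endpoint and number of points given explicitly.
spaced = np.linspace(0, 0.5 * (N - 1), N, dtype=np.float32)

print(np.array_equal(scaled, stepped))  # True
print(np.array_equal(scaled, spaced))   # True
```

All three agree here because 0.5 is exactly representable in binary floating point. With a step such as 0.1 that is not exactly representable, the number of elements arange returns can be affected by rounding, which is the concern raised above.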

Lee · Answer 2 · 2013-11-06T16:19:59.753

An alternative:

np.array([0.5*np.arange(N, dtype=np.float32)]*2)

Gives:

array([[ 0. ,  0.5,  1. ,  1.5,  2. ],
       [ 0. ,  0.5,  1. ,  1.5,  2. ]], dtype=float32)

You might want to rotate it:

np.rot90(np.array([0.5*np.arange(N, dtype=np.float32)]*2),3)

Giving:

array([[ 0. ,  0. ],
       [ 0.5,  0.5],
       [ 1. ,  1. ],
       [ 1.5,  1.5],
       [ 2. ,  2. ]], dtype=float32)

Note, this is slower than @Saullo_Castro's answer:

np.rot90(np.array([0.5*np.arange(N, dtype=np.float32)]*2),3)

10000 loops, best of 3: 24.3 us per loop

np.repeat(0.5*np.arange(N, dtype=np.float32), 2).reshape(-1, 2)

10000 loops, best of 3: 9.23 us per loop

np.array(np.repeat(0.5*np.arange(N), 2), np.float32).reshape(-1, 2)

10000 loops, best of 3: 10.4 us per loop

(using %%timeit on ipython)
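For anyone who wants to reproduce the comparison outside IPython, a rough sketch using the standard library timeit module could look like this. The function names are hypothetical, and the absolute numbers will differ by machine and NumPy version:

```python
import timeit
import numpy as np

N = 5

def with_rot90():
    # Stack two copies of the row, then rotate to get N rows of two.
    return np.rot90(np.array([0.5 * np.arange(N, dtype=np.float32)] * 2), 3)

def with_repeat():
    # Repeat each value twice, then fold into rows of two.
    return np.repeat(0.5 * np.arange(N, dtype=np.float32), 2).reshape(-1, 2)

# Both approaches should produce the same (N, 2) float32 array.
assert np.array_equal(with_rot90(), with_repeat())

loops = 10000
for func in (with_rot90, with_repeat):
    seconds = timeit.timeit(func, number=loops)
    print(f"{func.__name__}: {seconds / loops * 1e6:.2f} us per loop")
```

For an array this small the difference is a matter of microseconds, so readability is probably the better criterion for choosing between the two.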
