
I am trying to define += for a class, and I want it to modify a numpy.ndarray in place (by reference).

I have the following two objects which are numpy.ndarray, a1 and a2:

>>> a1
array([array([ 0.04168576,  0.13111852,  0.91896599]),
       array([ 0.81658056,  0.50832376,  1.59519731]),
       array([ 0.20646088,  0.13335052,  1.19661452])], dtype=object)
>>> a2
array([array([ 0.25765112,  0.54137219,  0.26067181]),
       array([ 0.57738128,  0.45649817,  1.6323892 ]),
       array([ 0.2328858 ,  0.4922151 ,  1.00012122])], dtype=object)

In my class I want to implement self.__iadd__ (the method behind +=), and I want some way to write:

a1.append(a2)

to give the equivalent of the following:

>>> np.hstack((a1,a2))
array([array([ 0.04168576,  0.13111852,  0.91896599]),
       array([ 0.81658056,  0.50832376,  1.59519731]),
       array([ 0.20646088,  0.13335052,  1.19661452]),
       array([ 0.25765112,  0.54137219,  0.26067181]),
       array([ 0.57738128,  0.45649817,  1.6323892 ]),
       array([ 0.2328858 ,  0.4922151 ,  1.00012122])], dtype=object)

and to have a1 change by reference.
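To illustrate what "change by reference" could look like, here is a minimal sketch under some assumptions. The class name `GrowableArray`, its `values` property, and the capacity-doubling strategy are all hypothetical and not part of numpy. The idea is to keep an over-allocated object buffer and have `__iadd__` return `self`, so that every name bound to the object sees the appended items.

```python
import numpy as np


class GrowableArray:
    """Hypothetical wrapper (not a numpy class): a 1-D object buffer that grows in place."""

    def __init__(self, items=()):
        items = list(items)
        self._size = len(items)
        # Over-allocate so that most appends need no reallocation.
        self._buf = np.empty(max(2 * self._size, 4), dtype=object)
        for i, item in enumerate(items):
            self._buf[i] = item

    def __iadd__(self, other):
        other = list(other)
        needed = self._size + len(other)
        if needed > len(self._buf):
            # Grow capacity (at least doubling); only this step copies the
            # existing references, and it happens rarely.
            new_buf = np.empty(max(needed, 2 * len(self._buf)), dtype=object)
            new_buf[:self._size] = self._buf[:self._size]
            self._buf = new_buf
        for i, item in enumerate(other, start=self._size):
            self._buf[i] = item
        self._size = needed
        # Returning self keeps the caller's name bound to the same object.
        return self

    @property
    def values(self):
        # A view onto the filled part of the buffer, not a copy.
        return self._buf[:self._size]


g = GrowableArray([np.array([1.0, 2.0]), np.array([3.0])])
alias = g
g += [np.array([4.0]), np.array([5.0]), np.array([6.0])]
print(alias is g)      # True
print(len(g.values))   # 5
```

Note that the inner arrays are stored by reference, so only the outer buffer is ever reallocated.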

I want to avoid writing

a1 = np.hstack((a1,a2))

because the actual arrays are very large. This is for a Monte Carlo application, so I need to keep performance as fast as I can.

Currently when I try to implement this I get the following error:

AttributeError: 'numpy.ndarray' object has no attribute 'append'
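For reference, a small reproduction of this behaviour using simple float arrays as stand-ins for the real data: `numpy.ndarray` has no `append` method, and the module-level `np.append` returns a new array instead of modifying its argument.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0])

# ndarray has no append method, hence the AttributeError.
print(hasattr(a, "append"))    # False

# np.append builds and returns a new array; `a` itself is unchanged.
c = np.append(a, b)
print(c is a)                  # False
print(np.shares_memory(a, c))  # False
print(a.shape, c.shape)        # (3,) (5,)
```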

Other related Questions:

The following question is useful for explaining how functions can pass by reference, but doesn't address my problem.

The answer provided by @SvenMarnach mentions this exact issue, but provides no solution.

Many Thanks,

oliversm
  • 2
    Is there any particular reason why you're using a numpy array here rather than a plain Python `list`? Object arrays are essentially glorified lists - by using the `np.object` dtype you are already giving up most of the performance benefits of using numpy arrays. – ali_m May 08 '16 at 12:37
  • Yes @ali_m. I have been using numpy arrays because they give a significant performance increase when I am generating my long lists of random numbers (crucial for the Monte Carlo simulation). Additionally there are many row and column operations that I want to do, many of which have better performance when using `np.array` rather than `list`. Originally I used lists of lists, but this gave a worse performance by a factor of 20 when I timed my operations. – oliversm May 08 '16 at 13:15
  • 2
    That may be true for the inner arrays which contain float scalars, but I *highly* doubt you would see much performance difference if you switched the outer container from an `np.object` array to a `list`. See [here](http://stackoverflow.com/a/26768305/1461210), [here](http://stackoverflow.com/a/26768083/1461210) and [here](http://stackoverflow.com/a/28284961/1461210). – ali_m May 08 '16 at 13:36

0 Answers