
The following code worked in numpy 1.7.1, but it raises a ValueError in the current version. I want to know the root cause.

    import numpy as np
    x = [1,2,3,4]
    y = [[1, 2],[2, 3], [1, 2],[2, 3]]

    a = np.array([x, np.array(y)])

This is the output I get in numpy 1.7.1:

    >>> a
    array([[1, 2, 3, 4],
           [array([1, 2]), array([2, 3]), array([1, 2]), array([2, 3])]], dtype=object)

But the same code raises an error in version 1.9.2:

    ----> 5 a = np.array([x, np.array(y)])

    ValueError: could not broadcast input array from shape (4,2) into shape (4)

I have found one possible solution to this, but I don't know whether it is the best approach:

    b = np.empty(2, dtype=object)
    b[:] = [x, np.array(y)]

    >>> b
    array([[1, 2, 3, 4],
           array([[1, 2],
           [2, 3],
           [1, 2],
           [2, 3]])], dtype=object)

Please suggest a solution to achieve the desired output. Thanks

Manish
  • What was the result when it "worked" in `1.7.1`? What are you expecting it to do? – tmdavison Oct 02 '15 at 14:47
  • Is the `a = ` line supposed to be `a = np.array(x) + np.array(y)`? Otherwise I get `ValueError: setting an array element with a sequence.` – Rory Yorke Oct 02 '15 at 14:53
  • you should use `np.dstack` or `np.hstack` for this task – soupault Oct 02 '15 at 17:55
  • @tom The first element of the array may be names and the second can be values such as coordinates. I had used a numpy array earlier and it was returning a numpy array. – Manish Oct 03 '15 at 05:04
  • You should have told us right away that `x` may be names. That means the result has to be object dtype. Your example sent us on the wrong path trying to stack the lists. – hpaulj Oct 03 '15 at 16:15
  • @tom I have added the results for version 1.7.1. – Manish Oct 06 '15 at 06:28
  • Your final solution is fine if that's the matrix you want. The behavior has changed since 1.7.1, as many here have noted, so you'll have to change your code. – kabanus Oct 22 '16 at 08:49

1 Answer


What exactly are you trying to produce? I don't have a 1.7 version to test your example.

`np.array(x)` produces a (4,) array; `np.array(y)` produces a (4,2) array.

As noted in a comment, in 1.8.1 `np.array([x, np.array(y)])` produces

    ValueError: setting an array element with a sequence.

I can make an object dtype array, consisting of the list and the array:

    In [90]: np.array([x, np.array(y)],dtype=object)
    Out[90]: 
    array([[1, 2, 3, 4],
           [array([1, 2]), array([2, 3]), array([1, 2]), array([2, 3])]], dtype=object)

I can also concatenate the 2 arrays to make a (4,3) array (`x` is the first column):

    In [92]: np.concatenate([np.array(x)[:,None],np.array(y)],axis=1)
    Out[92]: 
    array([[1, 1, 2],
           [2, 2, 3],
           [3, 1, 2],
           [4, 2, 3]])

np.column_stack([x,y]) does the same thing.
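
As a quick check of that equivalence, here is a minimal sketch using the `x` and `y` from the question. The names `stacked` and `manual` are only illustrative.

```python
import numpy as np

x = [1, 2, 3, 4]
y = [[1, 2], [2, 3], [1, 2], [2, 3]]

# column_stack treats the 1-D x as a (4, 1) column and joins it with the (4, 2) y.
stacked = np.column_stack([x, y])

# The explicit version: add a trailing axis to x, then concatenate along columns.
manual = np.concatenate([np.array(x)[:, None], np.array(y)], axis=1)

print(stacked.shape)                    # shape of the combined int array
print(np.array_equal(stacked, manual))  # both routes should agree
```

This only makes sense when `x` and `y` share a numeric dtype; if `x` holds names, stacking would force everything into a single string dtype.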


Curiously, in a dev 1.9 (I don't have production 1.9.2 installed) it works (sort of):

    In [9]: np.__version__
    Out[9]: '1.9.0.dev-Unknown'

    In [10]: np.array([x,np.array(y)])
    Out[10]: 
    array([[        1,         2,         3,         4],
           [174420780, 175084380,  16777603,         0]])
    In [11]: np.array([x,np.array(y)],dtype=object)
    Out[11]: 
    array([[1, 2, 3, 4],
           [None, None, None, None]], dtype=object)
    In [16]: np.array([x,y],dtype=object)
    Out[16]: 
    array([[1, 2, 3, 4],
           [[1, 2], [2, 3], [1, 2], [2, 3]]], dtype=object)

So it looks like there is some sort of development going on.

In any case, making a new array from this list and a 2d array is ambiguous. Use `column_stack` (assuming you want a 2d int array).


numpy 1.9.0 release notes:

The performance of converting lists containing arrays to arrays using np.array has been improved. It is now equivalent in speed to np.vstack(list).

With `y` transposed, `vstack` works:

    In [125]: np.vstack([[1,2,3,4],np.array([[1,2],[2,3],[1,2],[2,3]]).T])
    Out[125]: 
    array([[1, 2, 3, 4],
           [1, 2, 1, 2],
           [2, 3, 2, 3]])

If 1.7.1 worked, and `x` was string names, not just ints as in your example, then it was probably producing an object array.
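
If a 2-element object array (names plus coordinates) is really what you want, one way to avoid depending on how `np.array` interprets nested input is to preallocate and fill element by element, a variant of the preallocation approach in the question. This is only a sketch: `make_object_array` is a hypothetical helper name, and the `names` values are made up for illustration.

```python
import numpy as np

def make_object_array(items):
    """Hypothetical helper: store each item unchanged in a 1-D object array."""
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        # Integer indexing stores the object itself; no broadcasting is attempted.
        out[i] = item
    return out

names = ["p1", "p2", "p3", "p4"]  # assumed example names
coords = np.array([[1, 2], [2, 3], [1, 2], [2, 3]])

b = make_object_array([names, coords])
print(b.shape, b.dtype)  # a 1-D array of two Python objects
print(b[1].shape)        # the coordinate array is kept intact
```

Because each slot is assigned individually, the result does not depend on whether the items happen to have matching lengths.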

hpaulj
  • I was also getting `ValueError: setting an array element with a sequence.` and resolved it by setting the dtype to object. I have a project that was developed on numpy 1.7.1, and in many places we return results in the above format, where the first element may be the names of the points and the second can be the coordinates of each of those points. Earlier we were getting a numpy array, but now we get ValueErrors. – Manish Oct 03 '15 at 04:58
  • One possible solution is not to use a numpy array at all. We could use a list, but then we would have to change the implementation in many places where we have treated this return value as an array. I want to know of one fix that can be applied everywhere. Thanks – Manish Oct 03 '15 at 04:58
  • I found a reference to a numpy change that probably produced your error - `np.array` has been rewritten to treat cases like yours as `vstack`. – hpaulj Oct 03 '15 at 16:13
  • I have edited my question to include the result obtained with the previous version, and I have found one possible solution. Can you please review it? Thanks – Manish Oct 06 '15 at 06:27
  • http://stackoverflow.com/questions/32930449/preallocating-ndarrays - I did some time tests on creating a object array of lists. I didn't test your approach (assign the whole list to the predefined array), but I suspect it is as fast. And in this case it may be the most robust. – hpaulj Oct 06 '15 at 07:13