28

Is there a simple way in NumPy to flatten an object-dtype array?

I know the .flatten() method flattens non-object arrays constructed from same-size arrays:

In [1]: a = np.array([[1],[2],[3]])

In [2]: a.flatten()
Out[2]: array([1, 2, 3])

However, I can't get a dtype=object array flattened:

In [4]: b
Out[4]: array([[1], [2, 3], [3]], dtype=object)

In [5]: b.flatten()
Out[5]: array([[1], [2, 3], [3]], dtype=object)

Thanks.

Gökhan Sever

2 Answers

61

If you want [1, 2, 3, 3], try this:

np.hstack(b)
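
As a minimal sketch of the call above: the array `b` is rebuilt here from the output shown in the question (its original construction is not given, so this is an assumption), and `np.hstack` concatenates the nested lists into one 1-D array.

```python
import numpy as np

# Assumed reconstruction of `b`: a 1-D object array holding three lists
b = np.array([[1], [2, 3], [3]], dtype=object)

# b has shape (3,), so b.flatten() has nothing to flatten;
# np.hstack iterates over the three lists and concatenates them
flat = np.hstack(b)
print(flat)
```
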
nye17
  • 3
    Nice. I was about to post this (which does the exact same thing): [x for bb in b for x in bb] – Oriol Nieto Jul 06 '12 at 18:07
  • 1
    @urinieto actually the list comprehension-based method you posted is faster, although it's kinda nicer to have it settled in numpy's way. – nye17 Jul 06 '12 at 18:16
  • How about for an array of 20k elements? – Gökhan Sever Jul 06 '12 at 18:24
  • @nye17 -- But at the end of the day, it's often nice to have a numpy array instead of a list. – mgilson Jul 06 '12 at 18:25
  • @GökhanSever 20k wouldn't be a problem for modern computers. If you are really limited by speed in this kind of computation, I would say that you shouldn't have had an inhomogeneous data array to begin with. – nye17 Jul 06 '12 at 18:32
  • @mgilson true, although `np.array([x for bb in b for x in bb])` will do the job. – nye17 Jul 06 '12 at 18:33
  • @nye17, for me, readability beats speed, since plot creation usually takes most of the time. np.hstack(b) is 12 characters total, whereas the latter is more than twice that. As for the inhomogeneity argument, that's the nature of the data I have. hstack helps me bring them together so that I can perform bulk statistics between pairs of different calculations. – Gökhan Sever Jul 06 '12 at 20:04
  • @GökhanSever great, then `hstack` is the way to go! – nye17 Jul 06 '12 at 20:08
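
The list-comprehension alternative discussed in the comments can be sketched as follows. The array `b` is an assumed reconstruction of the one in the question, and the variable names are only illustrative.

```python
import numpy as np

# Assumed reconstruction of the question's ragged object array
b = np.array([[1], [2, 3], [3]], dtype=object)

# Pure-Python flattening: walk each nested list and collect its items
as_list = [x for bb in b for x in bb]

# Wrap the result to get a NumPy array back instead of a list
as_array = np.array(as_list)

# Both routes should agree with np.hstack(b)
same = np.array_equal(as_array, np.hstack(b))
print(as_list, same)
```

Which one is faster depends on the data and the NumPy version, so it is worth timing both on your own arrays before choosing.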
0

If your array does not contain more than one nested array, np.hstack(arr) may not work!

Workaround:

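import numpy as np

arr = np.array([[0]])
# arr.any() is True only if at least one element is truthy
if arr.any():
    arr = np.hstack(arr)
else:
    # otherwise fall back to a plain flatten
    arr = arr.flatten()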
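
A possible variant of this guard is sketched below. It assumes the problematic input is an empty array (the answer does not say which input fails), so it checks `arr.size` instead of `arr.any()`, because `any()` tests whether elements are truthy rather than how many there are. The helper name `flatten_ragged` is hypothetical.

```python
import numpy as np

def flatten_ragged(arr):
    """Hypothetical helper: flatten an array of sequences into a 1-D array."""
    arr = np.asarray(arr)
    if arr.size == 0:
        # Nothing to concatenate, so skip hstack and return an empty 1-D array
        return arr.reshape(-1)
    return np.hstack(arr)

single = flatten_ragged(np.array([[0]]))
empty = flatten_ragged(np.array([], dtype=object))
ragged = flatten_ragged(np.array([[1], [2, 3], [3]], dtype=object))
print(single, empty, ragged)
```
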
jedrix