
I made myself an example:

import numpy as np
arrays = [np.random.rand(3,4) for _ in range(10)]
arr1 = np.array(arrays)
print(arr1.shape)
arr2 = np.stack(arrays, axis=0)
print(arr2.shape)

I found that arr1 and arr2 have the same shape and content. So are these two methods (np.array() and np.stack(..., axis=0)) equivalent?

captainst

2 Answers


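In general the two give the same result, but there are edge cases. For example, passing ragged lists (sublists of different lengths) to np.array gives an object-dtype array of lists, whereas np.stack raises an exception. This is the behavior of the NumPy version used in the session here; NumPy 1.24 and later also raise a ValueError from np.array unless dtype=object is passed.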

In [119]: np.stack([[1,2], [4,5,6]], axis=0)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-119-2ba66e6300d6> in <module>()
----> 1 np.stack([[1,2], [4,5,6]], axis=0)

<__array_function__ internals> in stack(*args, **kwargs)

    423     shapes = {arr.shape for arr in arrays}
    424     if len(shapes) != 1:
--> 425         raise ValueError('all input arrays must have the same shape')
    426
    427     result_ndim = arrays[0].ndim + 1

ValueError: all input arrays must have the same shape

In [120]: np.array([[1,2], [4,5,6]])
Out[120]: array([list([1, 2]), list([4, 5, 6])], dtype=object)
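The difference can also be checked in a script. This is a minimal sketch, not part of the original session: it passes dtype=object explicitly so that the np.array call works the same way on older and newer NumPy releases, and the variable names are only illustrative.

```python
import numpy as np

ragged = [[1, 2], [4, 5, 6]]  # sublists of different lengths

# np.stack converts each sublist to an array and then insists
# that all of them have the same shape.
try:
    np.stack(ragged, axis=0)
    stack_failed = False
except ValueError as exc:
    stack_failed = True
    print("np.stack raised:", exc)

# np.array can still build a 1-d array whose elements are the lists,
# as long as the object dtype is requested explicitly.
obj_arr = np.array(ragged, dtype=object)
print(obj_arr.shape, obj_arr.dtype)
```

The object array has one element per sublist, so it is 1-d with shape (2,), not a 2-d numeric array.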
Randy

np.array is compiled, and something of a black box. It works predictably for lists of lists of numbers, provided it can make a multidimensional array. The fallback is an object dtype array (an array of lists), or in some cases an error (see "Creating numpy array problem (could not broadcast input array from shape (2) into shape (1))").

np.stack is a cover function for np.concatenate. It is written in Python.

First it makes all inputs arrays:

arrays = [asanyarray(arr) for arr in arrays]

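asanyarray is essentially np.array without the forced copy (and with subclasses passed through): an input that is already an array is returned as-is.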
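A quick way to see the no-copy behavior is to compare object identity. This is only an illustrative sketch; the variable names are made up for the example.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# asanyarray hands back the very same object when given an ndarray
same_object = np.asanyarray(a) is a

# np.array copies by default, so the result is a new, independent array
default_copy = np.array(a)
is_copy = (default_copy is not a) and not np.shares_memory(default_copy, a)

# a list has to be converted, so a new array is created either way
from_list = np.asanyarray([1, 2, 3])

print(same_object, is_copy, type(from_list).__name__)
```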

Then it checks the shapes, which must all be the same:

shapes = {arr.shape for arr in arrays}

Then it adds a dimension to each (details omitted):

expanded_arrays = [arr[sl] for arr in arrays]

and concatenates them:

concatenate(expanded_arrays, axis=axis, out=out)
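Putting the steps above together gives something like the following. This is a simplified sketch, not NumPy's actual source: the function name stack_sketch is hypothetical, the axis handling is my own guess at the omitted details, and the out argument is left out.

```python
import numpy as np

def stack_sketch(arrays, axis=0):
    """Simplified re-creation of the steps np.stack is described as taking."""
    # 1. make every input an array
    arrays = [np.asanyarray(arr) for arr in arrays]
    if not arrays:
        raise ValueError("need at least one array to stack")

    # 2. all inputs must share one shape
    shapes = {arr.shape for arr in arrays}
    if len(shapes) != 1:
        raise ValueError("all input arrays must have the same shape")

    # 3. add a new length-1 dimension to each input at position `axis`
    result_ndim = arrays[0].ndim + 1
    if axis < 0:
        axis += result_ndim  # allow negative axes such as -1
    if not 0 <= axis < result_ndim:
        raise ValueError("axis is out of bounds for the result")
    sl = (slice(None),) * axis + (np.newaxis,)
    expanded_arrays = [arr[sl] for arr in arrays]

    # 4. join along that new dimension
    return np.concatenate(expanded_arrays, axis=axis)

arrays = [np.random.rand(3, 4) for _ in range(10)]
print(stack_sketch(arrays, axis=0).shape)
print(stack_sketch(arrays, axis=-1).shape)
```

For these inputs the result should match np.stack for every valid axis, which makes it easy to check the sketch against the real function.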

Starting from a list of arrays, np.stack does about the same amount of work as np.array (timings are similar for large lists). As the other answer noted, it won't produce a ragged array. Where stack is really nice is when we want to join along an axis other than 0; with np.array we'd have to transpose the result afterwards.
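To illustrate the axis point, here is a small comparison using the same kind of input as in the question. It is a sketch for illustration; the variable names are not from the original posts.

```python
import numpy as np

arrays = [np.random.rand(3, 4) for _ in range(10)]

# np.stack puts the new dimension wherever axis says
middle = np.stack(arrays, axis=1)   # new axis in the middle
last = np.stack(arrays, axis=-1)    # new axis at the end
print(middle.shape, last.shape)

# np.array always puts the new dimension first, so the axes
# have to be rearranged afterwards to get the same layout
base = np.array(arrays)             # shape (10, 3, 4)
middle_alt = base.transpose(1, 0, 2)
last_alt = np.moveaxis(base, 0, -1)

print(np.array_equal(middle, middle_alt), np.array_equal(last, last_alt))
```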

hpaulj