446

What is the difference between NumPy's np.array and np.asarray? When should I use one rather than the other? They seem to generate identical output.

Mateen Ulhaq
Benjamin Hodgson

8 Answers

316

The definition of asarray is:

def asarray(a, dtype=None, order=None):
    return array(a, dtype, copy=False, order=order)

So it is like array, except it has fewer options, and copy=False. array has copy=True by default.

The main difference is that array (by default) will make a copy of the object, while asarray will not unless necessary.
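
As a minimal sketch of what "unless necessary" means (the variable names are illustrative only, and NumPy is assumed to be importable as `np`), the following checks object identity and memory sharing. It deliberately avoids passing `copy=False` to `np.array`, because NumPy 2.0 and later raise a `ValueError` there when a copy cannot be avoided.

```python
import numpy as np

existing = np.array([1.0, 2.0, 3.0])

# np.array copies by default: a new object with its own data buffer.
copied = np.array(existing)

# np.asarray hands back the very same object when the input already qualifies.
same = np.asarray(existing)

# A list is not an array, so asarray has to build a new one.
data = [1.0, 2.0, 3.0]
from_list = np.asarray(data)

print(copied is existing)                   # False
print(np.shares_memory(copied, existing))   # False
print(same is existing)                     # True
print(isinstance(from_list, np.ndarray))    # True
```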

unutbu
  • 26
    So when should we use each? If creating an array from scratch, which is better, `array([1, 2, 3])` or `asarray([1, 2, 3])`? – endolith Jun 02 '14 at 23:25
  • 22
    @endolith: `[1, 2, 3]` is a Python list, so a copy of the data must be made to create the `ndarray`. So use `np.array` directly instead of `np.asarray` which would send the `copy=False` parameter to `np.array`. The `copy=False` is ignored if a copy must be made as it would be in this case. If you benchmark the two using `%timeit` in IPython you'll see a difference for small lists, but it hardly matters which you use for large lists. – unutbu Jun 02 '14 at 23:43
  • 4
    That makes sense per the method names too: "asarray": Treat this as an array (inplace), i.e., you're sort of just changing your view on this list/array. "array": Actually convert this to a new array. – Jesper - jtk.eth May 04 '16 at 18:41
  • 2
    how about `np.asanyarray`? – wsdzbm Jul 26 '16 at 16:29
  • 4
    @Lee: `asarray` always returns an `ndarray`. `asanyarray` will return a subclass of `ndarray` if that is what was passed to it. For example, an `np.matrix` is a subclass of `ndarray`. So `np.asanyarray(np.matrix(...))` returns the same matrix, whereas `np.asarray(np.matrix(...))` converts the matrix to an `ndarray`. – unutbu Jul 26 '16 at 16:34
  • In your example, `np.asanyarray(np.matrix(...))` returns the original matrix (or view), whereas `np.asarray(np.matrix(...))` makes a new copy as the `ndarray` returned? – wsdzbm Jul 26 '16 at 16:40
  • @Lee: Yes, `np.asarray(np.matrix(...))` returns a copy. You can easily test this -- Let `x = np.matrix([1])` and `y = np.asarray(np.matrix([1]))`. Set `y[0,0] = 100`. If `y` is a view of `x` then assignment to `y` would modify `x`. But you'll find that `x` remains the same. So you can conclude that `y` is a copy of `x`. – unutbu Jul 26 '16 at 17:34
  • When `arr1 = np.asarray([1,2])` `arr2 = np.array(arr1)` `arr3 = np.asarray(arr1)` , we see the difference as follows: `arr1 is arr2` `>> False` `arr1 is arr3` `>> True` – Change-the-world Nov 23 '17 at 04:33
  • @unutbu in your response to @ endolith's comment, you said "*in IPython you'll see a difference for small lists, but it hardly matters which you use for large lists.*" Shouldn't it be other way around? I mean it should be "*in IPython you'll see a difference for **larger** lists, but it hardly matters which you use for **smaller** lists.*" right? – Milan Mar 16 '21 at 00:40
224

Since other questions that ask about asanyarray or other array creation routines are being redirected to this one, it's probably worth having a brief summary of what each of them does.

The differences are mainly about when to return the input unchanged, as opposed to making a new array as a copy.

array offers a wide variety of options (most of the other functions are thin wrappers around it), including flags to determine when to copy. A full explanation would take just as long as the docs (see Array Creation), but briefly, here are some examples:

Assume a is an ndarray, and m is a matrix, and they both have a dtype of float32:

  • np.array(a) and np.array(m) will copy both, because that's the default behavior.
  • np.array(a, copy=False) and np.array(m, copy=False) will return a unchanged but not m: because m is a matrix rather than a plain ndarray, it is converted to a new base-class ndarray.
  • np.array(a, copy=False, subok=True) and np.array(m, copy=False, subok=True) will copy neither, because m is a matrix, which is a subclass of ndarray.
  • np.array(a, dtype=int, copy=False, subok=True) and np.array(m, dtype=int, copy=False, subok=True) will copy both, because the dtype is not compatible.
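
The cases above can be checked roughly as follows. This is only a sketch: it uses `np.asarray` and `np.asanyarray` as stand-ins for the explicit `copy=False` and `copy=False, subok=True` calls, and `a` and `m` are small made-up inputs matching the description.

```python
import numpy as np

a = np.ones((2, 2), dtype=np.float32)
m = np.matrix(np.ones((2, 2), dtype=np.float32))

# Default np.array: new objects with new data for both inputs.
print(np.shares_memory(np.array(a), a))    # False
print(np.shares_memory(np.array(m), m))    # False

# asarray: `a` comes back as-is; `m` becomes a different, base-class object.
print(np.asarray(a) is a)                  # True
print(type(np.asarray(m)) is np.ndarray)   # True
print(np.asarray(m) is m)                  # False

# asanyarray: subclasses pass through, so both come back as-is.
print(np.asanyarray(a) is a)               # True
print(np.asanyarray(m) is m)               # True

# Requesting an incompatible dtype forces a conversion for both.
print(np.asanyarray(a, dtype=int) is a)    # False
print(np.asanyarray(m, dtype=int) is m)    # False
```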

Most of the other functions are thin wrappers around array that control when copying happens:

  • asarray: The input will be returned uncopied iff it's a compatible ndarray (copy=False).
  • asanyarray: The input will be returned uncopied iff it's a compatible ndarray or subclass like matrix (copy=False, subok=True).
  • ascontiguousarray: The input will be returned uncopied iff it's a compatible ndarray in contiguous C order (copy=False, order='C').
  • asfortranarray: The input will be returned uncopied iff it's a compatible ndarray in contiguous Fortran order (copy=False, order='F').
  • require: The input will be returned uncopied iff it's compatible with the specified requirements string.
  • copy: The input is always copied.
  • fromiter: The input is treated as an iterable (so, e.g., you can construct an array from an iterator's elements, instead of an object array with the iterator); always copied.
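
A short sketch of a few of these wrappers, assuming small made-up inputs; the array names are illustrative only. It shows the "returned uncopied" case and the "new array" case for the memory-layout functions, plus `require` and `fromiter`.

```python
import numpy as np

c_arr = np.arange(6, dtype=np.float64).reshape(2, 3)   # C-contiguous
f_view = c_arr.T                                       # Fortran-contiguous view

# Already in the requested layout: returned uncopied.
print(np.ascontiguousarray(c_arr) is c_arr)    # True
print(np.asfortranarray(f_view) is f_view)     # True

# Wrong layout: a new array in the requested order is made.
c_from_f = np.ascontiguousarray(f_view)
print(c_from_f.flags['C_CONTIGUOUS'])          # True
print(np.shares_memory(c_from_f, f_view))      # False

# require: the requirements are already met here, so no new data is allocated.
req = np.require(c_arr, dtype=np.float64, requirements=['C'])
print(np.shares_memory(req, c_arr))            # True

# fromiter consumes an iterator element by element; dtype must be given.
squares = np.fromiter((i * i for i in range(4)), dtype=np.int64)
print(squares.tolist())                        # [0, 1, 4, 9]
```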

There are also convenience functions, like asarray_chkfinite (same copying rules as asarray, but raises ValueError if there are any nan or inf values), and constructors for subclasses like matrix or for special cases like record arrays, and of course the actual ndarray constructor (which lets you create an array directly out of strides over a buffer).
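
For instance, `asarray_chkfinite` might be used as an input guard. The sketch below uses made-up inputs (`clean` and `dirty` are hypothetical names) and shows both the pass-through case and the rejection case.

```python
import numpy as np

clean = np.array([1.0, 2.0, 3.0])
dirty = [1.0, float("nan"), 3.0]

# Finite input follows the same copying rules as asarray.
checked = np.asarray_chkfinite(clean)
print(checked is clean)

# Input containing nan (or inf) is rejected with ValueError.
try:
    np.asarray_chkfinite(dirty)
    raised = False
except ValueError as exc:
    raised = True
    print("rejected:", exc)
```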

Patol75
abarnert
  • Just to correct, Numpy's ndarray now has float64 as default dtype. – Mohith7548 Nov 28 '20 at 17:59
  • 1
    In the first section, in the 4th point, you actually meant --- *"`np.array(a, dtype=int, copy=False, subok=True)` **and `np.array(m, dtype=int, copy=False, subok=True)`** will copy both, because the `dtype` is not compatible."* --- right? Thanks in advance! – Milan Mar 16 '21 at 00:24
156

The difference can be demonstrated by this example:

  1. Generate a matrix.

     >>> A = numpy.matrix(numpy.ones((3, 3)))
     >>> A
     matrix([[ 1.,  1.,  1.],
             [ 1.,  1.,  1.],
             [ 1.,  1.,  1.]])
    
  2. Use numpy.array to modify A. Doesn't work because you are modifying a copy.

     >>> numpy.array(A)[2] = 2
     >>> A
     matrix([[ 1.,  1.,  1.],
             [ 1.,  1.,  1.],
             [ 1.,  1.,  1.]])
    
  3. Use numpy.asarray to modify A. This works because the array returned by asarray shares A's data instead of copying it.

     >>> numpy.asarray(A)[2] = 2
     >>> A
     matrix([[ 1.,  1.,  1.],
             [ 1.,  1.,  1.],
             [ 2.,  2.,  2.]])
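
The same behaviour can be checked without relying on printed output. This is a sketch using `np.shares_memory`, which is not part of the original answer; `as_copy` and `as_view` are illustrative names.

```python
import numpy as np

A = np.matrix(np.ones((3, 3)))

as_copy = np.array(A)      # independent data
as_view = np.asarray(A)    # base-class ndarray over A's data

print(type(as_view) is np.ndarray)     # True
print(np.shares_memory(as_copy, A))    # False
print(np.shares_memory(as_view, A))    # True

as_view[2] = 2             # writes through to A
print(A[2].tolist())       # [[2.0, 2.0, 2.0]]
```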
    
Mateen Ulhaq
Bobbie Wu
15

The differences are mentioned quite clearly in the documentation of array and asarray. The differences lie in the argument list and hence the action of the function depending on those parameters.

The function definitions are:

numpy.array(object, dtype=None, copy=True, order=None, subok=False, ndmin=0)

and

numpy.asarray(a, dtype=None, order=None)

The following arguments are those that may be passed to array and not asarray, as mentioned in the documentation:

copy : bool, optional If true (default), then the object is copied. Otherwise, a copy will only be made if __array__ returns a copy, if obj is a nested sequence, or if a copy is needed to satisfy any of the other requirements (dtype, order, etc.).

subok : bool, optional If True, then sub-classes will be passed-through, otherwise the returned array will be forced to be a base-class array (default).

ndmin : int, optional Specifies the minimum number of dimensions that the resulting array should have. Ones will be pre-pended to the shape as needed to meet this requirement.
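
A brief sketch of two of these array-only arguments, `ndmin` and `subok`, using made-up inputs:

```python
import numpy as np

# ndmin prepends length-1 axes until the result has at least that many dimensions.
row = np.array([1, 2, 3], ndmin=2)
print(row.shape)                                # (1, 3)

# subok decides whether a subclass such as np.matrix survives the conversion.
m = np.matrix([[1, 2], [3, 4]])
print(type(np.array(m)).__name__)               # ndarray
print(type(np.array(m, subok=True)).__name__)   # matrix
```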

asheeshr
6

asarray(x) is like array(x, copy=False)

Use asarray(x) when you want to ensure that x is an array before any other operations are done. If x is already an array, no copy is made, so there is no redundant performance cost.

Here is an example of a function that ensures x is converted into an array first.

def mysum(x):
    return np.asarray(x).sum()
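
A usage sketch for the function above, repeated here so the block runs on its own; the inputs are made up for illustration.

```python
import numpy as np

def mysum(x):
    # Accept anything array-like; an ndarray passes through without a copy.
    return np.asarray(x).sum()

print(mysum([1, 2, 3]))          # 6
print(mysum((1.5, 2.5)))         # 4.0
print(mysum(np.arange(4)))       # 6
print(mysum([[1, 2], [3, 4]]))   # 10
```
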
off99555
2

Let's understand the difference between np.array() and np.asarray() with an example:

np.array(): Converts input data (list, tuple, array, or another sequence type) to an ndarray and copies the input data by default.

np.asarray(): Converts input data to an ndarray but does not copy if the input is already an ndarray.

import numpy as np

# Create an array...
arr = np.ones(5)  # array([1., 1., 1., 1., 1.])
# Now try to modify `arr` through `np.array`. Let's see...
np.array(arr)[3] = 200  # arr is still array([1., 1., 1., 1., 1.])

No change in the array because we are modifying a copy of the array, arr.

Now, modify arr through the asarray() function.

np.asarray(arr)[3] = 200  # arr is now array([  1.,   1.,   1., 200.,   1.])

The change occurs in this array because we are working with the original array now.

Michael M.
Haroon Hayat
1

Here's a simple example that can demonstrate the difference.

The main difference is that array makes a copy of the original data by default, so the data can be modified through the new object without affecting the original array.

import numpy as np
a = np.arange(0.0, 10.2, 0.12)
int_cvr = np.asarray(a, dtype = np.int64)

The contents of the array a remain untouched: because the requested dtype differs from a's dtype, asarray returns a new array here, so we can perform any operation on int_cvr without modifying the content of the original array.
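
A small sketch that checks this, reusing the names `a` and `int_cvr` from the example; the `np.shares_memory` check and the assignment are additions for illustration.

```python
import numpy as np

a = np.arange(0.0, 10.2, 0.12)                # float64 values
int_cvr = np.asarray(a, dtype=np.int64)       # dtype differs, so a new array is built

print(np.shares_memory(a, int_cvr))           # False
print(np.asarray(a, dtype=np.float64) is a)   # True: dtype already matches

int_cvr[0] = 99                               # does not touch `a`
print(a[0])                                   # 0.0
```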

vivek
0

Difference

  1. np.array(): Converts input data like list, tuple, etc. to an ndarray and copies the input data by default. This creates a redundant object in memory when the input is already an ndarray.
  2. np.asarray(): Converts input data to an ndarray but does not copy if the input is already an ndarray. This is more memory efficient.
import numpy as np

print("NumPy version:", np.__version__)
NumPy version: 1.22.3

Case 1: Using np.array() when input is ndarray.

# STEP 1: Initialize source.
src1 = np.ones(5)
print("Data type:", type(src1))
print("Values:\n", src1)

# STEP 2: Convert to `ndarray`.
arr1 = np.array(src1)               # np.array() is used.
print("\nData type:", type(arr1))
print("Values:\n", arr1)

# STEP 3: Compare source with converted `ndarray`.
print("\nIs Source & new NumPy array same?\n", src1 is arr1)

Output

Data type: <class 'numpy.ndarray'>
Values:
 [1. 1. 1. 1. 1.]

Data type: <class 'numpy.ndarray'>
Values:
 [1. 1. 1. 1. 1.]

Is Source & new NumPy array same?
 False

Case 2: Using np.asarray() when input is ndarray.

# STEP 1: Initialize source.
src2 = np.ones(5)
print("Data type:", type(src2))
print("Values:\n", src2)

# STEP 2: Convert to `ndarray`.
arr2 = np.asarray(src2)             # np.asarray() is used.
print("\nData type:", type(arr2))
print("Values:\n", arr2)

# STEP 3: Compare source with converted `ndarray`.
print("\nIs Source & new NumPy array same?\n", src2 is arr2)

Output

Data type: <class 'numpy.ndarray'>
Values:
 [1. 1. 1. 1. 1.]

Data type: <class 'numpy.ndarray'>
Values:
 [1. 1. 1. 1. 1.]

Is Source & new NumPy array same?
 True

Hence, by comparing the two outputs, we can conclude that:
When using np.asarray() on an ndarray, the source ndarray and the converted ndarray point to the same object in memory.

Dheemanth Bhat