19

What are the advantages and disadvantages of storing Python objects in a numpy.array with dtype='o' versus using list (or list of list, etc., in higher dimensions)?

Are numpy arrays more efficient in this case? (It seems that they cannot avoid the indirection, but might be more efficient in the multidimensional case.)

Neil G
  • 32,138
  • 39
  • 156
  • 257
  • 1
    I personally have never come across compelling uses for NumPy arrays of *objects*. It would be interesting to see if someone can come up with a convincing example. (+1) – NPE Apr 11 '13 at 10:08
  • Possible duplicate of http://stackoverflow.com/questions/6141853/numpy-array-of-python-objects – tiago Apr 11 '13 at 12:29
  • 2
    @NPE I mostly agree, other than as a convenient way to do (some) mathematical operations with arrays of `Fraction` objects (or the like), without resorting to half a dozen nested `zip`'s and `map`'s. – Jaime Apr 11 '13 at 14:08
  • @NPE [NodePy](http://numerics.kaust.edu.sa/nodepy/) uses NumPy arrays of objects (sympy expressions) to do some exact analysis of numerical methods for ODEs. – jorgeca Apr 11 '13 at 17:54

2 Answers2

12

Slicing works differently with NumPy arrays. The NumPy docs devote a lengthy page on the topic. To highlight some points:

  • NumPy slices can slice through multiple dimensions
  • All arrays generated by NumPy basic slicing are always views of the original array, while slices of lists are shallow copies.
  • You can assign a scalar into a NumPy slice.
  • You can insert and delete items in a list by assigning a sequence of different length to a slice, whereas NumPy would raise an error.

Demo:

>>> a = np.arange(4, dtype=object).reshape((2,2))
>>> a
array([[0, 1],
       [2, 3]], dtype=object)
>>> a[:,0]             #multidimensional slicing
array([0, 2], dtype=object)
>>> b = a[:,0]
>>> b[:] = True        #can assign scalar
>>> a                  #contents of a changed because b is a view to a
array([[True, 1],
       [True, 3]], dtype=object)    

Also, NumPy arrays provide convenient mathematical operations with arrays of objects that support them (e.g. fraction.Fraction).

Neil G
  • 32,138
  • 39
  • 156
  • 257
Janne Karila
  • 24,266
  • 6
  • 53
  • 94
0

Numpy is less memory use than a Python list. Also Numpy is faster and more convenient than a list.

For example: if you want to add two lists in Python you have to do looping over all element in the list. On the other hand in Numpy you just add them.

# adding two lists in python
sum = []
l1 = [1, 2, 3]
l2 = [2, 3, 4]
for i in range(len(l1)):
   print sum.append(l1[i]+l2[i])
# adding in numpy
a1 = np.arange(3)
a2 = np.arange(3)
sum2 = a1+a2
print sum2
Unheilig
  • 16,196
  • 193
  • 68
  • 98
Mohammed Awney
  • 1,292
  • 16
  • 21
  • I wasn't asking about arrays of numbers. I was asking about arrays of objects. And I don't know that Python lists need more memory or are slower. After all, they're both just contiguous arrays underneath. – Neil G Apr 20 '17 at 01:53