7

I noticed that the de facto standard for array manipulation in Python is the excellent numpy library. However, I know that the Python Standard Library has an array module, which seems to me to have a similar use case to numpy.

Is there any actual real-world example where array is desirable over numpy or just plain list?

From my naive interpretation, array is just a memory-efficient container for homogeneous data, but offers no means of improving computational efficiency.


EDIT

Just out of curiosity, I searched GitHub: import array in Python code gets 186'721 hits, while import numpy gets 8'062'678 hits.

However, I could not find a popular repository using array.

norok2
  • 4
    "From my naive interpretation, array is just memory-efficient container for homogeneous data, but offers no means of improving computational efficiency." That's basically all there is to it. If you want an efficient way to store a one-dimensional homogeneous array you can use `array`, but that's about all it's good for (sometimes I have found it useful for this, though). NumPy provides more powerful N-dimensional arrays, vectorized arithmetic operations, linear algebra, etc. – Iguananaut Jul 11 '18 at 17:04
  • OK, but is it used in some actual application? Like, I do not know, data serialization to disk, or something. – norok2 Jul 11 '18 at 18:15
  • I have used it before as a quick way to read/write an array of ints to disk, for example, yes. – Iguananaut Jul 11 '18 at 18:16
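As a rough illustration of the disk use case mentioned in the comment above, here is a minimal sketch using only the standard library. The file name `ints.bin` and the sample values are made up for the example, and the file holds raw machine-format values, so it is only safe to read back on a machine with the same byte order and item size.

```python
import array
import os
import tempfile

# 'q' is the typecode for signed 64-bit integers
values = array.array('q', [1, 2, 3, 2**40])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'ints.bin')  # hypothetical file name

    # Write the raw machine values to disk
    with open(path, 'wb') as f:
        values.tofile(f)

    file_size = os.path.getsize(path)

    # Read them back; fromfile() needs the number of items, not bytes
    restored = array.array('q')
    with open(path, 'rb') as f:
        restored.fromfile(f, file_size // restored.itemsize)

print(restored.tolist())
```

The file contains nothing but the packed values (8 bytes each here), with no header, so the typecode has to be known when reading.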

3 Answers

8

To understand the differences between numpy and array, I ran a few more quantitative tests.

What I have found is that, on my system (Ubuntu 18.04, Python 3), array is about twice as fast as numpy at building a large array from range() (although numpy's dedicated np.arange() is much faster, suspiciously so; perhaps something is being cached during the tests), but more than twice as slow as list.

However, quite surprisingly, array objects seem to be larger than their numpy counterparts, while list objects are roughly 8-13% larger than array objects (this will obviously vary with the size of the individual items). Compared to list, array offers a way to control the size of the number objects.
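To make the point about controlling item size concrete, here is a small sketch (not part of the original tests; the variable names are made up). It also shows why sys.getsizeof() on a list understates its footprint: it counts only the list's own pointer storage, not the int objects it references. The "deep" figure is an upper bound, since CPython shares some small integer objects.

```python
import array
import sys

data = range(1000)

# Same number of items, different storage width per item
as_uint8 = array.array('B', (v % 256 for v in data))  # unsigned 8-bit
as_int64 = array.array('q', data)                     # signed 64-bit
as_list = list(data)

payload_uint8 = as_uint8.itemsize * len(as_uint8)  # bytes of raw data
payload_int64 = as_int64.itemsize * len(as_int64)

# Shallow size: the list object and its pointers only
list_shallow = sys.getsizeof(as_list)
# Rough deep size: also add each referenced int object
list_deep = list_shallow + sum(sys.getsizeof(v) for v in as_list)

print(payload_uint8, payload_int64, list_shallow, list_deep)
```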

So, perhaps, the only sensible use case for array is actually when numpy is not available.

For completeness, here is the code that I used for the tests:

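import numpy as np
import array
import sys

# NOTE: the %timeit lines below are IPython magics; run this in IPython or Jupyter.
num = int(1e6)   # number of elements for the timing tests
num_i = 100      # number of sample sizes for the memory plot
# Logarithmically spaced container sizes from 10 to num
x = np.logspace(1, int(np.log10(num)), num_i).astype(int)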

%timeit list(range(num))
# 10 loops, best of 3: 32.8 ms per loop

# 'l' is a C long: 8 bytes on 64-bit Linux, but 4 bytes on Windows
%timeit array.array('l', range(num))
# 10 loops, best of 3: 86.3 ms per loop

%timeit np.array(range(num), dtype=np.int64)
# 10 loops, best of 3: 180 ms per loop

%timeit np.arange(num, dtype=np.int64)
# 1000 loops, best of 3: 809 µs per loop


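# Shallow sizes in bytes: sys.getsizeof() does not follow references,
# so for list it excludes the int objects the list points to.
y_list = np.array([sys.getsizeof(list(range(x_i))) for x_i in x])
y_array = np.array([sys.getsizeof(array.array('l', range(x_i))) for x_i in x])
y_np = np.array([sys.getsizeof(np.array(range(x_i), dtype=np.int64)) for x_i in x])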

import matplotlib.pyplot as plt

plt.figure(figsize=(12, 6))
plt.plot(x, y_list, label='list')
plt.plot(x, y_array, label='array')
plt.plot(x, y_np, label='numpy')
plt.legend()
plt.show()

[Figure: Plot of Object Sizes]
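The %timeit lines above only work in IPython. For anyone who wants to repeat the list vs. array comparison in a plain Python script, here is a rough equivalent using the standard timeit module. The helper name `best_ms` and the smaller element count are choices made for this sketch, not part of the original tests, and the timings will differ from machine to machine.

```python
import array
import timeit

num = 100_000  # smaller than the 1e6 used above, to keep the run short


def best_ms(func, number=5, repeat=3):
    """Return the best per-call time in milliseconds over several repeats."""
    times = timeit.repeat(func, number=number, repeat=repeat)
    return min(times) / number * 1000


results = {
    'list': best_ms(lambda: list(range(num))),
    # 'q' (signed 64-bit) is used instead of 'l' so the item size is
    # the same on every platform
    'array': best_ms(lambda: array.array('q', range(num))),
}

for name, ms in results.items():
    print(f'{name}: {ms:.3f} ms per call')
```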

norok2
  • Just for information on using getsizeof(): https://stackoverflow.com/questions/449560/how-do-i-determine-the-size-of-an-object-in-python – Igor Fomenko Mar 27 '20 at 15:56
0

Yes, if you don't want another dependency in your code.

minmax
  • 1
    Can you give a real-world example where you could avoid the `numpy` dependency without sacrificing functionality? – norok2 Jul 11 '18 at 17:50
0

Do you mean a real world example in 2018 or 2002? NumPy started as an extension to Python, not as part of the core library. So that's why it is the way it is, not because of any performance tradeoffs.

https://scipy.github.io/old-wiki/pages/History_of_SciPy.html

You seem to be asking a specific version of a more general question. Perhaps if you thought about it more generally, you'd see that your question might not be very useful. Specific modules exist to offer features not available in the core language. There is no surprise if they are more optimized, have more features, or even if they are used more.

It doesn't mean you have to use them, nor is it much of a problem to leave them in.

I guess I don't understand why you're asking this question; knowing that would help us provide you with a better answer.

Jeff Ellen
  • 1
    I understand it may be there for historical reasons. I was just reviewing all the modules in the standard library, checking what they are good for in 2018 and whether they are worth knowing. My understanding is that the Standard Library should not contain libraries that are not useful. – norok2 Jul 11 '18 at 21:56
  • Why do they have to be useful for anything other than backwards compatibility? But if your question is "whether it's worth knowing", then I'd say for array, your answer is probably 'no'. But if your suggestion is "it should be removed", then I'd disagree and say there's not much harm in leaving it (until some difficult change makes it a headache to update) – Jeff Ellen Jul 12 '18 at 03:18