1

This question, It's more for a discussion about the array vs list in python, and If It's worth the while changing a code base from list's fo numbers to arrays's. This is using python standard modules not numpy.

I was looking at the array module, from the python standard library, and It caught my attention, that I could very well do some "simple" numerical analysis and number crunching replacing my lists of numbers with arrays of doubles or floats, (depending on the case).

Does anyone that have experience with python's array object could share a comparison or why they choose to use them? I'm still having trouble with this decision.

My concern here, is If I can use in some way arrays of arrays, or list of arrays, and if that would boost my performance, right now I have lists of lists of numbers, and I'm trying to use python with no dependencies so no numpy.

If I'm correct the python list object internally It's a dynamic array, I'm not sure about that.

I ran this test using size = sys.getsizeof and array = array.array, I know that this may not be great comparison, but still It raises some questions.

>>> for i in range(0, 100, 5):
...     test = [1.0*j for j in range(i)]
...     a = array('f', test)
...     print(f"{i} | {size(a)} | {size(test)}")

len|array|list
---|-----|----
 0 |  32 |  36 
 5 |  52 |  68 
10 |  72 | 100 
15 |  92 | 100 
20 | 112 | 136
25 | 132 | 136
30 | 152 | 176
35 | 172 | 176
40 | 192 | 220
45 | 212 | 220
50 | 232 | 268
55 | 252 | 268
60 | 272 | 324
65 | 292 | 324
70 | 312 | 324
75 | 332 | 388
80 | 352 | 388
85 | 372 | 388
90 | 392 | 460
95 | 412 | 460
ekiim
  • 792
  • 1
  • 11
  • 25
  • 1
    Arrays are flat sequences, lists are containers. Lists store references. Lists can be nested and hold any type, even mixed types. Arrays can't. For pure numbers, arrays are fine. – timgeb Oct 26 '18 at 18:01
  • but what about 2d arrays, that should be a list of lists ? or what? or an array, but with indexing such as `i, j => i + j*(i_max)` , so I can emulate the matrix, with an array? – ekiim Oct 26 '18 at 18:13
  • for 2d arrays you'd use numpy :) – timgeb Oct 26 '18 at 19:03
  • 3
    Possible duplicate of [Python List vs. Array - when to use?](https://stackoverflow.com/questions/176011/python-list-vs-array-when-to-use) – Darkonaut Oct 26 '18 at 20:43

3 Answers3

3

Since your main concern is performance and you are dealing with numbers then Python's array module will be your answer. From the official Python 3 docs:

This module defines an object type which can compactly represent an array of basic values: characters, integers, floating point numbers. Arrays are sequence types and behave very much like lists, except that the type of objects stored in them is constrained. The type is specified at object creation time by using a type code, which is a single character. The following type codes are defined: Type Code Table.

This type constraint is done to allow an efficient array implementation on the interpreter side, CPython for example. The type codes are a bridge between Python being dynamically typed and C being statically typed (in case of CPython).

Otherwise using a list, you will usually take some performance loss since a list can handle all types. I should caveat that the performance loss is negligible for smaller data-sets/operation rates.

Kamaji
  • 56
  • 1
1

Since your main concern is performance and you are dealing with numbers then Python's array module will be your answer. From the official Python 3 docs:

This module defines an object type which can compactly represent an array of basic values: characters, integers, floating point numbers. Arrays are sequence types and behave very much like lists, except that the type of objects stored in them is constrained. The type is specified at object creation time by using a type code, which is a single character. The following type codes are defined: Type Code Table.

This type constraint is done to allow an efficient array implementation on the interpreter side, CPython for example. The type codes are a bridge between being dynamically typed (Python) and statically typed (C in case of CPython).

Otherwise using a list, you will usually take some performance loss since a list can handle all types. I should caveat that the performance loss is negligible for smaller data-sets/operation rates.

Kamaji
  • 56
  • 1
-1

Performance issues are due to storage and access of the storage of variables. Lists are stored in the form of nodes with two blocks - storage value and location indicator for next node. The storage of array is continuous. Such storage has varying impact on accessing and modifying data stored in them. This video details about the implications - https://www.youtube.com/watch?v=lC-yYCOnN8Q

  • 2
    A link to a solution is welcome, but please ensure your answer is useful without it: [add context around the link](//meta.stackexchange.com/a/8259) so your fellow users will have some idea what it is and why it’s there, then quote the most relevant part of the page you're linking to in case the target page is unavailable. [Answers that are little more than a link may be deleted.](//stackoverflow.com/help/deleted-answers) – Zoe Jul 05 '19 at 09:45