
Questions with similar titles are about Python lists or NumPy. This one is about the array.array class, which is part of the standard Python library; see https://docs.python.org/2/library/array.html

The fastest approach I came up with (for integer types) is to use array.fromfile with /dev/zero. This is

  • about 27 times faster than array.array('L', [0] * size), which temporarily needs more than twice the memory of the final array,
  • about 4.7 times faster than array.array('L', [0]) * size,
  • and over 200 times faster than using a custom iterable object (to avoid creating a large temporary list).

However, /dev/zero may be unavailable on some platforms. Is there a better way to do this without NumPy, non-standard modules, or my own C extension?

Demonstrator code:

import array
import sys
import time

size = 100 * 1000**2
test = sys.argv[1]

class ZeroIterable:
    def __init__(self, size):
        self.size = size
        self.next_index = 0
    def next(self):
        if self.next_index == self.size:
            raise StopIteration
        self.next_index = self.next_index + 1
        return 0
    def __iter__(self):
        return self

t = time.time()
if test == 'Z':
    myarray = array.array('L')
    f = open('/dev/zero', 'rb')
    myarray.fromfile(f, size)
    f.close()
elif test == 'L':
    myarray = array.array('L', [0] * size)
elif test == 'S':
    myarray = array.array('L', [0]) * size
elif test == 'I':
    myarray = array.array('L', ZeroIterable(size))
print time.time() - t
Joachim Wagner
  • I found http://stackoverflow.com/questions/2214651/efficient-python-array-with-100-million-zeros but it is more about fast access (updating elements, in particular incrementing counters) than about initialisation. – Joachim Wagner May 21 '16 at 12:14
  • I found http://stackoverflow.com/questions/3214288/what-is-the-fastest-way-to-initialize-an-integer-array-in-python but the question specifically asks for a non-zero value. – Joachim Wagner May 21 '16 at 12:17
  • I can't come up with a faster method either. For non-zero values, `array('L', islice(repeat(value), size))` is pretty good too. – Martijn Pieters Sep 15 '18 at 21:13

1 Answer

Updated to Python 3 and added the 'B' method:

import array
import sys
import time

size = 100 * 1000**2
test = sys.argv[1]

class ZeroIterable:
    def __init__(self, size):
        self.size = size
        self.next_index = 0
    def __next__(self):
        if self.next_index == self.size:
            raise StopIteration
        self.next_index = self.next_index + 1
        return 0
    def __iter__(self):
        return self

t = time.time()
if test == 'Z':
    myarray = array.array('L')
    f = open('/dev/zero', 'rb')
    myarray.fromfile(f, size)
    f.close()
elif test == 'L':
    myarray = array.array('L', [0] * size)
elif test == 'S':
    myarray = array.array('L', [0]) * size
elif test == 'I':
    myarray = array.array('L', ZeroIterable(size))
elif test == 'B':
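    # 'L' is not 8 bytes on every platform, so query itemsize instead of hard-coding 8
    myarray = array.array('L', bytes(size * array.array('L').itemsize))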
print(len(myarray))
print(time.time() - t)

The 'S' method (array.array('L', [0]) * size) wins:

$ python3 --version
Python 3.7.3
$ python3 z.py Z
100000000
1.1691830158233643
$ python3 z.py L
100000000
2.712920665740967
$ python3 z.py S
100000000
0.6910817623138428
$ python3 z.py B
100000000
0.9187061786651611
$ python3 z.py I
100000000
62.862160444259644
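
For reuse outside the benchmark script, the two fastest portable methods can be wrapped in small helpers. This is only a sketch for Python 3: the function names are made up for illustration, and the 'B' variant queries itemsize rather than assuming that 'L' is 8 bytes wide. The last helper follows the islice/repeat suggestion from the comments for non-zero fill values.

```python
import array
from itertools import islice, repeat


def zeros_by_repeat(typecode, size):
    """The 'S' method: repeat a one-element array `size` times."""
    return array.array(typecode, [0]) * size


def zeros_from_bytes(typecode, size):
    """The 'B' method: build the array from a zero-filled bytes object.

    The byte count uses the platform's itemsize for this typecode.
    """
    itemsize = array.array(typecode).itemsize
    return array.array(typecode, bytes(size * itemsize))


def filled(typecode, value, size):
    """Non-zero fill via islice(repeat(value), size); no temporary list."""
    return array.array(typecode, islice(repeat(value), size))
```

Neither zero-filling helper needs /dev/zero, so both should also work on platforms where it is missing. Which one is faster may depend on the Python version and machine, so it is worth timing them on the target system.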
0xF