57

It should not be so hard. I mean in C,

int a[10]; 

is all you need. How do I create an array of all zeros for a random size? I know about the zeros() function in NumPy, but there must be an easy built-in way that doesn't need another module.

PythontoBeLoved
  • 2
    Python doesn't have a built-in array data structure. The closest you get to that are lists. – int3 Dec 07 '09 at 13:17
  • 4
    Surprisingly, nobody has actually asked what you need this for. Usually lists are just fine, regardless of the fact that you could store other stuff in them (they're just lists of references to other objects, which can be anything). But maybe there's some reason that wouldn't work well for you... – Peter Hansen Dec 07 '09 at 13:53
  • 3
    I highly recommend the Python tutorial: http://docs.python.org/tutorial/ It'll only take a couple hours of your time. – Jason Orendorff Dec 07 '09 at 14:27

8 Answers

49

If you are not satisfied with lists (because they can contain anything and take up too much memory), you can use an efficient array of integers:

import array
array.array('i')

See here

If you need to initialize it,

a = array.array('i',(0 for i in range(0,10)))
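As a rough Python 3 sketch of what the typed array buys you (the variable names here are illustrative, not part of the answer above): every element has to match the type code, and each one occupies a fixed number of bytes.

```python
import array

# Ten zero-initialized signed ints, built from a generator as above.
a = array.array('i', (0 for _ in range(10)))

# Each element occupies a fixed number of bytes (platform dependent).
bytes_per_item = a.itemsize

# Unlike a list, the array refuses values of the wrong type.
try:
    a.append("not an int")
    rejected = False
except TypeError:
    rejected = True

print(len(a), bytes_per_item, rejected)
```

The failed append leaves the array untouched, so it still holds exactly ten zeros.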
Mohamed Moanis
yu_sha
33

Two ways:

x = [0] * 10
x = [0 for i in xrange(10)]

Edit: replaced range by xrange to avoid creating another list.

Also: as many others have noted, including Pi and Ben James, this creates a list, not a Python array. While a list is in many cases sufficient and easy enough, for performance-critical uses (e.g. when duplicated in thousands of objects) you could look into Python arrays. Look up the array module, as explained in the other answers in this thread.

catchmeifyoutry
  • 2
    This is a list. It can contain objects of any type, not just integers. And it uses much more RAM than needed for integers. – yu_sha Dec 07 '09 at 13:16
  • 1
    Not only that, range returns a list too. So the second line will at least use twice the memory of the resulting list. – pi. Dec 07 '09 at 13:21
  • 2
    Be careful when multiplying lists -- it will give you trouble with mutable objects. Multiplication doesn't clone items, but merely gives you the very same object appearing multiple times in a list. Try this: `a = [[1]]*3; a[1].append(2)`. Therefore, appending to `a[1]` will really change all the items of a and give you `[[1,2],[1,2],[1,2]]`. – badp Dec 07 '09 at 13:29
  • 3
    and replace `xrange` back to `range` in py3k – SilentGhost Dec 07 '09 at 13:44
  • Sure, but I find these lines more readable than using the array module, for example. For 10 integers I wouldn't be too concerned about memory either (in most cases). Besides, the question was "an easy way built-in, not another module". – catchmeifyoutry Dec 07 '09 at 13:46
  • 1
    +1. One can contrive of corner cases where these don't apply, but your answers are the natural, pythonic solutions to the question. – Justin R. Dec 07 '09 at 22:04
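A minimal sketch of the pitfall badp describes in the comments, next to the safe case for zeros. The variable names are illustrative only; the code runs the same under Python 2 and 3.

```python
# Multiplying a list of immutable values is safe: rebinding one slot
# does not affect the others.
zeros = [0] * 3
zeros[1] = 5

# Multiplying a list that holds a mutable object repeats the same object.
shared = [[1]] * 3
shared[1].append(2)

# A comprehension builds a separate inner list on each iteration.
independent = [[1] for _ in range(3)]
independent[1].append(2)

print(zeros)        # [0, 5, 0]
print(shared)       # [[1, 2], [1, 2], [1, 2]]
print(independent)  # [[1], [1, 2], [1]]
```

So `[0] * n` is fine for numbers, while nested lists are better built with a comprehension.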
11
>>> a = [0] * 10
>>> a
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Pär Wieslander
6

Use the array module. With it you can store collections of the same type efficiently.

>>> import array
>>> import itertools
>>> a = array_of_signed_ints = array.array("i", itertools.repeat(0, 10))

For more information - e.g. different types, look at the documentation of the array module. For up to 1 million entries this should feel pretty snappy. For 10 million entries my local machine thinks for 1.5 seconds.

The second argument to array.array here is an iterator (itertools.repeat), which produces the sequence lazily as it is read. This way, the array module can consume the zeros one by one while the iterator itself uses only constant memory, no matter how long the sequence is. The array will grow of course, but that should be obvious.

You use it just like a list:

>>> a.append(1)
>>> a.extend([1, 2, 3])
>>> a[-4:]
array('i', [1, 1, 2, 3])
>>> len(a)
14

...or simply convert it to a list:

>>> l = list(a)
>>> len(l)
14

Surprisingly

>>> a = [0] * 10000000

is faster at construction than the array method. Go figure! :)
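To check this on your own machine, a small timing sketch along these lines might help. It assumes Python 3; the helper names (`build_array`, `build_list`, `compare`) and the element count are made up for illustration, and the numbers will vary by machine and interpreter.

```python
import array
import itertools
import timeit

COUNT = 100_000  # illustrative size, smaller than the 10 million above


def build_array(count=COUNT):
    # Typed array filled from a lazy iterator of zeros.
    return array.array("i", itertools.repeat(0, count))


def build_list(count=COUNT):
    # Plain list built by multiplication.
    return [0] * count


def compare(repeats=5):
    # Best-of-N wall-clock time in seconds for a single construction.
    return {
        "array from iterator": min(timeit.repeat(build_array, number=1, repeat=repeats)),
        "list multiplication": min(timeit.repeat(build_list, number=1, repeat=repeats)),
    }


if __name__ == "__main__":
    for label, seconds in compare().items():
        print(f"{label}: {seconds:.6f} s")
```

Taking the minimum over several repeats reduces noise from other processes.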

pi.
3
import numpy as np

new_array=np.linspace(0,10,11).astype('int')

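An alternative that casts the type when the array is created. Note that linspace(0,10,11) yields the integers 0 through 10 rather than zeros; np.zeros(10, dtype=int) would give ten integer zeros.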

2
a = 10 * [0]

gives you an array of length 10, filled with zeroes.

digitalarbeiter
  • 5
    This is not called an array in Python, it is a list – Ben James Dec 07 '09 at 13:14
  • I believe the questioner took the "array" terminology from C, and looks for a close built-in alternative in Python. In a lot of cases, one would just use a list in Python. – catchmeifyoutry Dec 07 '09 at 14:10
  • catchmeifyoutry: I would still let the questioner know the correct terminology for what they are using, no matter what the C alternative is – Ben James Dec 07 '09 at 14:25
  • The questioner knows the difference. That's why he insists on using 'array'. Thanks for anonymously talking about me. :-) – PythontoBeLoved Dec 07 '09 at 15:56
1
import random

def random_zeroes(max_size):
  "Create a list of zeros for a random size (up to max_size)."
  a = []
  for i in xrange(random.randrange(max_size)):
    a += [0]
  # Return the list; without this the function would return None.
  return a

Use range instead if you are using Python 3.x.

badp
1

If you need to initialize an array fast, you might do it by blocks instead of with a generator initializer, and it's going to be much faster. Creating a list by [0]*count is just as fast, still.

import array

def zerofill(arr, count):
    count *= arr.itemsize
    blocksize = 1024
    blocks, rest = divmod(count, blocksize)
    for _ in xrange(blocks):
        arr.fromstring("\x00"*blocksize)
    arr.fromstring("\x00"*rest)

def test_zerofill(count):
    iarr = array.array('i')
    zerofill(iarr, count)
    assert len(iarr) == count

def test_generator(count):
    iarr = array.array('i', (0 for _ in xrange(count)))
    assert len(iarr) == count

def test_list(count):
    L = [0]*count
    assert len(L) == count

if __name__ == '__main__':
    import timeit
    c = 100000
    n = 10
    print timeit.Timer("test(c)", "from __main__ import c, test_zerofill as test").repeat(number=n)
    print timeit.Timer("test(c)", "from __main__ import c, test_generator as test").repeat(number=n)
    print timeit.Timer("test(c)", "from __main__ import c, test_list as test").repeat(number=n)

Results:

(array in blocks) [0.022809982299804688, 0.014942169189453125, 0.014089107513427734]
(array with generator) [1.1884641647338867, 1.1728270053863525, 1.1622772216796875]
(list) [0.023866891860961914, 0.035660028457641602, 0.023386955261230469]
u0b34a0f6ae
  • interesting, python really optimizes initialization. I get on ubuntu 9.04, python 2.6.2 (I truncated the output a little): (array in blocks) [0.0191, 0.0180, 0.0170] (array with generator) [0.9199, 0.9179, 0.6761] (list) [0.0069, 0.0074, 0.0064] so on my machine lists are significantly faster. kaizer.se, what OS/python are you running? – catchmeifyoutry Dec 08 '09 at 01:18
  • This was from my venerable iBook, a laptop 5 years of age w PowerPC G4 running Debian/Linux, of course. No doubt there are faster machines around :-) – u0b34a0f6ae Dec 08 '09 at 11:20
  • My point is not to compare our machines, but to compare the time differences on the machines. Your results show more or less (array in blocks) in the same order of speed as (list), while on my machine (or python implementation) the (list) method is a factor faster. – catchmeifyoutry Dec 09 '09 at 09:43
  • Oh, how dumb of me, I forgot to mention that it was using Python 2.5.4 on Debian. Testing shows no difference between Python 2.6 and Python 2.5 here. – u0b34a0f6ae Dec 09 '09 at 13:54
  • The fastest array initialization is `array.array('i', [0]) * count` because it can allocate the exact memory size directly without reallocation and without a big temporary list. see http://stackoverflow.com/a/3216322/448474 – hynekcer Dec 06 '12 at 00:50
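For readers on Python 3, where `xrange` and `array.fromstring` no longer exist (`fromstring` was removed in 3.9 in favour of `frombytes`), the two fast approaches discussed here could be sketched roughly as follows. The function names are hypothetical, and this has not been benchmarked against the figures above.

```python
import array


def zeros_by_repeat(typecode, count):
    # The approach from the last comment: repeat a one-element array.
    return array.array(typecode, [0]) * count


def zeros_from_bytes(typecode, count):
    # Python 3 analogue of filling with "\x00" blocks: a bytes initializer
    # is read as raw machine values, and bytes(n) is n zero bytes.
    itemsize = array.array(typecode).itemsize
    return array.array(typecode, bytes(count * itemsize))


a = zeros_by_repeat('i', 1000)
b = zeros_from_bytes('i', 1000)
print(len(a), len(b), a == b)
```

Both build the array in one step, without an intermediate list of `count` Python integers.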