262

I am trying to build a histogram of counts, so I create buckets. I know I could just loop and append a bunch of zeros, i.e., something along these lines:

buckets = []
for i in xrange(0,100):
    buckets.append(0)

Is there a more elegant way to do it? I feel like there should be a way to just declare an array of a certain size.

I know numpy has numpy.zeros, but I want the more general solution.

NoDataDumpNoContribution
user491880
  • 4
    Python's lists are lists, not arrays. And in Python you don't declare stuff like you do in C: you define functions and classes (via def and class statements), and assign to variables which, if they don't exist already, are created magically on first assignment. Also, variables (and lists) are not memory regions that contain objects, but names referring to them. One object can be contained in only one memory region but can be referenced by several names. – pillmuncher Oct 30 '10 at 01:10
  • 1
    Python doesn't have "declarations", especially of containers with a size but unspecified contents. You want something, you write an expression. – John Machin Oct 30 '10 at 01:14
  • 3
    ...and the semicolons are completely unnecessary – bstpierre Oct 30 '10 at 02:03
  • 1
    **Not a duplicate**. The perceived need for an air-quotes empty list starts a different conversation about list allocation and assignment. Also there should be two landing pages for the different search terms, which the stats indicate are common. – Bob Stein Feb 24 '22 at 13:32

10 Answers

512
buckets = [0] * 100

Careful: this technique doesn't generalize to multidimensional arrays or lists of lists. Using it there leads to the "List of lists changes reflected across sublists unexpectedly" problem.
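A minimal sketch of that pitfall, using a hypothetical 3x2 grid chosen only for illustration: multiplying the outer list repeats a reference to one inner list instead of creating new rows.

```python
# Outer multiplication repeats a reference to the SAME inner list.
grid = [[0] * 2] * 3

grid[0][0] = 1  # intended to change only the first row

# Every row shows the change because all three rows are one object.
print(grid)
print(grid[0] is grid[1])

# The one-dimensional case is safe: integers are immutable, so
# assigning to one slot rebinds that slot only.
buckets = [0] * 4
buckets[0] = 1
print(buckets)
```

For a flat list of counters, as in the question, `[0] * 100` therefore behaves as expected.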

Jean-François Fabre
dan04
147

Just for completeness: to create a multidimensional list of zeros in Python, you have to use a list comprehension like this:

buckets = [[0 for col in range(5)] for row in range(10)]

to avoid reference sharing between the rows.

This looks more clumsy than chester1000's code, but is essential if the values are supposed to be changed later. See the Python FAQ for more details.
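A short sketch of why the comprehension avoids reference sharing: the inner expression is evaluated once per row, so each row is a distinct list. The dimensions here are hypothetical, and `[0] * cols` is used for the inner list as a shorter equivalent of the inner comprehension.

```python
rows, cols = 3, 2  # hypothetical dimensions for illustration

# The inner [0] * cols is evaluated once per row, giving distinct lists.
buckets = [[0] * cols for _ in range(rows)]

buckets[0][0] = 1  # changes the first row only
print(buckets)
print(buckets[0] is buckets[1])
```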

OK.
29

You can multiply a list by an integer n to repeat the list n times:

buckets = [0] * 100
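For the histogram use case in the question, the preallocated list might then be filled as in this sketch. The sample data is hypothetical, and it assumes every value is an integer in the range 0..99 so that it can serve directly as a bucket index.

```python
# Hypothetical sample data: integers expected to fall in 0..99.
samples = [3, 7, 3, 99, 0, 7, 3]

buckets = [0] * 100  # one counter per possible value

for value in samples:
    buckets[value] += 1  # the value itself is the bucket index

print(buckets[3], buckets[7], buckets[99], sum(buckets))
```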
agf
mjhm
24

Use this:

bucket = [None] * 100
for i in range(100):
    bucket[i] = [None] * 100

OR

w, h = 100, 100
bucket = [[None] * w for i in range(h)]

Both produce a proper 100x100 two-dimensional list with independent rows, where every cell is initialized to None.

VLL
meeDamian
19

Use numpy:

import numpy
zarray = numpy.zeros(100)

Then use the library's histogram function, numpy.histogram.

fabrizioM
7

The question says "How to declare array of zeros ..." but then the sample code references the Python list:

buckets = []   # this is a list

However, if someone is actually wanting to initialize an array, I suggest:

from array import array

count = 100  # number of elements (example value)
my_arr = array('I', [0] * count)  # 'I' is the unsigned int type code

The Python purist might claim this is not pythonic and suggest:

my_arr = array('I', (0 for i in range(count)))

The pythonic version is very slow and when you have a few hundred arrays to be initialized with thousands of values, the difference is quite noticeable.
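One further option, not taken from the answer above: array objects support repetition with `*` directly, which avoids building an intermediate list. This is a sketch with an example value for `count`, and it makes no timing claims.

```python
from array import array

count = 1000  # example size, chosen for illustration

# Repeat a one-element array instead of building a full list first.
my_arr = array('I', [0]) * count

# Same contents as the list-based construction.
same = my_arr == array('I', [0] * count)
print(len(my_arr), same)
```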

IAbstract
  • 1
    Hi, and why isn't the `array('I', [0] * count)` slow? I'd assume it will first create a full list and based on that full list then create the array, which sounds... terrible. I would guess that laziness would actually be quite beneficial in your last code snippet and make it run faster? – devoured elysium Jan 01 '20 at 16:33
  • I can't attest to the *why* ... I can modify my code based on performance results. I didn't include results already confirmed by consensus. – IAbstract Jan 06 '20 at 12:58
3

The simplest solution would be

"\x00" * size # for a buffer of binary zeros
[0] * size # for a list of integer zeros

In general, you should prefer more Pythonic code such as a list comprehension (in your example: [0 for unused in xrange(100)]), or string.join for buffers.
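Under Python 3, `"\x00" * size` yields a text string, not a byte buffer. A sketch of the byte-oriented equivalents, assuming Python 3 and an example value for `size`:

```python
size = 8  # example buffer length

immutable_buf = bytes(size)    # zero-filled, read-only
mutable_buf = bytearray(size)  # zero-filled, can be modified in place

mutable_buf[0] = 0xFF
print(immutable_buf == b"\x00" * size)
print(mutable_buf[0], len(mutable_buf))
```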

AndiDog
  • 3
    I agree that the list comprehension looks more Pythonic. However, I timed it, and found that it's about 10x slower than the multiplication syntax. I know, something something preoptimization evil. – Lenna Jan 22 '13 at 20:39
  • I am creating an `array('I')` and was using `(0 for i in range(count))` to fill it, and it is very slow with 28000 items in the array. The multiplication syntax is much faster. If 'pythonic' equates to slow, then it's out with the 'pythonic' and in with *fast*. – IAbstract Feb 18 '16 at 12:33
0

Depending on what you're actually going to do with the data after it's collected, collections.defaultdict(int) might be useful.
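A minimal sketch of that approach, using hypothetical sample data: with `defaultdict(int)`, a missing key starts at 0, so no buckets need to be created in advance.

```python
from collections import defaultdict

samples = ["a", "b", "a", "c", "a"]  # hypothetical data

counts = defaultdict(int)  # missing keys start at int() == 0
for item in samples:
    counts[item] += 1

print(dict(counts))
```

This suits sparse or non-integer keys, where a fixed-size list of zeros would not be a natural fit.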

Russell Borogove
-1

Well I would like to help you by posting a sample program and its output

Program :-

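t = int(input(""))

x = [None] * t
# Build each row separately; [[None] * t] * t would repeat one shared row.
y = [[None] * t for _ in range(t)]

for i in range(1, t + 1):
    x[i - 1] = i
    for j in range(1, t + 1):
        y[i - 1][j - 1] = j

print(x)
print(y)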

Output :-

2

[1, 2]

[[1, 2], [1, 2]]

I hope this clears up the basics of creating these lists. To initialize them with a specific value such as 0, you can write:

x=[0]*10

Hope it helps..!! ;)

Barranka
Archit
-4

If you need more columns:

buckets = [[0., 0., 0., 0., 0.] for x in range(100)]  # range(0) would give an empty list; 100 rows assumed from the question
renatov