6

Say I wanted to create an array (NOT a list) of 1,000,000 twos in Python, like this:

array = [2, 2, 2, ...... , 2]

What would be a fast but simple way of doing it?

Eddy
  • I know practically no Python, but could it be something like `array = [2 for x in 1..1000000]`? – Michael Myers Jul 09 '10 at 15:52
  • This previous question might help - http://stackoverflow.com/questions/1859864/how-to-create-an-integer-array-in-python – chauncey Jul 09 '10 at 15:52
  • @mmyers: Your suggestion is not valid syntax; you possibly mean `[2 for x in xrange(1000000)]`; `[2] * 1000000` would be faster and simpler; however these produce a `list` -- `array` and `list` mean different things in Python. – John Machin Jul 09 '10 at 20:47
  • @John: mmyers had said he practically doesn't know Python, so stop nitpicking :) Of course appreciate the suggestions. – Vijay Dev Jul 09 '10 at 20:55
  • @Vijay Dev: Please stop conflating "educating" and "nitpicking". If @mmyers were to ask a question, I'd be glad to supply references to manuals and tutorials. Who appreciates what suggestions?? – John Machin Jul 09 '10 at 21:13
  • @John: Thanks. I figured I could get better by posting what I would think and having people correct me. (Now I wonder where I got the `1..1000000` from. Probably Ruby.) – Michael Myers Jul 09 '10 at 21:14
  • Possible duplicate of [NumPy array initialization (fill with identical values)](https://stackoverflow.com/questions/5891410/numpy-array-initialization-fill-with-identical-values) – Nico Schlömer Jul 05 '17 at 16:44
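
The list-versus-array distinction raised in the comments above can be checked directly. This is a small sketch, assuming Python 3 and the standard-library array module; the variable names are illustrative only:

```python
import array

as_list = [2] * 1000000                      # a list: holds references to any objects
as_array = array.array('i', [2]) * 1000000   # an array: holds typed C ints only

as_list[0] = "two"        # allowed, lists are heterogeneous

rejected = False
try:
    as_array[0] = "two"   # not allowed, typecode 'i' accepts integers only
except TypeError:
    rejected = True

print(type(as_list).__name__, type(as_array).__name__, rejected)
```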

6 Answers

18

The currently-accepted answer is NOT the fastest way using array.array; at least it's not the slowest -- compare these:

[source: johncatfish (quoting chauncey), Bartek]
python -m timeit -s"import array" "arr = array.array('i', (2 for i in range(0,1000000)))"
10 loops, best of 3: 543 msec per loop

[source: g.d.d.c]
python -m timeit -s"import array" "arr = array.array('i', [2] * 1000000)"
10 loops, best of 3: 141 msec per loop

python -m timeit -s"import array" "arr = array.array('i', [2]) * 1000000"
100 loops, best of 3: 15.7 msec per loop

That's a ratio of about 9 to 1 between the second and third forms (141 msec vs. 15.7 msec).
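
The same three-way comparison can also be run from inside a script. This is a minimal sketch, assuming Python 3 (where range is already lazy); the helper names and the choice of number=5 are illustrative, and absolute times will differ from the figures above:

```python
import array
import timeit

N = 1000000

def from_generator():
    # Feeds the array one element at a time from a generator.
    return array.array('i', (2 for _ in range(N)))

def from_list():
    # Builds a full list first, then copies it into the array.
    return array.array('i', [2] * N)

def from_repeat():
    # Repeats a one-element array.
    return array.array('i', [2]) * N

if __name__ == "__main__":
    for build in (from_generator, from_list, from_repeat):
        # Best of 3 runs, 5 calls per run, reported per call.
        best = min(timeit.repeat(build, number=5, repeat=3)) / 5
        print(f"{build.__name__}: {best * 1000:.1f} msec per loop")
```

All three helpers return equal arrays; only the construction cost differs.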

John Machin
9

Is this what you're after?

import array

# Slower: builds a million-element list first.
twosArr = array.array('i', [2] * 1000000)

# Faster: repeats a one-element array.
twosArr = array.array('i', [2]) * 1000000

You can get just a list with this:

twosList = [2] * 1000000

-- EDITED --

I updated this to reflect information in another answer. It appears that you can increase the speed by a ratio of about 9 : 1 by adjusting the syntax slightly. Full credit belongs to @john-machin. I wasn't aware you could multiply an array object the same way you can a list.

g.d.d.c
5

A hybrid approach works fastest for me

$ python -m timeit -s"import array" "arr = array.array('i', [2]*100) * 10000"
100 loops, best of 3: 5.38 msec per loop

$ python -m timeit -s"import array" "arr = array.array('i', [2]) * 1000000"
10 loops, best of 3: 20.3 msec per loop
$ python -m timeit -s"import array" "arr = array.array('i', [2]*10) * 100000"
100 loops, best of 3: 6.69 msec per loop
$ python -m timeit -s"import array" "arr = array.array('i', [2]*100) * 10000"
100 loops, best of 3: 5.38 msec per loop
$ python -m timeit -s"import array" "arr = array.array('i', [2]*1000) * 1000"
100 loops, best of 3: 5.47 msec per loop
$ python -m timeit -s"import array" "arr = array.array('i', [2]*10000) * 100"
100 loops, best of 3: 6.13 msec per loop
$ python -m timeit -s"import array" "arr = array.array('i', [2]*100000) * 10"
10 loops, best of 3: 14.9 msec per loop
$ python -m timeit -s"import array" "arr = array.array('i', [2]*1000000)"
10 loops, best of 3: 77.7 msec per loop
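
The hybrid idea can be wrapped in a small helper: build a modest chunk from a list, then repeat that chunk with array multiplication. This is a minimal sketch; the name filled_array and the default chunk of 100 are assumptions for illustration (100 was the sweet spot in the timings above and may differ on other machines):

```python
import array

def filled_array(typecode, value, size, chunk=100):
    """Return an array.array of `size` copies of `value` (hypothetical helper)."""
    if size < 0 or chunk < 1:
        raise ValueError("size must be >= 0 and chunk must be >= 1")
    repeats, remainder = divmod(size, chunk)
    block = array.array(typecode, [value] * chunk)  # small list-built chunk
    result = block * repeats                        # fast bulk repetition
    result.extend(block[:remainder])                # top up when size % chunk != 0
    return result

twos = filled_array('i', 2, 1000000)
```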
John La Rooy
3

Using the timeit module you can figure out what the fastest way of doing this is:

First off, putting that many numbers in a list will most likely kill your machine, as the whole list is stored in memory.

However, you can test the execution using something like so. It ran on my computer for a long time before I just gave up, but I'm on an older PC:

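import timeit
timeit.Timer('[2] * 1000000').timeit()  # default is number=1000000 repetitions, so this is slow; pass number=10 for a quick check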

The other option you can look into is the array module, which, as its documentation states, provides efficient arrays of numeric values:

import array
array.array('i', (2 for i in range(0, 1000000)))

I did not time either one, but I'm sure the array module, which is designed for numeric data, will be faster.

Edit: Even more fun, you could take a look at numpy which actually seems to have the fastest execution:

from numpy import *
array( [2 for i in range(0, 1000000)])

Even faster from the comments:

a = 2 * ones(1000000)  # note: ones() defaults to float64, so this holds 2.0, not int 2

Awesome!

xilpex
Bartek
  • Numpy also has dedicated factory functions: `a = 2 * ones(1000000)` – Philipp Jul 09 '10 at 16:27
  • @Philipp: That's awesome! This is why I love SO. Curiosity to answer a question leads to many learnings for myself. Cheers :-) – Bartek Jul 09 '10 at 16:29
  • If you can't fit a million-element list or array into your machine's memory, it's dead already. Also, I don't understand "It ran on my computer for a long time" ... see my answer for (a) how to do simple timing using `timeit` at the command prompt (b) how small the measured times (milliseconds!) are (4-year-old laptop running Win XP SP2) – John Machin Jul 09 '10 at 20:35
1
aList = [2 for x in range(1000000)]

or, based on chauncey's link:

import array
anArray = array.array('i', (2 for i in range(0, 1000000)))
user347594
1

If a zero initial value is acceptable and /dev/zero is available on your platform, the following is about 4.7 times faster than the array('L',[0])*size solution:

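import array

size = 1000000  # number of elements (assumed here; use your own)
myarray = array.array('L')
with open('/dev/zero', 'rb') as f:
    myarray.fromfile(f, size)  # reads `size` items' worth of zero bytes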

In the question "How to initialise an integer array.array object with zeros in Python", I'm looking for a better way.
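
Where /dev/zero is not available, a similar zero-filled array can be built without touching the filesystem. This is a sketch of a different technique from the one above, assuming Python 3, where bytes(n) yields n zero bytes and array.array accepts a bytes initializer; it is not timed here, and size is again an assumed value:

```python
import array

size = 1000000                               # assumed element count
itemsize = array.array('L').itemsize         # bytes per 'L' item on this platform
zeros = array.array('L', bytes(size * itemsize))  # zero bytes decode to zero items

print(len(zeros), zeros[0], zeros[-1])
```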

Community
Joachim Wagner