I have this rather simple script that generates 1000000000 (nine zeroes) random numbers and then writes each generated number to a file, along with how many times it was generated.

import random
import csv

dic = {}

for i in range(0, 1000000000):
    n = random.randint(0, 99999)
    if n in dic:
        dic[n] += 1
    else:
        dic[n] = 1
# csv.writer objects have no close(); close the underlying file instead
with open('output', 'w') as f:
    writer = csv.writer(f)
    for key, value in dic.iteritems():
        writer.writerow([key, value])

The script exits with a Killed message. According to the question "What does 'killed' mean?", using dic.iteritems() should be enough to prevent this, but that is not the case here.

So how should I go about accomplishing this task?

1 Answer

It doesn't look like your problem is the dict. Your problem is here:

for i in range(0, 1000000000):
         ^^^^^^^^^^^^^^^^^^^^

On Python 2, range builds a list 1000000000 items long before the loop even starts, which is more than your system can handle. You want xrange, not range: xrange generates numbers on demand. (On Python 3, range does what xrange used to and xrange is gone.)
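For illustration, here is a minimal sketch of the same counting loop written for Python 3, where range is already lazy; on Python 2 you would swap in xrange. The function names count_draws and write_counts, the seed parameter, and the use of collections.Counter are my own choices for the sketch, not something from the question.

```python
import csv
import random
from collections import Counter


def count_draws(n_draws, upper=99999, seed=None):
    """Count how often each integer in [0, upper] is drawn (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter()
    # range() is lazy on Python 3; on Python 2, use xrange() here instead,
    # otherwise the whole list of indices is built in memory first.
    for _ in range(n_draws):
        counts[rng.randint(0, upper)] += 1
    return counts


def write_counts(counts, path):
    """Write one 'number,count' row per distinct number (hypothetical helper)."""
    # newline="" is what the csv module recommends for files on Python 3.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for key, value in sorted(counts.items()):
            writer.writerow([key, value])
```

The dict (or Counter) never holds more than 100000 keys, so it stays small no matter how many draws you make; only the loop index needs to be generated lazily.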

Oh, and if you think 11 GB should be enough for that list: not in Python. Try sys.getsizeof(0) to see how many bytes an int takes.
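A rough way to check this yourself is sketched below, assuming CPython and Python 3 syntax. The helper name estimate_list_bytes is made up for the sketch, and the result is only an estimate: it counts one pointer per list slot plus one int object per item and ignores list over-allocation and the small-int cache.

```python
import struct
import sys


def estimate_list_bytes(n_items):
    """Rough memory estimate for a list of n_items distinct ints (hypothetical helper)."""
    pointer_bytes = struct.calcsize("P")  # size of one pointer on this build
    int_bytes = sys.getsizeof(12345)      # size of one int object on this interpreter
    return n_items * (pointer_bytes + int_bytes)


n = 1000000000
print("list of ints, estimated GB:", estimate_list_bytes(n) / 1e9)
# A lazy range object stays tiny no matter how long the range is.
print("range object, bytes:", sys.getsizeof(range(n)))
```

The exact figures depend on the interpreter version and platform, so run it on the machine in question rather than trusting a number from someone else's system.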

user2357112
  • Still though, assuming 8 bytes a number, that comes to about 7gb, which doesn’t explain the killed status. – xrisk Dec 07 '15 at 17:54
  • @RishavKundu: I was actually just about to add something about that to the answer. – user2357112 Dec 07 '15 at 17:55
  • According to your edit, that `range` would be around 22GB.. xD Using `xrange` now; it still hasn't finished, but it has not crashed either –  Dec 07 '15 at 18:01