2

I'm processing a huge number of combinations of items (from League of Legends), about 72 million, all of which are fed into a function that calculates how beneficial they are.

We're trying to find the best possible combination.

Ignoring the fact that there might be better ways, algorithmically speaking, to do this, can anyone tell me why I'm getting a memory error?

allpossiblei = itertools.combinations(items.keys(),5)
maxc = 0
i = 0
for combo in allpossiblei:
    icombo = [items[name] for name in combo]
    res, tcost = calcStats(icombo, 0.658,100,100)
    if res > maxc :
        maxc = res
        print str(res) + " " + str(res/tcost)
        print combo
        print float(i)/79208745.0
    if i % 500000 == 0:
        print str(float(i)/79208745.0) + "\n \n"
        gc.collect()
    i = i + 1

calcStats doesn't do anything except arithmetic using local variables.

This rapidly eats up 2gb+ of memory and quits in about 5 minutes. I thought itertools was supposed to provide a generator that wouldn't use up a ton of memory? I even threw in that gc.collect() statement but it doesn't seem to work. Any ideas?

Sushisource
  • 611
  • 6
  • 17
  • 2
    CPython uses refcounting as its primary GC scheme, so `gc.collect()` will only deal with circular references, meaning that in this case it shouldn't be doing anything (and is thus not necessary). – Chris Morgan Nov 24 '11 at 07:05
  • Provide all code that will make such problems. It should be ready to run, but not necessary same as yours as long it has same problems as yours.For me it looks like there is no problem in code you provided. – Luka Rahne Nov 24 '11 at 07:12
  • Here's the full code http://pastebin.com/E1MGhxxM – Sushisource Nov 24 '11 at 07:15
  • So you have 99 keys, and get `(99*98*97*96*95)/(5*4*3*2) == 71,523,144` (about 72 million) combinations of 5? This should execute fine: `for i in itertools.combinations(range(99), 5): pass`. – Eryk Sun Nov 24 '11 at 08:28

1 Answers1

0

Combinations creates a pool by consuming the whole iterator provided. There is no way around that. See the pseudocode in the docs for the function: http://docs.python.org/library/itertools.html#itertools.combinations

Armin Ronacher
  • 31,998
  • 13
  • 65
  • 69
  • So I'd have to make my own implementation to overcome this? – Sushisource Nov 24 '11 at 08:22
  • @Sushisource: You're passing it a list of keys, not an iterator. With all due respect to Armin, I don't think this is your problem. – Eryk Sun Nov 24 '11 at 08:32
  • Based on what we've been told, the size of such a "pool" would be only on the order of 100 items. – Karl Knechtel Nov 24 '11 at 11:21
  • @eryksun You're right. Changing items.keys() to items fixes the problem – Sushisource Nov 28 '11 at 05:39
  • @Sushisource: In Python 2.x `iteritems`, `iterkeys`, and `itervalues` return iterators instead of list copies. I'm not certain what your problem is/was unless the number of keys is orders of magnitude larger than I thought. – Eryk Sun Nov 28 '11 at 06:50