
I'm currently working through the problems on Project Euler, and so far I've come up with this code for a problem.

from itertools import combinations
import time

def findanums(n):
    l = []
    for i in range(1, n + 1):
        s = []
        for j in range(1, i):
            if i % j == 0:
                s.append(j)
        if sum(s) > i:
            l.append(i)
    return l

start = time.time() #start time

limit = 28123

anums = findanums(limit + 1) #abundant numbers (1..limit)
print "done finding abundants", time.time() - start

pairs = combinations(anums, 2)
print "done finding combinations", time.time() - start

sums = map(lambda x: x[0]+x[1], pairs)
print "done finding all possible sums", time.time() - start

print "start main loop"
answer = 0
for i in range(1,limit+1):
    if i not in sums:
        answer += i
print "ANSWER:",answer

When I run this I run into a MemoryError.

The traceback:

File "test.py", line 20, in <module>
    sums = map(lambda x: x[0]+x[1], pairs)

Based on what I could find on Google, I've tried to prevent it by disabling garbage collection, but to no avail. Am I approaching this the wrong way? In my head this feels like the most natural way to do it, and I'm at a loss at this point.
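
For reference, the garbage-collection attempt was nothing fancy, just roughly this at the top of the script (it made no difference):

import gc
gc.disable()  # turn automatic garbage collection off, as suggested by what I found on Google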

SIDE NOTE: I'm using PyPy 2.0 Beta2 (Python 2.7.4) because it is so much faster than any other Python implementation I've used, and Scipy/Numpy are over my head, as I'm still just beginning to program and I don't have an engineering or strong math background.

Jesse Neff

3 Answers


As Kevin mentioned in the comments, your algorithm might be wrong, but in any case your code is not optimized.

When working with very big lists, it is common to use generators. There is a famous, great Stack Overflow answer explaining the concepts of yield and generators - What does the "yield" keyword do in Python?
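
As a tiny illustration of the idea (nothing specific to your problem), a generator hands out values one at a time instead of building them all up front:

def count_up_to(n):
    i = 1
    while i <= n:
        yield i        # produce one value, then pause until the next one is requested
        i += 1

numbers = count_up_to(10 ** 6)   # no list of a million ints is ever built
print next(numbers)              # 1
print next(numbers)              # 2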

The problem is that your pairs = combinations(anums, 2) doesn't give you a generator but a large object, which consumes much more memory.

I changed your code to use the function below; since you iterate over the collection only once, you can evaluate the values lazily:

import itertools

def generator_sol(anums1, s):
    for comb in itertools.combinations(anums1, s):  # yield one combination at a time
        yield comb

Now, instead of pairs = combinations(anums, 2), which generates a large object, use:

pairs = generator_sol(anums, 2)

Then, instead of using the lambda I would use another generator:

sums_sol = (x[0]+x[1] for x in pairs)

Another tip: have a look at xrange, which is more suitable here. It doesn't build a list but returns an xrange object, which is more memory-efficient in many cases (such as here).
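
For instance, swapping range for xrange inside findanums keeps the behaviour identical while avoiding the throwaway lists (this applies to Python 2, which is what PyPy 2.0 implements):

def findanums(n):
    l = []
    for i in xrange(1, n + 1):    # xrange yields the numbers lazily
        s = []
        for j in xrange(1, i):
            if i % j == 0:
                s.append(j)
        if sum(s) > i:
            l.append(i)
    return l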

Ofiris

Let me suggest that you use generators. Try changing this:

sums = map(lambda x: x[0]+x[1], pairs)

to

sums = (a+b for (a,b) in pairs)

Ofiris's solution is also OK, but it implies that itertools.combinations returns a list, which is wrong. If you are going to keep solving Project Euler problems, have a look at the itertools documentation.
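
As a quick check (the exact repr will differ between CPython and PyPy), you can see that combinations hands back a lazy iterator rather than a list:

from itertools import combinations

pairs = combinations([12, 18, 20, 24], 2)   # a few abundant numbers, just as an example
print pairs        # <itertools.combinations object at 0x...>, not a list
print next(pairs)  # (12, 18) -- the pairs are produced one at a time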

Alex
  • What do you mean by `combinations` returns a list when it's wrong? – Jesse Neff Apr 18 '13 at 20:45
  • 1
    I said combination returns a `list` and its not correct, fixed it anyhow. – Ofiris Apr 18 '13 at 20:46
  • 1
    combinations is a generator thus you don't need to worry about `pairs = combinations(anums, 2)`, it's not memory hungry. And is very good practice :) – Alex Apr 18 '13 at 20:49
  • @Alex, I figured as much, it was the mapping of the sums of the tuples that gave me a memory error. – Jesse Neff Apr 19 '13 at 11:54

The issue is that anums is big, about 28000 elements long, so pairs must take 28000*28000*8 bytes = 6 GB. If you used numpy, you could cast anums as a numpy.int16 array, in which case the resulting pairs would be 1.5 GB, which is more manageable:

import numpy as np
# cast to a 1 x N int16 row vector
anums = np.array([anums], dtype=np.int16)
# compute the sums of all the pairs via broadcasted (outer) addition
pairs = (anums + anums.T).ravel()
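
From there, one way to finish would be something like the sketch below (untested, and it assumes the pairs array and limit from above; with int16, any sum above 32767 wraps around to a negative value, but those true sums all exceed the limit anyway, so they can simply be dropped):

# keep only the sums that can actually hit a number in 1..limit
in_range = pairs[(pairs > 0) & (pairs <= limit)]
# mark which numbers up to the limit are a sum of two abundant numbers
is_abundant_sum = np.zeros(limit + 1, dtype=bool)
is_abundant_sum[in_range] = True
# add up everything that was never marked
answer = np.arange(limit + 1)[~is_abundant_sum].sum()
print answer
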
Patrick Mineault
  • As I said in my post, I'm still quite green and numpy's magic is still out of my grasp at the moment, as I am still learning the normal Python libraries. But I appreciate this answer nonetheless, as it gives me a taste of what I'd be able to do once I learn enough to use numpy/scipy! – Jesse Neff Apr 18 '13 at 20:46