1

How do I (efficiently) loop through huge ranges in Python? Whenever I try to run something like-

for x in range(105, 10000000000000):
    #do stuff

it takes an eternity to complete. I've tried many things like iter, set, xrange but still no success. Is there any external module which can help me do that?

Community
  • 1
  • 1
ShuklaSannidhya
  • 8,572
  • 9
  • 32
  • 45
  • you may have to use py3x here, as both `xrange()` and `range()` will not work for `10000000000000`. – Ashwini Chaudhary Apr 13 '13 at 17:19
  • Well, you are indeed iterating over a huge range, and doing even miniscule work each iteration will take a while. What are you doing inside the loop that requires to be iterated that many times? – Joachim Isaksson Apr 13 '13 at 17:19
  • related: [`xrange(2**100)` -> OverflowError: long int too large to convert to int](http://stackoverflow.com/questions/1482480/xrange2100-overflowerror-long-int-too-large-to-convert-to-int) – jfs Apr 13 '13 at 17:21
  • 4
    Is this related to **google codeJam** somehow? – Ashwini Chaudhary Apr 13 '13 at 17:22
  • @AshwiniChaudhary Yeah, I failed in "Fair-and-square" Large Data set. My solution took more than 8 minutes. I did solve the small one though. I was wondering if there was a way to solve this in Python. I read many articles online saying that python is slow. – ShuklaSannidhya Apr 13 '13 at 17:31
  • Yes, it is possible. I did it with a `O(sqrt(N)+T)` algorithm, which reduces the problem size to `10^11`. And can be further improved to work in almost `O(T)` time. But I am still stuck at the 10^100 data set.Where `N` is the range and `T` the number of testcases :) – Ashwini Chaudhary Apr 13 '13 at 17:34
  • 1
    Google codejam will allow you to download anyone's code tomorrow(after system checks). You can add me there using my handle: **xonixMonty** – Ashwini Chaudhary Apr 13 '13 at 17:49

2 Answers2

4

Given the range, I think it's safe to say that almost regardless of what you do or how you do it, it's going to take a while.

To be more specific, you're executing approximately 1E+13 iterations of your loop. With decent code and a reasonable fast processor, you might be able to execute around 4E+9 instructions per second (e.g., 2 instructions per clock cycle at 2 GHz). That obviously isn't exact, but for the moment let's just go with it, and see where we end up.

So, even if we assume each iteration of the loop requires executing only one, single-cycle instruction, it's going to take approximately: 1E+13/4E+9 = 2.5E3 seconds = ~42 minutes. Given that you're doing some work inside the loop, and there's some overhead for the loop itself, we're clearly going to be execution more than one machine-code instruction per iteration. If we have to execute, say, 1000 instructions per iteration (may be a reasonable first approximation, given that you haven't told us anything about what you're doing), then we're looking at something like 600-700 hours for the loop to execute.

Bottom line: changing how you represent your range may help, but if you need to iterate across a range this large, there's no getting away from the fact that it's going to take a while.

Jerry Coffin
  • 476,176
  • 80
  • 629
  • 1,111
2

Assuming you have a 3GHz processor, and that you just need one cycle for each entry in the range, you would need ~3,333 seconds (~1 hour) to process it.

This leads me to think that there's somehting fundamentally wrong with what you're doing. Maybe you should restructure your problem.

Elmar Peise
  • 14,014
  • 3
  • 21
  • 40