
On Linux it takes 1.09171080828 secs.

On Windows it takes 2.14042000294 secs.

Code for the benchmark:

import time


def mk_array(num):
    # Build a list of the integers 1..num-1.
    return [x for x in xrange(1, num)]

def run():
    arr = mk_array(10000000)
    x = 0
    start = time.time()
    # Time only the reduce(), not the list construction.
    x = reduce(lambda x, y: x + y, arr)
    done = time.time()
    elapsed = done - start
    return elapsed

if __name__ == '__main__':
    times = [run() for x in xrange(0, 100)]
    avg = sum(times) / len(times)
    print(avg)

I am aware that the GIL makes Python scripts more or less single-threaded.

The Windows box is my Hyper-V host, but it should be beefy enough to run a single-threaded script at full bore: 12 cores of 2.93 GHz Intel X5670s, 72 GB of RAM, etc.

The Ubuntu VM has 4 cores and 8 GB of RAM.

Both are running Python 2.7.8 64-bit.

Why is Windows half as fast?

Edit: I've lopped off two zeros (100,000 elements instead of 10,000,000), and Linux finishes in 0.010593495369 seconds, Windows in 0.171899962425 seconds. Thanks everyone, curiosity satisfied.

alex

1 Answer


It is because of the size of a C `long` on Windows: a `long` is 32 bits on 64-bit Windows, while on 64-bit Unix it is 64 bits, so you hit arbitrary-precision arithmetic sooner, and those long objects are costlier to allocate and operate on.
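
A quick way to see the threshold (a minimal sketch, assuming Python 2 on both machines) is to check `sys.maxint`, the largest value a plain `int` (a C `long` underneath) can hold, and what happens just past it:

import sys

# LP64 Linux: sys.maxint == 2**63 - 1; LLP64 Windows: sys.maxint == 2**31 - 1,
# because a Python 2 int wraps a C long.
print(sys.maxint)

# One past sys.maxint, Python 2 silently promotes the value to the
# arbitrary-precision long type, which is slower to allocate and operate on.
print(type(sys.maxint))      # <type 'int'>
print(type(sys.maxint + 1))  # <type 'long'>

The running total in the reduce() above passes 2**31 within the first hundred thousand additions or so, so on Windows almost all of the work is done on longs, while on Linux it stays on the fast int path.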

Related question: Why are python's for loops so non-linear for large inputs

If you benchmark the xrange iteration alone, you will see a considerable difference.
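
For instance, a minimal sketch of such a benchmark (loop size and repeat count are arbitrary), run on both boxes:

import timeit

# Time just iterating the xrange, with no reduce()/sum() involved.
t = timeit.timeit('for _ in xrange(10**7): pass', number=10)
print(t)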

The reasoning behind Windows' use of the LLP64 model seems to be compatibility with 32-bit code:

Another alternative is the LLP64 model, which maintains compatibility with 32-bit code by leaving both int and long as 32-bit. "LL" refers to the "long long integer" type, which is at least 64 bits on all platforms, including 32-bit environments.

Taken from wiki/64-bit_computing
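
You can also confirm the data model from Python itself; a small sketch using ctypes (the sizes in the comments are what each platform reports):

import ctypes

# 64-bit Windows (LLP64): c_long is 4 bytes; 64-bit Linux (LP64): 8 bytes.
print(ctypes.sizeof(ctypes.c_long))

# long long is 8 bytes on both platforms.
print(ctypes.sizeof(ctypes.c_longlong))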

Padraic Cunningham
  • So the problem is that the 64-bit Windows version of Python uses `long` rather than `longlong`? Any idea why? – Harry Johnston Mar 01 '15 at 02:26
  • @HarryJohnston, see [Include/intobject.h](https://hg.python.org/cpython/file/ee879c0ffa11/Include/intobject.h#l23) for the definition of a `PyIntObject`. When the value exceeds the maximum value of a C `long`, the interpreter switches to using an arbitrary precision `PyLongObject`. Python 3 does away with `PyIntObject` and only uses arbitrary-precision integers. – Eryk Sun Mar 01 '15 at 02:33
  • I understand why Visual C++ defines a long to be 32-bits for both 32-bit and 64-bit code. What I don't understand is why the 64-bit version of Python uses long rather than longlong on the Windows platform, and that doesn't seem to be addressed in any of those links. It's not a big deal, though, I was just curious. It may be as simple as "because all the code already uses long and we don't want to change it all". :-) (I'd have thought `#define long longlong` would take care of it in one hit, but I guess that might have unwelcome side-effects.) – Harry Johnston Mar 01 '15 at 02:43
  • @HarryJohnston, that would have required rewriting a lot of code to use a conditionally defined `PY_LONG` macro. I don't think anyone bothered with such a major rewrite because core devs don't like working on Python 2. Most of the effort goes into Python 3 development, which no longer uses `PyIntObject`. – Eryk Sun Mar 01 '15 at 02:46
  • @eryksun: that sounds reasonable. Thanks. – Harry Johnston Mar 01 '15 at 02:47