
My project involves heavy looping over stocks and statistical calculations. It is written in Python 3. As the data gets bigger, the script has become quite slow. I tried Lua because of its reputation for speed and ran the tests below, with Python 2 included as a benchmark.

The test code is just a simple loop:

lua version

for i=1,100,1 do
    for j=1,100,1 do
        print(i*j)
    end
end

python version

for i in range(1,101):
    for j in range(1,101):
        print(i*j)

The results are as follows (I ran each a few times and picked the best for each group):

lua5.2.3: 0.461sec
python2.7.6: 0.429sec
python3.4: 0.85sec

What surprised me is that python2 is around 2x faster than python3.

Why, even with a simple loop?

I thought Python 3 was the future of Python, so I learned Python 3 from the beginning.

Do I really need to port my code back to Python 2, or is there any tweak I could make to the loops to improve their performance in Python 3?

inspectorG4dget
timeislove
  • How are you determining how long it takes to run? – Alexander O'Mara Sep 05 '14 at 03:17
  • If your goal is to make calculations go fast, why are you measuring `print`? – Robᵩ Sep 05 '14 at 03:21
  • Please take some time to fix the spelling. Words like **cuz** and **i** lower the quality of the question. – Yu Hao Sep 05 '14 at 03:24
  • Are you sure that python3.4 takes 0.85 seconds and not 0.085 seconds? – inspectorG4dget Sep 05 '14 at 03:26
  • My best guess is that it is because `print` is a language construct in python2, but a function in python3 with more complex features. Using `print` or anything that generates output in a performance test is not a good indication of how fast code is. – Alexander O'Mara Sep 05 '14 at 03:26
  • @AlexanderO'Mara: you might be onto something. Simply doing the multiplication (without `print`) is about an order of magnitude faster – inspectorG4dget Sep 05 '14 at 03:29
  • guys, pls note the posted code is TEST CODES, so don't judge if i use print function in the code, though i did need to print out something in the looping of my project. i used linux time to take the measurement. The point is why running exactly the same python code, python3 is 2x slower. it is not talking about 1/100 diff, it is X diff. – timeislove Sep 05 '14 at 03:41
  • @timeislove They are not judging... they are pointing out that the python3 version is slower BECAUSE of the print function... in python2 it is a statement and thus takes fewer resources to call, which is why it is faster. – SethMMorton Sep 05 '14 at 03:42
  • You could probably speed up your code by buffering your output into a string and calling the print statement once at the end, or flushing it every so many iterations. – Alexander O'Mara Sep 05 '14 at 03:45
  • i wonder the 2X diff in the performance of running the same code on python3 and python2. i did not do a lab test. in reality the loop and simple print will exist in my project code. – timeislove Sep 05 '14 at 03:46
  • @SethMMorton, thanks for pointing that out... what you said is a way to improve the TEST performance. but in reality i may need to provide some kind of progress status (printout to console) in the loop, also opening files (csv) and processing with pandas... so, could you advise on anything that i should avoid in python3 in comparison to python2? – timeislove Sep 05 '14 at 03:50
  • There is nothing I would say to avoid in PY3 vs. PY2. If the bottleneck in your program is printing status, then your program is probably efficient enough. BTW, printing to the screen is slow in any language (even C) because it is an operation controlled by the OS. – SethMMorton Sep 05 '14 at 03:54
  • @SethMMorton, not really about the print function you mentioned. i just replaced the print function with pass. the python3 finished the looping in 0.108sec, while python2 finished in 0.053sec, still 2X time diff. – timeislove Sep 05 '14 at 03:54
  • In the python2 version, use xrange, not range. range in PY2 and range in PY3 are not the same. Only then will you get an apples to apples comparison. – SethMMorton Sep 05 '14 at 03:56
  • @SethMMorton, so what is the PY3 equivalent of PY2 range? isn't it a simple list? – timeislove Sep 05 '14 at 04:02
  • thanks for pointing out the print and range implementation differences between PY2 and PY3. i then explicitly replaced the range() with a simple list, [1,2,3,4,5], with no printout, but PY2 still outperforms by 2X... so, is it down to a difference in the for-loop implementation? you may try that simple code on your computer. – timeislove Sep 05 '14 at 04:13
  • Last comment. You should really replace 101 with something large like 1000000 to see any real effect. – SethMMorton Sep 05 '14 at 04:49
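The buffering suggestion from the comments can be sketched roughly as below. This is only an illustration for Python 3, not code from the thread: the function names `print_each`, `print_buffered` and `timed` are made up here, and the output goes to an in-memory `io.StringIO` sink rather than a real console, so the numbers it prints say nothing definitive about terminal output.

```python
import io
import time


def print_each(n, out):
    """Write every product with its own print() call."""
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            print(i * j, file=out)


def print_buffered(n, out):
    """Collect the products first, then write them with a single call."""
    lines = []
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            lines.append(str(i * j))
    out.write("\n".join(lines) + "\n")


def timed(func, n):
    """Run func against an in-memory sink; return (seconds, text written)."""
    sink = io.StringIO()
    start = time.perf_counter()
    func(n, sink)
    return time.perf_counter() - start, sink.getvalue()


if __name__ == "__main__":
    n = 100  # same loop bounds as the test in the question
    for func in (print_each, print_buffered):
        seconds, text = timed(func, n)
        print("%s: %.4f s, %d lines" % (func.__name__, seconds, text.count("\n")))
```

Both functions produce the same text, so any timing difference comes from how often the output call is made. Whether that matters in a real project depends on where the output goes and how much of it there is.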

1 Answer


I've increased the loop counts and disabled the output (it's much slower when it's displayed).
I'm no Python expert, but you can speed up your Python code with a JIT compiler such as PyPy (though it is still slower than LuaJIT). Furthermore, this topic about python3 and python2 might be interesting for you too.

python

r=0
for i in range(1,10000):
    for j in range(1,10000):
        r=i*j

python3

$ time python3 loop.py 

real    0m16.612s
user    0m16.610s
sys     0m0.000s

python2

$ time python2 loop.py 

real    0m11.218s
user    0m11.190s
sys     0m0.007s

pypy

$ time pypy loop.py 

real    0m0.923s
user    0m0.900s
sys     0m0.020s

lua

local r=0
for i=1,10000,1 do
    for j=1,10000,1 do
        r=i*j
    end
end

lua 5.2.3

$ time lua loop.lua 

real    0m1.123s
user    0m1.120s
sys     0m0.000s

luajit

$ time luajit loop.lua 

real    0m0.074s
user    0m0.073s
sys     0m0.000s
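The `time` numbers above include interpreter startup. If you want to time only the loop from inside Python, something like the following sketch with the standard `timeit` module may help. The function name `nested_loop` and the smaller bound `n = 1000` are my own choices for illustration, not part of the measurements above.

```python
import timeit


def nested_loop(n):
    """Same shape as loop.py: multiply inside a nested loop, no output."""
    r = 0
    for i in range(1, n):
        for j in range(1, n):
            r = i * j
    return r


if __name__ == "__main__":
    # Assumption: a smaller bound than 10000 so the sketch finishes quickly.
    n = 1000
    # Five separate runs of one call each; keep the best, in the same spirit
    # as "pick the best for each group" in the question.
    best = min(timeit.repeat(lambda: nested_loop(n), number=1, repeat=5))
    print("best of 5: %.4f s for n=%d" % (best, n))
```

Running the same file under different interpreters gives per-interpreter numbers for the loop alone. Under Python 2, `range` here builds a full list, so swapping in `xrange` there would be the closer comparison, as noted in the comments.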
Community
Markus