In Python and Matlab, I wrote code that generates a matrix and populates it with a function of the indices. The Python code takes about 20 times longer to execute than the Matlab code. I wrote two Python functions that produce the same result; bWay() is based on this answer.
Here's the full Python code:
import numpy as np
from timeit import timeit
height = 1080
width = 1920
heightCm = 30
distanceCm = 70
centerY = height / 2 - 0.5
centerX = width / 2 - 0.5
# float() avoids Python 2 integer division, so constPart matches the Matlab value
constPart = height * heightCm / float(distanceCm)
def aWay():
    M = np.empty([height, width], dtype=np.float64)
    for y in xrange(height):
        for x in xrange(width):
            M[y, x] = np.arctan(pow((pow((centerX - x), 2) + pow((centerY - y), 2)), 0.5) / constPart)

def bWay():
    M = np.frompyfunc(
        lambda y, x: np.arctan(pow((pow((centerX - x), 2) + pow((centerY - y), 2)), 0.5) / constPart), 2, 1
    ).outer(
        np.arange(height),
        np.arange(width),
    ).astype(np.float64)
and here's the full Matlab code:
height = 1080;
width = 1920;
heightCm = 30;
distanceCm = 70;
centerY = height / 2 + 0.5;
centerX = width / 2 + 0.5;
constPart = height * heightCm / distanceCm;
M = zeros(height, width);
for y = 1 : height
    for x = 1 : width
        M(y, x) = atan(((centerX - x)^2 + (centerY - y)^2)^0.5 / constPart);
    end
end
Python execution time measured with timeit.timeit:
aWay() - 6.34s
bWay() - 6.68s
Matlab execution time measured with tic toc:
0.373s
To narrow it down, I measured the arctan, squaring, and looping times.
Python:
>>> timeit('arctan(3)','from numpy import arctan', number = 1000000)
1.3365135641797679
>>> timeit('pow(3, 2)', number = 1000000)
0.11460829719908361
>>> timeit('power(3, 2)','from numpy import power', number = 1000000)
1.5427879383046275
>>> timeit('for x in xrange(10000000): pass', number = 1)
0.18364813832704385
Matlab:
tic
for i = 1 : 1000000
    atan(3);
end
toc
Elapsed time is 0.179802 seconds.
tic
for i = 1 : 1000000
    3^2;
end
toc
Elapsed time is 0.044160 seconds.
tic
for x = 1:10000000
end
toc
Elapsed time is 0.034853 seconds.
In all three cases, the Python code took several times longer to run.
Is there anything I can do to improve the performance of this Python code?
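One direction that may be worth exploring, shown here only as a minimal sketch and not as a measured result: let NumPy compute the whole matrix with array operations instead of calling np.arctan once per element. The sketch assumes Python 3 (true division) and uses NumPy broadcasting; the name cWay is hypothetical and not part of the code above.

```python
import numpy as np

height = 1080
width = 1920
heightCm = 30
distanceCm = 70
centerY = height / 2 - 0.5
centerX = width / 2 - 0.5
constPart = height * heightCm / distanceCm  # true division under Python 3


def cWay():
    # Hypothetical vectorized variant of aWay()/bWay().
    # dy is a column vector of row offsets, shape (height, 1).
    dy = centerY - np.arange(height, dtype=np.float64)[:, np.newaxis]
    # dx is a row vector of column offsets, shape (1, width).
    dx = centerX - np.arange(width, dtype=np.float64)[np.newaxis, :]
    # Broadcasting expands the pair to (height, width);
    # np.hypot(dx, dy) computes sqrt(dx**2 + dy**2) element-wise.
    return np.arctan(np.hypot(dx, dy) / constPart)


M = cWay()
```

Because the per-element work happens inside NumPy's compiled loops, the Python-level call overhead seen in the micro-benchmarks above is paid only a handful of times rather than once per matrix entry. Timing it with timeit, as was done for aWay() and bWay(), would show whether it helps on a given machine.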