I recently ask a question on the optimization of a mask function in Matlab. I got two answers that help me a lot but it appears that all Matlab solutions seem, according to my timing, much slower than one of the Numpy solutions. The code for the different functions can be found in my previous question but to give an idea of what I am doing, I give the Numpy "loop" solution, which is of course not the fastest one but which may be the simplest to read:
def dealiasing2d(where, data):
nk, n0, n1 = data.shape
for i0 in xrange(n0):
for i1 in xrange(n1):
if where[i0, i1]:
data[:, i0, i1] = 0.
I obtain (with Matlab R2014b and a "basic" Numpy 1.9.1 linked with Blas and Lapack) (n0 = n1 = N
):
N = 500 ; nk = 500:
Method | time (s) | normalized
----------------|----------|------------
Numpy | 0.05 | 1.0
Numpy loop | 0.05 | 1.0
Matlab find | 0.74 | 14.8
Matlab bsxfun2 | 0.76 | 15.2
Matlab bsxfun | 0.78 | 15.6
Matlab loop | 0.78 | 15.6
Matlab repmat | 0.89 | 17.8
N = 500 ; nk = 100:
Method | time (s) | normalized
----------------|----------|------------
Numpy | 0.01 | 1.0
Numpy loop | 0.03 | 3.0
Matlab find | 0.15 | 13.6
Matlab bsxfun2 | 0.15 | 13.6
Matlab bsxfun | 0.16 | 14.5
Matlab loop | 0.16 | 14.5
Matlab repmat | 0.18 | 16.4
N = 2000 ; nk = 10:
Method | time (s) | normalized
----------------|----------|------------
Numpy | 0.02 | 1.0
Matlab find | 0.23 | 13.8
Matlab bsxfun2 | 0.23 | 13.8
Matlab bsxfun | 0.26 | 15.6
Matlab repmat | 0.28 | 16.8
Matlab loop | 0.34 | 20.4
Numpy loop | 0.42 | 25.1
It seems to me that these results are very strange. For me, Numpy and Matlab are quite similar in terms of scientific computing so the performance should be similar, while here there is more than a factor 10! So my first guess is that there is something wrong in the way I compare the two languages. Another possibility could be a problem with my Matlab setup, but I don't understand why. Or a real deep difference between Matlab and Numpy?
Can anyone time these functions to verify these results? Do you have an idea why Matlab seems so much slower than Python in this simple case?
To time the Matlab functions, I use a file with :
N = 500;
n0 = N;
n1 = N;
nk = 500;
disp(['N = ', num2str(N), ' ; nk = ', num2str(nk)])
where = false([n1, n0]);
where(1:100, 1:100) = 1;
data = (5.+1i)*ones([n1, n0, nk]);
disp('time dealiasing2d_loops:')
time = timeit(@() dealiasing2d_loops(where, data));
disp([' ', num2str(time), ' s'])
disp('time dealiasing2d_find:')
time = timeit(@() dealiasing2d_find(where, data));
disp([' ', num2str(time), ' s'])
disp('time dealiasing2d_bsxfun:')
time = timeit(@() dealiasing2d_bsxfun(where, data));
disp([' ', num2str(time), ' s'])
disp('time dealiasing2d_bsxfun2:')
time = timeit(@() dealiasing2d_bsxfun2(where, data));
disp([' ', num2str(time), ' s'])
disp('time dealiasing2d_repmat:')
time = timeit(@() dealiasing2d_repmat(where, data));
disp([' ', num2str(time), ' s'])
I measure the performance of the Python functions with
from __future__ import print_function
import numpy as np
from timeit import timeit, repeat
import_lines = {
'numpy_bad': 'from dealiasing_numpy_bad import dealiasing2d as dealiasing',
'numpy': 'from dealiasing_numpy import dealiasing'}
tools = import_lines.keys()
time_approx_one_bench = 5.
setup = """
import numpy as np
N = 500
n0, n1 = N, N
nk = 500
where = np.zeros((n0, n1), dtype=np.uint8)
where[0:100, 0:100] = 1
data = (5.+1j)*np.ones((nk, n0, n1), dtype=np.complex128)
"""
exec(setup)
print('n0 = n1 = {}, nk = {}'.format(N, nk))
print(13*' ' + 'min mean')
durations = np.empty([len(tools)])
for it, tool in enumerate(tools):
setup_tool = import_lines[tool] + setup
duration = timeit(setup_tool + 'dealiasing(where, data)', number=1)
nb_repeat = int(round((time_approx_one_bench - duration)/duration))
nb_repeat = max(1, nb_repeat)
ds = np.array(repeat('dealiasing(where, data)',
setup=setup_tool, number=1, repeat=nb_repeat))
duration = ds.min()
print('{:11s} {:8.2e} s ; {:8.2e} s'.format(
tool.capitalize() + ':', duration, ds.mean()))
durations[it] = duration
fastest = tools[durations.argmin()].capitalize()
durations = durations / durations.min()
print('Durations normalized by the fastest method (' + fastest + '):')
for it, tool in enumerate(tools):
print('{:11s} {:8.3f}'.format(tool.capitalize() + ':', durations[it]))