I am working on a scientific cluster that was recently upgraded by the administrator, and now my code is extremely slow, whereas it used to run at a reasonable speed. I am using Python 3.4.
The way this kind of thing works is the following: I have to guess what the administrator may have changed and then ask him to make the appropriate changes, because if I ask him a direct question we will not get anywhere.
So, I have run my code with a profiler and found that a few routines are called many times (a rough sketch of how I profiled is shown after the list). These routines are:
- built-in method array (called ~10^5 times, execution time 0.003 s)
- sort method of numpy.ndarray (called ~5000 times, execution time 0.03 s)
- uniform method of mtrand.RandomState (called ~2000 times, execution time 0.03 s)
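For reference, this is roughly how I collected the per-routine numbers above; a minimal sketch using the standard-library profiler, where run_simulation() is only a placeholder for my real entry point:

    import cProfile
    import pstats

    # Profile the driver function (run_simulation is a placeholder for my
    # actual entry point) and store the raw statistics in a file.
    cProfile.run('run_simulation()', 'profile.out')

    # Print the 10 most expensive entries, sorted by cumulative time.
    stats = pstats.Stats('profile.out')
    stats.sort_stats('cumulative').print_stats(10)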
My guess is that some of these libraries were parallelized in the previously installed version of Python, for example by being linked to MPI-parallelized or multi-threaded math kernel libraries.
I would like to know if my guess is correct or if I have to think of something else, because my code itself has not changed.
The routines quoted here are the most relevant, because they account for 85% of the total time; in particular, array alone takes 55% of the total time. The performance of my code has degraded by a factor of 10. Before talking to the system manager I would like to confirm whether these routines do have a parallel version.
Of course I cannot test my code on both the new and the old configuration of the cluster, because the old configuration is gone. But I can see that on this cluster numpy.array takes 8 minutes, while on the other cluster I have access to it takes 2 seconds. From top I can see that memory usage is always very low (~0.1%) while a single CPU is pegged at 100%.
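To compare the two clusters directly, I can isolate the hot routines in a tiny benchmark and run the same script on both machines. This is only a sketch; the sizes are arbitrary, not the ones from my real run:

    import timeit

    setup = "import numpy as np; data = [float(i) for i in range(10000)]"

    # Time the three hot spots from the profile in isolation, so the numbers
    # can be compared between the two clusters.
    print("np.array:", timeit.timeit("np.array(data)", setup=setup, number=1000))
    print("ndarray.sort:", timeit.timeit("np.array(data).sort()", setup=setup, number=1000))
    print("uniform:", timeit.timeit("np.random.uniform(size=10000)", setup=setup, number=1000))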
In [3]: numpy.__config__.show()
lapack_info:
    libraries = ['lapack']
    library_dirs = ['/usr/lib64']
    language = f77
atlas_threads_info:
    libraries = ['satlas']
    library_dirs = ['/usr/lib64/atlas']
    define_macros = [('ATLAS_WITHOUT_LAPACK', None)]
    language = c
    include_dirs = ['/usr/include']
blas_opt_info:
    libraries = ['satlas']
    library_dirs = ['/usr/lib64/atlas']
    define_macros = [('ATLAS_INFO', '"\\"3.10.1\\""')]
    language = c
    include_dirs = ['/usr/include']
atlas_blas_threads_info:
    libraries = ['satlas']
    library_dirs = ['/usr/lib64/atlas']
    define_macros = [('ATLAS_INFO', '"\\"3.10.1\\""')]
    language = c
    include_dirs = ['/usr/include']
openblas_info:
    NOT AVAILABLE
lapack_opt_info:
    libraries = ['satlas', 'lapack']
    library_dirs = ['/usr/lib64/atlas', '/usr/lib64']
    define_macros = [('ATLAS_WITHOUT_LAPACK', None)]
    language = f77
    include_dirs = ['/usr/include']
lapack_mkl_info:
    NOT AVAILABLE
blas_mkl_info:
    NOT AVAILABLE
mkl_info:
    NOT AVAILABLE
ldd /usr/lib64/python3.4/site-packages/numpy/core/_dotblas.cpython-34m.so
linux-vdso.so.1 => (0x00007fff46172000)
libsatlas.so.3 => /usr/lib64/atlas/libsatlas.so.3 (0x00007f0d941a0000)
libpython3.4m.so.1.0 => /lib64/libpython3.4m.so.1.0 (0x00007f0d93d08000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f0d93ae8000)
libc.so.6 => /lib64/libc.so.6 (0x00007f0d93728000)
libgfortran.so.3 => /lib64/libgfortran.so.3 (0x00007f0d93400000)
libm.so.6 => /lib64/libm.so.6 (0x00007f0d930f8000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f0d92ef0000)
libutil.so.1 => /lib64/libutil.so.1 (0x00007f0d92ce8000)
/lib64/ld-linux-x86-64.so.2 (0x00007f0d950e0000)
libquadmath.so.0 => /lib64/libquadmath.so.0 (0x00007f0d92aa8000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f0d92890000)
NumPy is already linked to ATLAS, and I see a link to libpthread.so, so I assume it is already multithreaded; is that correct?
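One rough way to check this is to time something that definitely goes through the linked BLAS, such as a large matrix product, and watch top while it runs: if the ATLAS build is threaded, several cores should be busy. A sketch (the matrix size is arbitrary):

    import time

    import numpy as np

    # A large matrix product is dispatched to the linked BLAS (ATLAS here),
    # so a threaded build should keep several cores busy while this runs.
    n = 4000
    a = np.random.rand(n, n)
    b = np.random.rand(n, n)

    start = time.time()
    np.dot(a, b)
    print("dot of two {0}x{0} matrices: {1:.2f} s".format(n, time.time() - start))

Of course this only exercises the BLAS path, not the array/sort/uniform calls from the profile.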
On the other hand, after I updated numpy from 1.8.2 to 1.9.2, the array method only takes 5 s instead of 300 s. I think this is probably the reason for my code slowing down (maybe the system administrator downgraded the numpy version? who knows!).
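In any case, before contacting the administrator I will compare the two installations with a quick check that prints the version and build configuration on each cluster (nothing here is specific to my code):

    import numpy as np

    # Print the numpy version and build configuration so the two
    # installations can be compared line by line.
    print("numpy", np.__version__)
    np.__config__.show()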