
I am doing some numerical simulations of quantum computation, and I wish to find the eigenvectors of a big Hermitian matrix (~2^14 rows/columns).

I am running on a 24-core/48-thread Xeon machine. The code was originally written with the help of the QuTiP library. I found out that its eigenstates() function only uses a single thread on my machine, so I am trying to find a faster way to do this.

I tried the scipy.linalg eig() and eigh() functions as well as the scipy.sparse.linalg eigs() and eigsh() functions, but both seem slower than the function built into QuTiP.

I've seen suggestions that I might get some speedup from slepc4py; however, the package's documentation seems very lacking, and I can't figure out how to convert a complex NumPy array into a SLEPc matrix.

from petsc4py import PETSc  # B is a NumPy array of complex type

A = PETSc.Mat().create()
A[:, :] = B[:, :]
# TypeError: Cannot cast array data from dtype('complex128') to dtype('float64')
# according to the rule 'safe'
oyon
  • Welcome to Stack Overflow! Your question seems similar to http://stackoverflow.com/questions/29525041/petsc4py-creating-aij-matrix-from-csc-matrix-results-in-typeerror : you will have to recompile PETSc and SLEPc with complex scalars and then build and install petsc4py and slepc4py. If you are only interested in low-energy pure quantum states, look at the EPS_SMALLEST_MAGNITUDE option of EPSSetWhichEigenpairs() and at EPSSetDimensions(), in combination with an EPSType such as EPSARNOLDI or EPSLANCZOS (see the sketch after these comments). – francis Feb 02 '17 at 17:45
  • By the way, [scipy.sparse.linalg.eigsh](https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.sparse.linalg.eigsh.html) may also prove helpful... – francis Feb 02 '17 at 18:10
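Following the suggestion in the comment above, here is a minimal sketch of what the slepc4py path might look like, assuming PETSc/SLEPc were rebuilt with complex scalars (--with-scalar-type=complex); B stands for the dense complex NumPy array from the question, and nev=10 is a placeholder:

import numpy as np
from petsc4py import PETSc
from slepc4py import SLEPc

# B: dense complex Hermitian NumPy array (stand-in for the 2^14 x 2^14 matrix)
n = B.shape[0]
A = PETSc.Mat().createDense([n, n], array=B)
A.assemble()

eps = SLEPc.EPS().create()
eps.setOperators(A)
eps.setProblemType(SLEPc.EPS.ProblemType.HEP)              # Hermitian eigenproblem
eps.setWhichEigenpairs(SLEPc.EPS.Which.SMALLEST_MAGNITUDE)  # per the comment's suggestion
eps.setDimensions(nev=10)                                   # number of eigenpairs wanted
eps.setType(SLEPc.EPS.Type.LANCZOS)
eps.solve()

evals = [eps.getEigenvalue(i) for i in range(eps.getConverged())]
# eps.getEigenvector() / eps.getEigenpair() retrieve the corresponding eigenvectors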

2 Answers


The eigensolver in QuTiP uses the SciPy eigensolvers. How many threads are used depends on the BLAS library that SciPy is linked against, as well as on whether you are using the sparse or the dense solver. In the dense case, the eigensolver will use multiple cores if the underlying BLAS takes advantage of them (e.g. Intel MKL). The sparse solver mostly uses sparse matrix-vector operations, which are memory-bandwidth limited and thus most efficient on a single core. If you want all of the eigenvalues, then you are basically stuck with the dense solvers. However, if you need only a few, such as the lowest few eigenstates, then sparse is the way to go.
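As a rough illustration of the two paths in SciPy (the random Hermitian matrix H, its size, and k=5 are stand-ins for the actual problem):

import numpy as np
from scipy.linalg import eigh          # dense solver: all eigenpairs, BLAS-threaded
from scipy.sparse.linalg import eigsh  # sparse/ARPACK solver: a few eigenpairs

# Stand-in Hermitian matrix (the real problem would be the 2^14 x 2^14 Hamiltonian).
n = 2 ** 10
rng = np.random.default_rng(0)
H = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = (H + H.conj().T) / 2

# Dense path: every eigenvalue/eigenvector; threading comes from the linked BLAS (e.g. MKL).
evals_all, evecs_all = eigh(H)

# Sparse path: only the k algebraically smallest eigenpairs (lowest-energy states).
evals_low, evecs_low = eigsh(H, k=5, which='SA')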

Paul Nation
  • Hi Paul, thanks for the fast response. numpy.show_config() gives blas_mkl_info: define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)], library_dirs = ['/cs/labs/doria/oryonatan/anaconda3/envs/qutip/lib'], include_dirs = ['/cs/labs/doria/oryonatan/anaconda3/envs/qutip/include'], libraries = ['mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'iomp5', 'pthread'], so I see that MKL is installed. However, when using the eig functions I still see in top that only one thread is being used. Is that normal? When I tried MATLAB's eig function, all the cores were involved. – oyon Feb 05 '17 at 10:29
  • Answering myself: it seems QuTiP does not set MKL to use the maximum number of cores. Manually setting the MKL thread count solved the problem: `import ctypes; mkl_rt = ctypes.CDLL('libmkl_rt.so'); mkl_rt.mkl_set_num_threads(ctypes.byref(ctypes.c_int(48)))`. This gave me a big performance boost. – oyon Feb 05 '17 at 11:53

I ended up finding a simpler way to use all the cores: it seems QuTiP didn't tell MKL to use all of them. In my Python code, I added:

import ctypes

# Load the Intel MKL runtime and raise its thread limit explicitly.
mkl_rt = ctypes.CDLL('libmkl_rt.so')
mkl_get_max_threads = mkl_rt.mkl_get_max_threads  # handle to query the current limit
# mkl_set_num_threads expects a pointer to an int (Fortran-style calling convention).
mkl_rt.mkl_set_num_threads(ctypes.byref(ctypes.c_int(48)))

This forced Intel MKL to use all the cores and gave me a nice speedup.
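If you want to verify that the new limit took effect, a small sketch reusing the mkl_get_max_threads handle from the snippet above should work:

# Query the limit back; the getter takes no arguments and returns a plain C int.
mkl_get_max_threads.restype = ctypes.c_int
print(mkl_get_max_threads())  # expected to report 48 after the call above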

(answer from question)

oyon
  • If you're using the Anaconda Python distribution, there are convenience functions in the mkl module (see the sketch after these comments). – Paul Nation Feb 05 '17 at 13:50
  • I am using Anaconda. Are you referring to https://docs.continuum.io/mkl-service/ ? I tried `import mkl` but got an `ImportError`. – oyon Feb 05 '17 at 14:28
  • Huh. It has always seemed to work for me. On my install, MKL also uses all threads automatically, and it does the same on several computers at work. Not sure what the issue could be. – Paul Nation Feb 05 '17 at 14:30
  • It might be that I am running on a Slurm cluster where the CPU allocation is dynamic. Maybe when Anaconda was installed there were fewer cores available. – oyon Feb 05 '17 at 16:43
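For reference, the mkl-service module mentioned in the comments above exposes the same knobs without ctypes; a minimal sketch, assuming the package imports cleanly:

import mkl  # the mkl-service package shipped with some Anaconda installs

print(mkl.get_max_threads())  # current thread limit
mkl.set_num_threads(48)       # use all hardware threads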