How to speed up numpy.dot for a neural network? (Result is a matrix with about 4.5 million elements)

Question

I'm programming a neural network in Python (3.6). The part where I update the weights contains a matrix * vector calculation with numpy.dot().

The matrix has a size of 4500x1024 and the vector a size of 1024. The result is a matrix with around 4.5 million elements.

This calculation has to be done a few thousand times (once per iteration), but it is really slow: 100 iterations already take a few minutes to complete.
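To narrow down where the time goes, the calculation can be timed in isolation. The sketch below uses the shapes from the question with random placeholder data. Note that a 4500x1024 matrix times a 1024-element vector gives only 4500 values; a result with about 4.5 million elements would match an outer product instead, so the `delta` vector here is a hypothetical error term and only a guess at what the weight update looks like.

```python
import time
import numpy as np

rng = np.random.default_rng(0)

# Shapes taken from the question; the values are random placeholder data.
W = rng.standard_normal((4500, 1024))   # weight matrix
x = rng.standard_normal(1024)           # input vector
delta = rng.standard_normal(4500)       # hypothetical error term (assumption)

y = np.dot(W, x)            # matrix-vector product, shape (4500,)
grad = np.outer(delta, x)   # outer product, shape (4500, 1024)


def time_call(fn, repeats=20):
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


t_dot = time_call(lambda: np.dot(W, x))
t_outer = time_call(lambda: np.outer(delta, x))

print("dot result shape:", y.shape)
print("outer result shape:", grad.shape, "elements:", grad.size)
print(f"dot: {t_dot * 1e3:.3f} ms, outer: {t_outer * 1e3:.3f} ms")
```

If a single call turns out to take far less time than one training iteration, the bottleneck is probably somewhere else in the loop rather than in the BLAS call itself.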

I've already installed Intel's MKL (the full package from their website) in the hope of speeding up the calculation with numpy, but there is no real difference in speed. Maybe it's not installed correctly?

... hmm, while I'm writing this post, I see that blas_mkl_info is not available and OpenBLAS still seems to be used? ...

    blas_mkl_info:
      NOT AVAILABLE
    blis_info:
      NOT AVAILABLE
    openblas_info:
      library_dirs = ['C:\projects\numpy-wheels\numpy\build\openblas']
      libraries = ['openblas']
      language = f77
      define_macros = [('HAVE_CBLAS', None)]
    blas_opt_info:
      library_dirs = ['C:\projects\numpy-wheels\numpy\build\openblas']
      libraries = ['openblas']
      language = f77
      define_macros = [('HAVE_CBLAS', None)]
    lapack_mkl_info:
      NOT AVAILABLE
    openblas_lapack_info:
      library_dirs = ['C:\projects\numpy-wheels\numpy\build\openblas']
      libraries = ['openblas']
      language = f77
      define_macros = [('HAVE_CBLAS', None)]
    lapack_opt_info:
      library_dirs = ['C:\projects\numpy-wheels\numpy\build\openblas']
      libraries = ['openblas']
      language = f77
      define_macros = [('HAVE_CBLAS', None)]
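A listing like the one above can be produced with `np.show_config()`, which reports the BLAS/LAPACK libraries numpy was built against. The small helper below (`numpy_build_config` is a name made up for this sketch) captures that output as a string so it can be saved or compared before and after reinstalling numpy. The exact layout of the text depends on the numpy version.

```python
import contextlib
import io

import numpy as np


def numpy_build_config():
    """Capture the text that np.show_config() prints to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        np.show_config()
    return buffer.getvalue()


config_text = numpy_build_config()
print("numpy version:", np.__version__)
print(config_text)
```

In the output, the sections that list a library (rather than `NOT AVAILABLE`) show which BLAS implementation this particular numpy build is linked to.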

=================================================================

OK, now I have installed numpy with MKL correctly, following this post: How to install numpy+mkl for python 2.7 on windows 64 bit?

The performance increased a little, but I'm still not happy with it. Are there any other improvements I can make?

`numpy` won't know that you installed MKL and automatically start using it, if that's what you were assuming. — juanpa.arrivillaga, Feb 03 '19 at 19:12
[Anaconda Python](https://anaconda.org/) natively ships with MKL as the underlying BLAS library. I'm not sure how much of a performance boost this would give you, but it is worth a shot. Also, I second @soobus' suggestion to try and use sparse matrices, if your data is sparse enough (density < 5% as a rule of thumb), and only if your vector is also somewhat sparse. — dennlinger, Feb 04 '19 at 08:35
I've never used sparse matrices before. Is there also a way to recreate the full matrix from a sparse matrix? Something like this: full matrix -> sparse matrix -> full matrix — Dennis, Feb 06 '19 at 11:21
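The round trip asked about in the last comment can be sketched with `scipy.sparse`. This is a minimal example under assumptions: the matrix content is placeholder data that is mostly zeros, and CSR is just one of several sparse formats that would work here.

```python
import numpy as np
from scipy import sparse

# Placeholder data: a mostly-zero matrix with the shape from the question.
dense = np.zeros((4500, 1024))
dense[::50, ::20] = 1.0

# full matrix -> sparse matrix
sp = sparse.csr_matrix(dense)

# sparse matrix -> full matrix
restored = sp.toarray()

# Fraction of non-zero entries, to compare against the rule of thumb above.
density = sp.nnz / (dense.shape[0] * dense.shape[1])

# The sparse matrix can be multiplied with a dense vector directly.
x = np.ones(1024)
y_sparse = sp @ x
y_dense = dense @ x

print("density:", density)
print("round trip is exact:", np.array_equal(restored, dense))
print("products agree:", np.allclose(y_sparse, y_dense))
```

Converting back and forth on every iteration costs time of its own, so this is likely to pay off only if the matrix stays sparse across many multiplications.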
