
I need the fastest BLAS package for heavy matrix multiplication. I'm currently using the BLAS included with the Armadillo library.

I've done some research and it pointed to OpenBLAS.

After some testing, it didn't show any improvement. Any thoughts?

Artur D
  • You will always be trapped by the exponential number of operations required. Vectorization helps. Sometimes threading/OpenMP loop parallelism can help. Knowing the structure of the matrix, e.g., triangular, can help. In general, no, you're trapped by the number of operations needed. – scooter me fecit Mar 21 '16 at 17:55
  • @ScottM: Exponential? It's not even cubic. Exponential is faster than _any_ polynomial function. – MSalters Mar 21 '16 at 21:30
  • Can you tell us what testing you did? I am using MKL BLAS together with armadillo and that shows huge improvements, mostly due to (OpenMP) parallelization though. Also have a look at [this thread](http://stackoverflow.com/questions/17639155/fast-lapack-blas-for-matrix-multiplication) (a timing sketch along these lines follows the comments). – Darkdragon84 Mar 22 '16 at 09:25
  • @MSalters: Anything raised to a power is exponential, even if the exponent is constant (as opposed to linear or asymptotic). – scooter me fecit Mar 22 '16 at 16:58
  • @Darkdragon84: Many years of experience in HPC. Similar issues arise in computing the FFT. You can always achieve better performance than the naive implementation, but there is a limit because the structure of the data in memory or processor throughput imposes one. – scooter me fecit Mar 22 '16 at 17:02
  • @ScottM: No, that's called polynomial (of which cubic is a special case). Exponential means that the variable appears in the exponent. x^2 is quadratic, 2^x is exponential. – MSalters Mar 22 '16 at 22:28
  • There is no BLAS included in Armadillo. See [the documentation](http://arma.sourceforge.net/download.html): _Before installing Armadillo, it's recommended to install LAPACK, BLAS and ATLAS, along with the corresponding development/header files. For faster performance, instead of using standard BLAS we recommend using the multi-threaded OpenBLAS library._ – Dirk Eddelbuettel Mar 23 '16 at 01:51
  • I think there is actually. That's why the documentation (that you quote) says _it's_ **recommended** _to install LAPACK, BLAS, ..._, but not necessary. – Darkdragon84 Mar 23 '16 at 03:11
  • There is a fallback. You don't want that. – Dirk Eddelbuettel Mar 23 '16 at 03:13
  • @MSalters: Fair point, I shouldn't answer questions when sleep deprived. It doesn't change the underlying issue that you can't escape the O(n^3) operations, you can only amortize them. Strassen's method is only applicable for really large matrices (IIRC, larger than 10,000 elements or thereabouts). – scooter me fecit Mar 23 '16 at 18:30
  • For N < 10,000, matrix multiplication is O(1) anyway ;) – MSalters Mar 23 '16 at 19:44
  • @MSalters: You're talking about the Yuster/Zwick paper? Those are fairly special matrices. But now we're really going off topic for an on-hold SO post. – scooter me fecit Mar 23 '16 at 22:22
  • @ScottM: No, the smiley was there because O(N^3) for N<10^4 is just O(10^12) which formally is just O(1) – MSalters Mar 24 '16 at 08:10
  • @MSalters: "This is abuse. You want next door for an argument." :-) – scooter me fecit Mar 24 '16 at 22:49

1 Answer


Be sure you're using the 64-bit package and that you have linked it into Armadillo.
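
As a minimal sketch (assuming OpenBLAS is installed in a standard location; the file name is made up): the usual way to guarantee Armadillo calls OpenBLAS directly is to define `ARMA_DONT_USE_WRAPPER` and link `-lopenblas` yourself:

```cpp
// link_check.cpp -- hypothetical sanity check that the OpenBLAS link works.
// Build: g++ -O2 link_check.cpp -o link_check -lopenblas
#define ARMA_DONT_USE_WRAPPER   // bypass Armadillo's run-time wrapper library
#include <armadillo>
#include <iostream>

int main() {
    std::cout << "Armadillo " << arma::arma_version::as_string() << std::endl;
    arma::mat A(500, 500, arma::fill::randu);
    arma::mat B = A * A;        // with the wrapper bypassed, this is OpenBLAS's dgemm
    std::cout << "B(0,0) = " << B(0, 0) << std::endl;  // touch the result
    return 0;
}
```

If this compiles and links without `-larmadillo`, the multiply is going straight to OpenBLAS rather than to a fallback BLAS.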