13

I need to do matrix multiplication, and I'm looking for a library that can do it fast. I'm using the Visual C++ 2008 compiler and I have a Core i7 860, so if the library is optimized for my configuration, that's perfect.

rubenvb
user558209

8 Answers

10

FWIW, Eigen 3 uses threads (OpenMP) for matrix products (in reply to the statement above about Eigen not using threads).
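
As a small illustration (assuming Eigen 3 and an OpenMP-enabled build, e.g. compiling with /openmp in MSVC), the thread count used for large products can be controlled with Eigen::setNbThreads():

#include <iostream>
#include <Eigen/Dense>

int main()
{
    Eigen::setNbThreads(4);   // let Eigen use 4 threads for large products (needs OpenMP)

    Eigen::MatrixXd A = Eigen::MatrixXd::Random(2000, 2000);
    Eigen::MatrixXd B = Eigen::MatrixXd::Random(2000, 2000);
    Eigen::MatrixXd C = A * B;            // this product runs in parallel

    std::cout << C(0, 0) << std::endl;    // use the result so the product isn't optimized away
    return 0;
}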

Benoit Jacob
8

BLAS is a de facto Fortran standard for all basic linear algebra operations (essentially multiplications of matrices and vectors). There are numerous implementations available. For instance:

  • ATLAS is free and supposedly self-optimizing. You need to compile it yourself though.
  • Goto BLAS is maintained by Kazushige Goto at TACC. He is very good at squeezing the last bit of performance out of modern processors. It is only for academic use though.
  • Intel MKL provides optimised BLAS for Intel processors. It is not free, even for academic use.

Then, you may want to use a C++ wrapper, for instance boost::ublas.

If you program on distributed systems, there are PBLAS and ScaLAPACK which enable the use of message passing for distributed linear algebra operations. On a multicore machine, usually implementations of BLAS (at least Intel MKL) use threads for large enough matrices.

If you want more advanced linear algebra routines (eigenvalues, linear systems, least squares, ...), then there is the other de facto Fortran standard, LAPACK. To my knowledge, there is nothing that integrates it elegantly with C++ other than calling the bare Fortran routines; you have to write some wrappers to hide the Fortran calls and provide a sound, type-checked interface.
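
For illustration, here is a minimal sketch of a product through the CBLAS C interface (ATLAS, GotoBLAS and MKL all provide one); the exact header name and link flags depend on the implementation you pick:

#include <vector>
#include <cblas.h>   // MKL users would include mkl.h instead

int main()
{
    const int n = 512;
    std::vector<double> A(n * n, 1.0), B(n * n, 2.0), C(n * n, 0.0);

    // C = 1.0 * A * B + 0.0 * C, all matrices stored row-major with leading dimension n
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                n, n, n,
                1.0, &A[0], n,
                     &B[0], n,
                0.0, &C[0], n);
    return 0;
}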

Alexandre C.
5

Look into Eigen. It should have all you need.
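
A minimal sketch of a dense product with Eigen (it is header-only, so you just add it to your include path; SSE2 vectorization kicks in automatically when available):

#include <iostream>
#include <Eigen/Dense>

int main()
{
    Eigen::MatrixXd A = Eigen::MatrixXd::Random(1000, 1000);
    Eigen::MatrixXd B = Eigen::MatrixXd::Random(1000, 1000);

    Eigen::MatrixXd C = A * B;            // dense product

    std::cout << C(0, 0) << std::endl;
    return 0;
}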

rubenvb
  • Eigen is only good at small matrices. It does not use threads for larger matrices. But it uses SSE2 when available. – Alexandre C. Dec 30 '10 at 15:23
  • Well, their own benchmark shows different... Heck, threading can be user implemented by doing block products if you really need it. – rubenvb Dec 30 '10 at 16:24
3

I have had good experience with Boost's uBLAS. It's a nice option if you're already using Boost.
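
For reference, a small sketch of a product with uBLAS; note that the product goes through prod() rather than operator*:

#include <iostream>
#include <boost/numeric/ublas/matrix.hpp>
#include <boost/numeric/ublas/io.hpp>

int main()
{
    using namespace boost::numeric::ublas;

    matrix<double> A(3, 3), B(3, 3);
    for (unsigned i = 0; i < A.size1(); ++i)
        for (unsigned j = 0; j < A.size2(); ++j)
        {
            A(i, j) = double(i + j);
            B(i, j) = (i == j) ? 1.0 : 0.0;   // identity
        }

    matrix<double> C = prod(A, B);            // matrix product
    std::cout << C << std::endl;
    return 0;
}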

Pedro d'Aquino
  • It uses whatever BLAS library is installed on your system. Intel provides one which uses threads and vector instructions: google for Intel MKL. It is not free though. – Alexandre C. Dec 30 '10 at 15:11
  • I've had a really dreadful experience with Boost's uBLAS. It's entirely unintuitive and difficult to figure out. For example, how am I supposed to know that vector-matrix multiplication is done using `prod()` -- why not a `*` operator? Also, I can't even multiply two vectors. – Dmitri Nesteruk Nov 20 '12 at 20:21
1

You can use the GNU Scientific Library (GSL).

Here's a page describing the matrix operations available in the library, including multiplication (gsl_matrix_mul_elements()):

http://www.gnu.org/software/gsl/manual/html_node/Matrix-operations.html

And here are some links to get you started with using GSL with visual studio:

http://gladman.plushost.co.uk/oldsite/computing/gnu_scientific_library.php

http://www.quantcode.com/modules/smartfaq/faq.php?faqid=33
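
Note that gsl_matrix_mul_elements() is an element-wise product; for an actual matrix product GSL goes through its BLAS interface. A minimal sketch (assuming GSL and its CBLAS library are built and linked in your Visual Studio project):

#include <stdio.h>
#include <gsl/gsl_matrix.h>
#include <gsl/gsl_blas.h>

int main()
{
    gsl_matrix *A = gsl_matrix_alloc(2, 2);
    gsl_matrix *B = gsl_matrix_alloc(2, 2);
    gsl_matrix *C = gsl_matrix_calloc(2, 2);

    gsl_matrix_set_all(A, 2.0);
    gsl_matrix_set_identity(B);

    /* C = 1.0 * A * B + 0.0 * C */
    gsl_blas_dgemm(CblasNoTrans, CblasNoTrans, 1.0, A, B, 0.0, C);

    printf("%g\n", gsl_matrix_get(C, 0, 0));

    gsl_matrix_free(A);
    gsl_matrix_free(B);
    gsl_matrix_free(C);
    return 0;
}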

Shinnok
  • Really? Gnu code compiling on MSVC++? I mean if you really want to get performance you'd use an Atlas based BLAS but I doubt it would be that easy to build on Windows .. :) – Yttrill Dec 30 '10 at 14:33
1

It can't compete with scientific libraries, but with Visual C++ it is right at hand:

#include <windows.h>
#include <gdiplus.h>
#pragma comment (lib,"Gdiplus.lib")
using namespace Gdiplus;

int main()
{
    // init the library
    ULONG_PTR gpToken = 0;
    GdiplusStartupInput gpInput;
    GdiplusStartup(&gpToken, &gpInput, NULL);

    Matrix A;                  // starts out as the identity matrix
    A.Translate(10.0f, 20.0f);

    Matrix B;
    B.Rotate(35.0f);

    A.Multiply(&B);            // B is prepended by default, i.e. A = B * A
    if (A.IsInvertible())
        A.Invert();
    if (!A.IsIdentity())
        A.RotateAt(120.0f, PointF(10.0f, 10.0f));

    // getting the values (affine matrix: m11, m12, m21, m22, dx, dy)
    REAL elements[6];
    A.GetElements(elements);

    // stop the library
    GdiplusShutdown(gpToken);
    return 0;
}

so with this you can easily get past the matrix multiplication obstacle (on Windows)

GdiPlus Matrix Documentation

ch0kee
0

For more recent versions of Visual Studio, you can use ScaLAPACK + MKL. A code sample is provided here, with a tutorial on how to make it run:

http://code.msdn.microsoft.com/Using-ScaLAPACK-on-Windows-d16a5e76#content

Vincent
0

There's an option to implement this yourself, perhaps using std::valarray because that may be parallelised using OpenMP: gcc certainly has such a version, MSVC++ probably does too.

Otherwise, try the following trick: transpose one of the matrices. Then you have:

AB[i,j] = Sum(k) A[i,k] B^t [j,k]

where you're scanning contiguous memory. If you have 8 hardware threads you can fairly easily divide the set of [i,j] indices into 8 and give each one 1/8 of the total job. To make it even faster you can use vector multiply instructions; most compilers provide intrinsics for this. The result won't be as fast as a tuned library, but it should be OK.
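
A rough sketch of that idea, with a transposed copy of B and OpenMP splitting the outer loop across cores (compile with /openmp in MSVC):

#include <vector>

// C = A * B for n x n row-major matrices, multiplying against a transposed copy of B
void matmul(const std::vector<double>& A, const std::vector<double>& B,
            std::vector<double>& C, int n)
{
    // Bt[j*n + k] = B[k*n + j], so the inner loop reads both operands contiguously
    std::vector<double> Bt(n * n);
    for (int k = 0; k < n; ++k)
        for (int j = 0; j < n; ++j)
            Bt[j * n + k] = B[k * n + j];

    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
        {
            double sum = 0.0;
            for (int k = 0; k < n; ++k)
                sum += A[i * n + k] * Bt[j * n + k];
            C[i * n + j] = sum;
        }
}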

If you're doing longer calculations such as polynomial evaluation, a threading evaluator which also has thread support (gak, two kinds of threads) will do a good job even though it won't do low-level tuning. If you really want to do stuff fast, you have to use a properly tuned library like ATLAS, but then, you probably wouldn't be running Windows if you were serious about HPC.

Yttrill