I have a very large two-dimensional array and I need to perform vector operations on it. NTerms and NDocs are both very large integers.

var myMat = new double[NTerms, NDocs];

I need to extract column vectors from this matrix. Currently, I'm using for loops.

int col = 100;
var myVec = new double[NTerms];
// Copy column `col` element by element.
for (int i = 0; i < NTerms; i++)
{
    myVec[i] = myMat[i, col];
}

This operation is very slow. In Matlab I can extract the vector without the need for iteration, like so:

myVec = myMat(:,col);

Is there any way to do this in C#?

Soner Gönül
Leeor
  • Do you have the possibility to transpose `myMat` on creation? Then you would extract rows instead of columns, which is more cache-friendly and should be faster (although I don't know by what factor). Another option then would be to copy the memory with `Marshal.Copy`. Furthermore, you could try to parallelize with `Parallel.For`. – Nico Schertler Feb 11 '13 at 15:50
  • If you like working with Matlab, you may be interested in calling Matlab from C#; see http://stackoverflow.com/questions/5901664/calling-a-matlab-function-from-c-sharp for example. A bit of searching will give you several results. – Dennis Jaheruddin Feb 11 '13 at 16:57

2 Answers

C# has no built-in construct for slicing arrays the way Matlab does. With the code you already have, you can speed up building the vector by using the Task Parallel Library, which was introduced in .NET Framework 4.0.

Parallel.For(0, NTerms, i => myVec[i] = myMat[i, col]);

If your CPU has more than one core, you should see some improvement in performance; otherwise there will be no effect.
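For completeness, here is the one-liner above wrapped into a small self-contained program. This is only a sketch: the class name `ColumnExtraction`, the method name `GetColumnParallel`, and the tiny 4x3 test matrix are made up for illustration and are not part of the question's code.

```csharp
using System;
using System.Threading.Tasks;

public static class ColumnExtraction
{
    // Hypothetical helper: copies column `col` of `mat` into a new vector,
    // letting Parallel.For distribute the row indices across cores.
    public static double[] GetColumnParallel(double[,] mat, int col)
    {
        int rows = mat.GetLength(0);
        var vec = new double[rows];
        // Each iteration writes a distinct element, so no locking is needed.
        Parallel.For(0, rows, i => vec[i] = mat[i, col]);
        return vec;
    }

    public static void Main()
    {
        // Small illustrative matrix: element (i, j) holds 10 * i + j.
        var mat = new double[4, 3];
        for (int i = 0; i < 4; i++)
            for (int j = 0; j < 3; j++)
                mat[i, j] = 10 * i + j;

        double[] v = GetColumnParallel(mat, 1);
        Console.WriteLine(string.Join(", ", v)); // prints "1, 11, 21, 31"
    }
}
```

Since the loop body is a single memory copy, any gain probably depends more on memory bandwidth than on core count, so it is worth measuring on the real matrix before relying on it.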

For more examples of how the Task Parallel Library can be used with matrices and arrays, you can refer to the MSDN article Matrix Decomposition.

That said, I doubt that C# is a good choice for serious math calculations.

Alexander Manekovskiy

Some possible problems:

Could it be the way that elements are accessed for multi-dimensional arrays in C#? See this earlier article.

Another problem may be that you are accessing non-contiguous memory - so not much help from cache, and maybe you're even having to fetch from virtual memory (disk) if the array is very large.

What happens to your speed when you access a whole row at a time, instead of a column? If that's significantly faster, you can be 90% sure it's a contiguous-memory issue...
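If row access does turn out to be much faster, one option is to store the matrix transposed so that each vector you need is a contiguous row. The sketch below assumes that layout, i.e. `double[NDocs, NTerms]` instead of the question's `double[NTerms, NDocs]`; the names `RowExtraction` and `GetRow` are hypothetical. It relies on C# multi-dimensional arrays being stored in row-major order and on `Buffer.BlockCopy` accepting multi-dimensional arrays of primitive types.

```csharp
using System;

public static class RowExtraction
{
    // Hypothetical helper. Assumes the matrix is stored transposed, as
    // [NDocs, NTerms], so the former column is now one contiguous row.
    public static double[] GetRow(double[,] matT, int row)
    {
        int cols = matT.GetLength(1);
        var vec = new double[cols];
        // Buffer.BlockCopy takes int byte offsets, so this only works while
        // the offset fits in an int; `checked` makes an overflow throw
        // instead of silently wrapping around.
        int byteOffset = checked(row * cols * sizeof(double));
        int byteCount = checked(cols * sizeof(double));
        Buffer.BlockCopy(matT, byteOffset, vec, 0, byteCount);
        return vec;
    }

    public static void Main()
    {
        // Transposed toy matrix: 3 "docs" by 4 "terms".
        var matT = new double[3, 4];
        for (int d = 0; d < 3; d++)
            for (int t = 0; t < 4; t++)
                matT[d, t] = 10 * t + d;

        double[] v = GetRow(matT, 1);
        Console.WriteLine(string.Join(", ", v)); // prints "1, 11, 21, 31"
    }
}
```

For matrices whose byte offsets exceed `int.MaxValue`, this approach would not work as written; a jagged array (`double[][]`) with one inner array per document might be a simpler fit in that case.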

Community
Floris