0

I've created a lower triangular distance matrix (because of size issues) as jagged array Note: Distances between objects are symmetric

var dm = new double[size][]
for (var i = 0; i < size; i++)
{
   dm[i] = new double[i+1];
   for (var j = 0; j < i+1; j++)
   {
      dm[i][j] = distance(data[i], data[j]);
   }
 }

I need to access this matrix very often so I made the following method for it

private double GetValueOfDM(int row, int column, double[][] dm)
{
    return column <= row ? distanceMatrix[row][column] : distanceMatrix[column][row];
}

With the Visual Studio performance analysis one sees the major speed issue lies in the only row of the GetValueOfDM method.

Has someone a good idea how to speed this up?

user2653422
  • 1,095
  • 1
  • 12
  • 18
  • why are you checking `column<=row` and flipping it? certain items would be inaccessible. instead of checking, you should already know (before this method is called) what row/column you should be using. if you don't, something else in your code is illogical. – DLeh Jun 18 '15 at 17:37
  • @DLeh Sorry I wasn't clear about that. The distance between two objects is symmetric. I only have a lower triangular distance matrix avoiding the duplication of distances. I perform the check to access the correct cell in the matrix. – user2653422 Jun 18 '15 at 17:39
  • in that case, to increase performance you could make it a multi-dimensional array and remove that branching. This would trade off memory usage for performance – DLeh Jun 18 '15 at 17:40
  • 1
    @DLeh Multi-dimensional arrays are slower than jagged arrays in .NET: http://stackoverflow.com/questions/468832/why-are-multi-dimensional-arrays-in-net-slower-than-normal-arrays – Dai Jun 18 '15 at 17:41
  • @dai, okay, then leave it as jagged, but now you can remove the branching – DLeh Jun 18 '15 at 17:41
  • @DLeh If I delete the branching I can only store half of the distance matrix with my limited main memory. Saving parts of it to disk or recomputing the distance when accessed would make things really slow. That was the reason why I'm asking if there could be a better implementation under the main memory restriction. – user2653422 Jun 18 '15 at 17:45
  • 1
    Accessing array by index is O(1) ....the analysis might show you in relative terms. I tell you what, DONT DO A PREMATURE OPTIMAZATION. Your code is just fine – Dan Hunex Jun 18 '15 at 17:48
  • 1
    Try writing out if then else, I can vaguely remember being bitten by the ?: operator not getting inlined on all .Net frameworks. If anything it should point out the offending line. Also check debug vs release, could be that. Also, are you sure precomputing and looking up the distanceson to/ from the heap is less expensive than doing basic trig on the variables on the stack? @Dan: it's a free country, people are allowed to optimize their code if want need it to be faster. He ran a profiler and it pointed out this line. – gjvdkamp Jun 18 '15 at 18:07
  • @gjvdkamp Thank you good advise. The result is the comparison takes only a small amount of time whereas the return line and thus the lookup in the matrix takes the big rest. I can't measure a big difference between ? operator and writing it out. – user2653422 Jun 18 '15 at 18:55

3 Answers3

2

I'm guessing you're using this in a tight-loop? Arrays in .NET aren't /that/ fast because of automatic bounds-checking. If you need fast array perf use a pointer with a buffer:

sealed unsafe class DistanceData : IDisposable {
    private Double* buffer;
    private IntPtr  bufferLength; // .NET uses IntPtr as a size_t equivalent.
    private Int32   dim0Length;

    public DistanceData(Int32 size, Double[] data) {
        this.buffer       = (Double*)Marshal.AllocHGlobal( size * size );
        this.bufferLength = size * size;
        this.dim0Length   = size;

        for(int y = 0; y < size; y++) {
            for(int x = 0; x < y + 1; x++) {
                this.buffer[ y * this.dim0Length + x ] = Distance( data[y], data[x] );
            }
        }
    }

    public void Dispose() {
        Marshal.FreeHGlobal( this.buffer );
    }

    public Double GetValueOfDM(Int32 row, Int32 column) {
        // WARNING: Without validation or your own bounds-checking, invalid values of `row` and `column` will cause access-violation errors and crash your program. Ensure that code that calls `GetValueOfDM` is correct and will never submit invalid values.
        return this.buffer[ row * this.dim0Length  + column];
    }
}
Dai
  • 141,631
  • 28
  • 261
  • 374
  • I'm trying to use your example. I get a cast error for `this.bufferLength = size * size;` so I changed this to `this.bufferLength = (IntPtr) (size * size);`. Running the code I get an `AccessViolationException` when y reaches 1250 and x 472. It seems that the allocated memory for `buffer` isn't enough. Notice that in the end the matrix should contain (20000x20000)/2 double values. Do I have to change the Marshal allocation? – user2653422 Jun 18 '15 at 18:36
  • @user2653422 Where is the AccessViolation being thrown from exactly? – Dai Jun 18 '15 at 18:43
  • At the line `this.buffer[ y * this.dim0Length + x ] = Distance( data[y], data[x] ); ` – user2653422 Jun 18 '15 at 18:45
  • @user2653422 Set a breakpoint on that line and evaluate the actual offset being used with the buffer pointer and compare that with the size of the buffer. – Dai Jun 18 '15 at 18:48
  • I don't know if I understand you correctly. If I set a breakpoint to that line and look at `this.buffer` by pointing the mouse at it I get `0x000000001e980040`. But this value does not change throughout the iteration of the for loop. So I guess I'm not doing what you've requested. Could you please explain it in more detail? – user2653422 Jun 18 '15 at 19:11
  • If I understand the `AllocHGlobal` method correctly it will allocate memory with the given amount of bytes. In my case it allocates `size*size` or 400 million bytes. A double uses 8 byte. As I said the matrix contains (20000x20000)/2 double values or 1600 million bytes. Is that correct? In that case the allocated memory by `size*size` isn't enough. – user2653422 Jun 18 '15 at 19:26
  • Ok multiplying the memory to allocate by 8 I get your code to run. However, it's not faster than the previous code. Ignoring small differences, because of the cpu, the time needed is nearly same (measured by Stopwatch class). – user2653422 Jun 18 '15 at 19:45
1

You could remove the conditional in the method and increase memory usage to increase access performance like so:

var dm = new double[size][];
for (var i = 0; i < size; i++)
{
   dm[i] = new double[size];
   for (var j = 0; j < i+1; j++)
   {
      dm[i][j] = distance(data[i], data[j]);
      dm[j][i] = dm[i][j];
   }
 }

private double GetValueOfDM(int row, int column, double[][] dm)
{
    return dm[row][column];
}

Now that you don't have a conditional, the compiler can remove a branch prediction. Also, you should run tests with your actual use cases to ensure that it is actually going to be a problem. Analysis would probably reveal that a branching conditional will be the slowest part of your code, but it doesn't necessarily mean that it's actually going to slow anything down noticeably. In addition, you could try running it in Release mode (with compiler optimizations) to see how it affects performance.

If you are on a system where you don't have the memory available to double the size of the array, then the code you have is probably close to optimal for accessing a jagged array.

DLeh
  • 23,806
  • 16
  • 84
  • 128
  • Checked it with the profiler. Avoiding the conditional speed thinks up a little bit, but not much. But running in release mode speed thinks up by 0.20-0.30%. So I will accept your answer, because you and @gjvdkamp (voted up) pointed out to use the release mode. – user2653422 Jun 18 '15 at 19:52
  • the first part will take longer, so ensure that your testing is actually testing only the speed of the method calls. I wouldn't expect a huge change by removing the branch, but it could be a bit more than you're seeing if you're also timing the creation (which will only happen once, so it can be a bit slower) – DLeh Jun 18 '15 at 19:54
  • I've tested with stopwatch class and measured the time for my method which uses the `GetValueOfDM` method (so after the matrix was constructed). I also use the visual studio profiler which shows only 2% of the time for the conditional but 24% for the matrix access and return. – user2653422 Jun 18 '15 at 19:59
  • okay, just checking. this code should be close to optimal without doing scary things with pointers like in Dai's answer, which very well could be faster. – DLeh Jun 18 '15 at 20:01
1

You could use a one-dimensional array and calculate the index like this

i = (r * r + r) / 2 + c;

But you still have to check for r <= c and do the flipping. (r=row, c=column)

But will this really be faster?

Olivier Jacot-Descombes
  • 104,806
  • 13
  • 138
  • 188
  • Would think that looking it up once in a tight array would be one less pointer into the dark to chase.. – gjvdkamp Jun 18 '15 at 18:17