0

I have a class that goes something like:

class node{// case 1
    float points[maxCap][d];
    ...
}

I can also do it like:

class node{// case 2
    float** points;

    node(){
        points = (float**)malloc(maxCap*sizeof(float*));
        if(points)
            for(int i=0; i<maxCapacity; i++)
                points[i] = (float*)malloc(d*sizeof(float));
        else{
            cout<<"Unable to get memory"<<endl;
            exit(1);
        }
    }
    ...
}

They are basically nodes in a tree. I'm creating approximately 500'000 to 1'000'000 of these.

When I search for a point, with every single thing in the search algorithm being the same, case 2 comes out to be approximately 0.2 seconds slower than case 1 (taking average over 3 runs -- although the times are more or less the same for all the 3 runs). Time for case 1 is approx 0.88s while that for case 2 is approx 1.07s. Can someone please tell me what's going on here? Shouldn't this be approximately the same?

Jabberwocky
  • 48,281
  • 17
  • 65
  • 115
Ankit Kumar
  • 1,145
  • 9
  • 30
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/208575/discussion-on-question-by-ankit-kumar-array-vs-pointers). – Samuel Liew Feb 26 '20 at 13:37
  • 1
    Please don't allocate memory in that way. If you want a 2D array, and for whatever reason you want to allocate it "by hand", and the number of columns is uniform, then [this is much better](https://stackoverflow.com/questions/21943621/how-to-create-a-contiguous-2d-array-in-c/21944048#21944048) for a whole host of reasons. – PaulMcKenzie Feb 26 '20 at 15:23

2 Answers2

4

Unfortunately, you didn't provide the code that does the actual searching. But, assuming that the search actually needs to access the data within/behind points, the problem is clear:

  • With the array, the data is stored right in the node object, and simple pointer arithmetic is performed to deduce the location of points[a][b]. Only a single memory access is needed to actually fetch the value from memory.

  • With the pointer approach, the node only contains the address of where an array of addresses is stored. So, points loads a pointer value from memory, points[a] loads a second pointer value from memory, and points[a][b] finally loads the actual value into the CPU, ready to be compared. That's three memory accesses where a single one sufficed in the array case.

Even if your cache and prefetching does a good job at mitigating the impact of the extra memory accesses, a single memory access easily outperforms three.

cmaster - reinstate monica
  • 38,891
  • 9
  • 62
  • 106
1

Consider points[1][2].

There are twice as many pointers followed with float** points; than float points[10][3];.

As a sketch, it is:

float* inner = *(points + 1);
float result = *(inner + 2);

As opposed to

float result = *(points + (3 * 2) + (1));
Caleth
  • 52,200
  • 2
  • 44
  • 75