Speed gains: Converting 2D array to 1D array

Question

I initially had a 2D array. The results were taking time to get back the results. So, I converted the 2D array into 1D array but still there is not much improvement in speed of my program.

Here is my code:

for( counter1=0; counter1< size ; ++counter1)
    {   
       y=buffer[counter1];
       x=buffer[counter1+1];

       IndexEntries index= OneDTable[x*256+y];

    SingleTableEntries NewNextState=NewSingleTable[Next*blocksize+index];

    Next=NewNextState.Next_State;
    if(NewNextState.accept[index.serialnumber]=='1' )
    {
        return 1;           
    }

}

In my code above: OneDTable is a 1D array generated from a 2D array of 256 * 256 elements. NewSingleTable is a 1D array generated from a 2D array of blocksize* (Total Next Elements).

Actually , I was expecting large speed gains after converting into 1D arrays. Is this the right way to extract value from a 1D array or certain improvements can be done to the above code?

More Details:

Both 2D arrays are of structure type:

Structure type of IndexEntries consists of:
          int  
          int

Structure type of NewSingleTable consists of:
          int
          vector<char>

Before optimizing you should profile your code to see why it's performing badly. An Array of 256 * 256 is nothing for a computer to process. — MrFox, Apr 18 '14 at 07:46
I think you won't get speed gain because of the cc you lose on computing multiplications, here though the compiler should be using shift left logical by 8 instead of multiply which should increase total speed, but not by much since accessing time is dominant — Stack Player, Apr 18 '14 at 07:55
@StackPlayer Sorry , I didn't get . What's meant by "because of the cc you lose on computing multiplications,"? — Xara, Apr 18 '14 at 08:05
Multiplication costs many clock cycles (cc) if you go to the processor's level, however multiplication by powers of 2 can be reduced to shifting logically (sll in assembly) which is much faster — Stack Player, Apr 18 '14 at 08:28
You say that a `NewSingleTable` array element contains one integer and one vector of characters. However, you access the copied `NewNextState` using two members, both of which are integers. --- As far as one can see from the code, copying a `NewSingleTable` element (which contains a vector!) just to extract a couple of integers is bad. Please show *all* of the code of that loop, and *precise* type definitions. — laune, Apr 18 '14 at 08:30
Expecting that "rolling your own" 2D array to improve speed is not valid (in almost all cases). Much work has been invested into compilers to optimize computing adresses from array indices. Usually time is lost by a suboptimal algorithm or useless copying of information, often hidden by an innocuous statement. — laune, Apr 18 '14 at 08:35
Can you tell us what kind of improvement you expect from using a 1D array ? — bokan, Apr 18 '14 at 09:06
@bokan As I mentioned in my post, I want to increase the speed of my program or in other words make the computations fast. — Xara, Apr 18 '14 at 09:31
@Zara : of course, I understand. What speed increase do you expect ? 10%, twice faster, more ? Unrolling 2D array is very weak optimisation. It will make some gain on very small loops. But it won't be noticeable on bigger jobs. Compiler already optimize a lot, especialy for this case where you are adressing 256x256 matrix. — bokan, Apr 19 '14 at 11:55

score 2 · Accepted Answer · edited May 23 '17 at 10:32

You could gain something changing from a vector of vector to a plain vector. E.g. from:

std::vector<std::vector<my_struct>> table(total_rows,
                                          std::vector<my_struct>(total_columns,
                                                                 my_struct()));

// do something with table[row][column]...

to

std::vector<my_struct> table(total_rows * total_columns);

// do something with table[row * total_columns + column]...

This because a vector of vector is not really a matrix and you lose data locality.

Changing from:

my_struct table[total_rows][total_columns];

to

my_struct table[total_rows * total_columns];

is worthless since the memory layout between the two is (usually) precisely the same.

The only difference is the semantic type of the array and the fact that you now have to implement the 2D element lookup yourself (of course changing from table[row * 256 + column] to table[row << 8 + column] is useless since any decent compiler will automatically perform this "optimization").

The 1D array could be a bit faster when you have to perform an operation on every element. This because of the simpler for loop:

for (unsigned row(0); row < total_rows; ++row)
  for (unsigned column(0); column < total_columns; ++column)
    // do something with table[row][column]

const unsigned stop(total_rows * total_columns);
for (unsigned i(0); i < stop; ++i)
  // do something with table[i]

but this isn't your case.

As laune said in is comment, copying a NewSingleTable just to extract a couple of integers is bad:

SingleTableEntries NewNextState=NewSingleTable[Next*blocksize+index];

From your example it seems that a const reference should be enough:

...
const SingleTableEntries &NewNextState(NewSingleTable[Next * blocksize + index]);

if (NewNextState.accept[index.serialnumber] == '1' )
  return 1;           

Next = NewNextState.Next_State;
...

Please explain the last line "const reference" as I don't have much experience with c++.In my program, I have made NewSingleTable as global 1D array. — Xara, Apr 18 '14 at 10:58
I've updated the answer with the code. From your example it seems that you don't need a copy of `NewSingleTable` but just a "shortcut" to access two integer values (`NewNextState.accept[index.serialnumber]` and `NewNextState.Next_State`). — manlio, Apr 18 '14 at 11:09

Speed gains: Converting 2D array to 1D array

1 Answers1