Basic Question: I have a k-dimensional box, described by a vector of lower bounds and a vector of upper bounds. What is the most efficient way to enumerate the coordinates of its vertices?

Background: As an example, say I have a 3-dimensional box. What is the most efficient algorithm/code to obtain:

vertex[0] = ( 0, 0, 0 ) -> ( L_0, L_1, L_2 )
vertex[1] = ( 0, 0, 1 ) -> ( L_0, L_1, U_2 )
vertex[2] = ( 0, 1, 0 ) -> ( L_0, U_1, L_2 )
vertex[3] = ( 0, 1, 1 ) -> ( L_0, U_1, U_2 )

vertex[4] = ( 1, 0, 0 ) -> ( U_0, L_1, L_2 )
vertex[5] = ( 1, 0, 1 ) -> ( U_0, L_1, U_2 )
vertex[6] = ( 1, 1, 0 ) -> ( U_0, U_1, L_2 )
vertex[7] = ( 1, 1, 1 ) -> ( U_0, U_1, U_2 )

where L_0 is the 0th element of the lower-bound vector and, likewise, U_2 is the 2nd element of the upper-bound vector.

My Code:

const unsigned int nVertices = 1u << nDimensions;   // 2^nDimensions, no need for pow()

for ( unsigned int idx=0; idx < nVertices; ++idx )
{
   for ( unsigned int k=0; k < nDimensions; ++k )
   {
      // Bit k of idx selects the upper (1) or lower (0) bound
      // for dimension k.
      if ( 0x00000001 & (idx >> k) )
      {
         bound[idx][k] = upperBound[k];
      }
      else
      {
         bound[idx][k] = lowerBound[k];
      }
   }
}

where the variable bound is declared as:

std::vector< std::vector<double> > bound( nVertices, std::vector<double>(nDimensions) );

but I've pre-sized it so as not to waste time in the loop allocating memory. I need to call the above procedure about 50,000,000 times every time I run my algorithm -- so I need this to be really efficient.

Possible Sub-Question: Does it tend to be faster to shift by k each time, or to shift by 1 and store the intermediate result (i.e., should I be using >>=)? The variant is sketched below.
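For concreteness, the incremental-shift variant I'm asking about would look something like this (same variables as above):

for ( unsigned int idx=0; idx < nVertices; ++idx )
{
   unsigned int bits = idx;
   for ( unsigned int k=0; k < nDimensions; ++k, bits >>= 1 )
   {
      // Shift a copy of idx one bit per dimension instead of
      // shifting idx by k every iteration.
      bound[idx][k] = ( bits & 1u ) ? upperBound[k] : lowerBound[k];
   }
}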

M. Tibbits
  • How opposed would you be to poking about in assembler? And, one more thing: Profile, Profile, Profile. Take the suggestions here and test them to see where they are fast and where they are slow. – Theo Belaire Dec 07 '10 at 03:56
  • I've profiled the code using Vtune with MSVS 2005. I'm happy to work in assembler - though I compile to a variety of targets so I'm more particular to algorithmic suggestions than architecture specific, but if it really sings then all the better! -- I should note: Xeon X5550 & Core i7 930 targets primarily. – M. Tibbits Dec 07 '10 at 04:03
  • K is not fixed. I'm presently analyzing datasets with k = 25, 50, 100, 500. – M. Tibbits Dec 07 '10 at 04:03
  • Erm, doesn't dimensionality k=500 result in 2^500 = 3.27 x 10^150 vertices? Where do you get your googolbyte RAM chips? ;-) – Emile Cormier Dec 07 '10 at 04:21
  • Bah - you're right - I'm doing too many things at once. I'm doing k = 25, 50, 75, & 100. But it just occurred to me that I've been coding with unsigned ints, so I'll be limited to 32 dimensions. Hmm. – M. Tibbits Dec 07 '10 at 04:28
  • Instead of generating all 2^k vertices at once, you can try computing vertices "on the fly" as needed (see the sketch after these comments). Runtime performance might be worse, but at least you won't need every grain of sand on the Earth for your k=500 dataset. :-) – Emile Cormier Dec 07 '10 at 05:12
  • 2
    Also, if you use size_t instead of unsigned int, you'll be able to access as many elements as your system's memory can hold. – Emile Cormier Dec 07 '10 at 05:22
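A minimal sketch of the on-the-fly idea from the comments, computing a single vertex from its index without materializing all 2^k of them (the helper name vertexAt is hypothetical, and this form only covers k <= 64):

#include <cstdint>
#include <vector>

// Hypothetical helper: compute vertex 'idx' of the box on demand
// instead of storing all 2^k vertices up front.
std::vector<double> vertexAt( std::uint64_t idx,
                              const std::vector<double>& lowerBound,
                              const std::vector<double>& upperBound )
{
   const std::size_t k = lowerBound.size();
   std::vector<double> vertex( k );
   for ( std::size_t d = 0; d < k; ++d )
   {
      vertex[d] = ( (idx >> d) & 1u ) ? upperBound[d] : lowerBound[d];
   }
   return vertex;
}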

1 Answer

It will probably go faster if you can reduce conditional branching:

bound[idx][k] = upperLowerBounds[(idx >> k) & 1][k];
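Here, upperLowerBounds would be a two-row lookup table built once outside the loop; a minimal sketch of that setup (the name and layout are assumptions, not from the original post):

// Assumed setup: row 0 holds the lower bounds, row 1 the upper
// bounds, so the selected bit picks the row without a branch.
std::vector< std::vector<double> > upperLowerBounds(2);
upperLowerBounds[0] = lowerBound;
upperLowerBounds[1] = upperBound;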

You might improve things even more if you can interleave the upper and lower bounds in a single array:

bound[idx][k] = upperLowerBounds[(k << 1) | ((idx >> k) & 1)];
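Under the same naming assumptions, the interleaved layout would store both bounds for dimension k adjacent in memory, e.g.:

// Assumed setup: element 2k is the lower bound for dimension k,
// element 2k+1 is the upper bound, which can improve locality.
std::vector<double> upperLowerBounds( 2 * nDimensions );
for ( unsigned int k = 0; k < nDimensions; ++k )
{
   upperLowerBounds[(k << 1)    ] = lowerBound[k];
   upperLowerBounds[(k << 1) | 1] = upperBound[k];
}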

I don't know if shifting idx incrementally helps. It's simple enough to implement, so it's worth a try.

Marcelo Cantos