I'm doing research on 2D bin packing algorithms. I asked a similar question about PHP's performance (it was too slow at packing), and the code has now been converted to C++.
It's still pretty slow. What my program does is successively allocate blocks of dynamic memory and fill them with the character 'o':
char* bin;
bin = new (nothrow) char[area];
if (bin == 0) {
    cout << "Error: " << area << " bytes could not be allocated";
    return false;
}
for (int i=0; i<area; i++) {
    bin[i]='o';
}
(The blocks are between 1 KB and 30 KB in size for my datasets.)
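For comparison, the allocate-and-fill step can also be written as a single construction. This is only a sketch: `makeBin` is a hypothetical helper name, and it reports an allocation failure by throwing `std::bad_alloc` rather than by returning a null pointer as the `nothrow` version does.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns a bin of `area` cells, all marked free ('o').
// The vector releases its memory automatically when it goes out of scope.
std::vector<char> makeBin(std::size_t area)
{
    return std::vector<char>(area, 'o');
}
```

`makeBin(area).data()` would not be safe (the temporary dies immediately), so keep the vector alive and pass `bin.data()` wherever a `char*` is expected.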
Then the program tries different placements of 'x' characters inside the current memory block:
void place(char* bin, int* best, int width)
{
    for (int i=best[0]; i<best[0]+best[1]; i++)
        for (int j=best[2]; j<best[2]+best[3]; j++)
            bin[i*width+j] = 'x';
}
One of the functions, which checks that a placement doesn't overlap anything, gets called millions of times during a run:
bool fits(char* bin, int* pos, int width)
{
    for (int i=pos[0]; i<pos[0]+pos[1]; i++)
        for (int j=pos[2]; j<pos[2]+pos[3]; j++)
            if (bin[i*width+j] == 'x')
                return false;
    return true;
}
Everything else takes only about a percent of the runtime, so I need to make these two guys (fits and place) faster. Where's the bottleneck?
Since I only have two states, 'x' and 'o', I could try using just one bit per cell instead of the whole byte a char takes. But I'm more concerned with speed than with memory: do you think it would make things faster?
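To make the one-bit-per-cell idea concrete, here is a rough sketch. All names in it (`BitBin`, `spanMask`, and the argument order of `fits` and `place`) are made up for illustration, and rows are assumed to be padded to whole 64-bit words. It tests or sets up to 64 cells per operation, but whether that beats the byte version for 1 KB to 30 KB bins is something only measurement can tell.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bit-per-cell bin: 1 = occupied ('x'), 0 = free ('o').
struct BitBin {
    int width;                        // cells per row
    int wordsPerRow;                  // 64-bit words per (padded) row
    std::vector<std::uint64_t> bits;

    BitBin(int w, int h)
        : width(w), wordsPerRow((w + 63) / 64),
          bits(static_cast<std::size_t>(wordsPerRow) * h, 0) {}
};

// Mask with bits [lo, hi) set, for 0 <= lo <= hi <= 64 and lo < 64.
inline std::uint64_t spanMask(int lo, int hi)
{
    std::uint64_t upper = (hi == 64) ? ~0ULL : ((1ULL << hi) - 1);
    return upper & ~((1ULL << lo) - 1);
}

// True if none of the cells in the rectangle are occupied.
bool fits(const BitBin& b, int row, int rows, int col, int cols)
{
    const int end = col + cols;
    for (int i = row; i < row + rows; ++i) {
        const std::uint64_t* r = &b.bits[static_cast<std::size_t>(i) * b.wordsPerRow];
        for (int w = col / 64; w * 64 < end; ++w) {
            int lo = std::max(col - w * 64, 0);   // first bit of the span in word w
            int hi = std::min(end - w * 64, 64);  // one past the last bit in word w
            if (r[w] & spanMask(lo, hi)) return false;
        }
    }
    return true;
}

// Marks every cell in the rectangle as occupied.
void place(BitBin& b, int row, int rows, int col, int cols)
{
    const int end = col + cols;
    for (int i = row; i < row + rows; ++i) {
        std::uint64_t* r = &b.bits[static_cast<std::size_t>(i) * b.wordsPerRow];
        for (int w = col / 64; w * 64 < end; ++w) {
            int lo = std::max(col - w * 64, 0);
            int hi = std::min(end - w * 64, 64);
            r[w] |= spanMask(lo, hi);
        }
    }
}
```

The rectangle here is given as (first row, row count, first column, column count), mirroring the order of pos[0]..pos[3] in the original functions.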
Thanks!
Update: I replaced int* pos with rect pos (the same for best), as MSalters suggested. At first I saw an improvement, but after testing with bigger datasets the runtimes seem to be back to what they were. I'll try the other suggested techniques and keep you posted.
Update: using memset and memchr made things about twice as fast. Replacing 'x' and 'o' with '\1' and '\0' didn't show any improvement, and __restrict wasn't helpful either. Overall, I'm satisfied with the performance of the program now, since I also made some improvements to the algorithm itself. I have yet to try using a bitmap and compiling with -O2 (-O3)... Thanks again, everybody.
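For anyone reading later, the memset/memchr variant can look roughly like the sketch below. The `Rect` field names are my assumption (the actual rect type isn't shown above); they follow the order of pos[0]..pos[3] in the original functions.

```cpp
#include <cstring>

// Hypothetical layout; the fields of the real `rect` type are not shown in the post.
struct Rect {
    int row;   // first row         (pos[0] in the original)
    int rows;  // number of rows    (pos[1])
    int col;   // first column      (pos[2])
    int cols;  // number of columns (pos[3])
};

// Mark every cell of r as occupied, one memset call per row.
void place(char* bin, const Rect& r, int width)
{
    for (int i = r.row; i < r.row + r.rows; ++i)
        std::memset(bin + i * width + r.col, 'x', r.cols);
}

// True if no cell of r is occupied; memchr scans one row span at a time.
bool fits(const char* bin, const Rect& r, int width)
{
    for (int i = r.row; i < r.row + r.rows; ++i)
        if (std::memchr(bin + i * width + r.col, 'x', r.cols) != nullptr)
            return false;
    return true;
}
```

The inner per-cell loop is replaced by a library call that is typically heavily optimized, which is presumably where the speedup comes from.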