I'm trying to analyze an algorithm and would like to improve some parts of it. Profiling with Intel VTune shows that one function is back-end bound. The body of this function is:
    int i, j;

    // copy original population to temporary area
    for (i = 0; i < population_size; i++)
    {
        for (j = 0; j < problem_size; j++)
        {
            ffa_tmp[i][j] = ffa[i][j];
        }
    }

    // generational selection in sense of EA
    for (i = 0; i < population_size; i++)
    {
        for (j = 0; j < problem_size; j++)
        {
            ffa[i][j] = ffa_tmp[Index[i]][j];
        }
    }
ffa and ffa_tmp are dynamically allocated 2D arrays of doubles. Index is an array of integers that is used for ordering.
According to the analyzer, the worst line is ffa[i][j] = ffa_tmp[Index[i]][j];
If I understand correctly, I could improve this with vectorization. Or is there another solution?
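To make the question concrete, here is a minimal sketch of one alternative I am considering: copying whole rows with memcpy instead of element by element. It assumes ffa and ffa_tmp are arrays of row pointers (double **) with each row allocated separately. The names alloc_matrix, free_matrix and select_rows are hypothetical and only for illustration; they are not from my real code, and I have not measured whether this actually helps.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate a rows x cols matrix as an array of row
   pointers. This layout is an assumption about how ffa/ffa_tmp are built. */
static double **alloc_matrix(int rows, int cols)
{
    double **m = malloc((size_t)rows * sizeof *m);
    if (m == NULL)
        return NULL;
    for (int i = 0; i < rows; i++) {
        m[i] = malloc((size_t)cols * sizeof **m);
        if (m[i] == NULL) {
            /* undo the rows allocated so far */
            while (i-- > 0)
                free(m[i]);
            free(m);
            return NULL;
        }
    }
    return m;
}

/* Hypothetical helper: release a matrix created by alloc_matrix. */
static void free_matrix(double **m, int rows)
{
    if (m == NULL)
        return;
    for (int i = 0; i < rows; i++)
        free(m[i]);
    free(m);
}

/* Produces the same result as the two nested loops in the question,
   but copies each row in one memcpy call. */
static void select_rows(double **ffa, double **ffa_tmp, const int *Index,
                        int population_size, int problem_size)
{
    size_t row_bytes = (size_t)problem_size * sizeof(double);

    /* copy original population to temporary area */
    for (int i = 0; i < population_size; i++)
        memcpy(ffa_tmp[i], ffa[i], row_bytes);

    /* generational selection: row i receives the old row Index[i] */
    for (int i = 0; i < population_size; i++)
        memcpy(ffa[i], ffa_tmp[Index[i]], row_bytes);
}
```

If the rows really are separate allocations, the second loop could probably be replaced by permuting the row pointers instead of copying any data, but I am not sure whether that is safe when Index contains the same row more than once.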