I'm trying to write a piece of code in C++ (VS2010) that runs in parallel using OpenMP. Everything runs perfectly at first (all of my processors are busy and the for loop progresses as expected), but when I reach time step i = 211 everything slows down. In the process monitor I see that I'm using only 14% of the CPU. After a while it speeds up again, but then it slows down again at time step i = 316. It does that periodically until it finishes. I'm not sure what is going on. As I'm new to this, please forgive me if my question isn't clear enough.
This is the code:
xyQuadrant is a vector created earlier in the code; it contains structures.
The methods GetUX(..), GetUY(..), CalcVelocity(...) and CalcDisplacement(..) lock and unlock when accessing data, so there shouldn't be any issues with multiple threads accessing the same data.
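To make the description above concrete, here is a minimal sketch of what a point structure with locked accessors could look like. It is an assumption, not the actual code: the member layout, the use of a named `#pragma omp critical` as the lock, the treatment of step -1 as zero displacement, and the update formulas are all placeholders for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the real point structure. Only members that the
// loop touches are modelled; the locking scheme is an assumption.
struct point
{
    double x, y;                       // reference position
    double aX, aY;                     // acceleration
    double vX, vY;                     // velocity
    double surfaceCorrectX, surfaceCorrectY;
    std::vector<point*> family;        // neighbouring points
    std::vector<double> volumeCorrect; // one factor per family member
    std::vector<double> uX, uY;        // displacement history per time step

    point(double px, double py, std::size_t steps)
        : x(px), y(py), aX(0), aY(0), vX(0), vY(0),
          surfaceCorrectX(1), surfaceCorrectY(1),
          uX(steps, 0.0), uY(steps, 0.0) {}

    // Locked read. Assumption: a step outside the history (e.g. -1) reads as 0.
    double GetUX(int step)
    {
        double value = 0.0;
        #pragma omp critical(point_data)
        {
            if (step >= 0 && static_cast<std::size_t>(step) < uX.size())
                value = uX[step];
        }
        return value;
    }

    double GetUY(int step)
    {
        double value = 0.0;
        #pragma omp critical(point_data)
        {
            if (step >= 0 && static_cast<std::size_t>(step) < uY.size())
                value = uY[step];
        }
        return value;
    }

    // Locked update of the velocity from the current acceleration.
    void CalcVelocity(double deltaT)
    {
        #pragma omp critical(point_data)
        {
            vX += aX * deltaT;
            vY += aY * deltaT;
        }
    }

    // Locked write of the displacement for the step after prevStep.
    void CalcDisplacement(double deltaT, int prevStep)
    {
        // Read outside the critical section: nesting the same named
        // critical section would deadlock.
        double prevX = GetUX(prevStep);
        double prevY = GetUY(prevStep);
        #pragma omp critical(point_data)
        {
            std::size_t next = static_cast<std::size_t>(prevStep + 1);
            if (next < uX.size())
            {
                uX[next] = prevX + vX * deltaT;
                uY[next] = prevY + vY * deltaT;
            }
        }
    }
};

// Hypothetical element type of xyQuadrant.
struct quadrant
{
    std::vector<point> qPoints;
};
```

Note that in this sketch the named critical section is shared by all points, so every call to a getter from any thread goes through the same lock. A per-point lock would behave differently; which of the two matches the real methods is not shown here.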
for(int i = 0; i < 1250; i++)
{
    #pragma omp parallel num_threads(numCPU) shared(xyQuadrant)
    {
        #pragma omp for
        for(int j = 0; j < xyQuadrant.size(); j++)
        {
            SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_HIGHEST);
            for(int k = 0; k < xyQuadrant[j].qPoints.size(); k++)
            {
                point* targetPoint = &xyQuadrant[j].qPoints[k];
                double strainX = 0;
                double strainY = 0;
                for (int n = 0; n < targetPoint->family.size(); n++)
                {
                    point* curPoint = targetPoint->family[n];
                    if(fabs(curPoint->x - targetPoint->x) < 0.000001)
                    {
                        strainX = 0;
                    }
                    else
                    {
                        double directionX = (curPoint->GetUX(i-1) - targetPoint->GetUX(i-1) + curPoint->x - targetPoint->x)/fabs((curPoint->GetUX(i-1) - targetPoint->GetUX(i-1) + curPoint->x - targetPoint->x));
                        double stretchX = fabs((curPoint->GetUX(i-1) - targetPoint->GetUX(i-1) + curPoint->x - targetPoint->x - fabs((curPoint->x - targetPoint->x))))/fabs(curPoint->x - targetPoint->x);
                        strainX += directionX*c*stretchX*targetPoint->volumeCorrect[n]*(targetPoint->surfaceCorrectX + curPoint->surfaceCorrectX)/2;
                    }
                    if(fabs(curPoint->y - targetPoint->y) < 0.000001)
                    {
                        strainY = 0;
                    }
                    else
                    {
                        double directionY = (curPoint->GetUY(i-1) - targetPoint->GetUY(i-1) + curPoint->y - targetPoint->y)/fabs((curPoint->GetUY(i-1) - targetPoint->GetUY(i-1) + curPoint->y - targetPoint->y));
                        double stretchY = fabs((curPoint->GetUY(i-1) - targetPoint->GetUY(i-1) + curPoint->y - targetPoint->y - fabs((curPoint->y - targetPoint->y))))/fabs(curPoint->y - targetPoint->y);
                        strainY += directionY*c*stretchY*targetPoint->volumeCorrect[n]*(targetPoint->surfaceCorrectY + curPoint->surfaceCorrectY)/2;
                    }
                }
                targetPoint->aX = strainX*deltaV/density;
                targetPoint->aY = strainY*deltaV/density;
                targetPoint->CalcVelocity(deltaT);
                targetPoint->CalcDisplacement(deltaT,i-1);
            }
        }
    }
}
On a final note: I use an i7-3770 processor (4 cores, 8 threads). When everything slows down I can see only 4 threads working, and the other 4 are reported as "CPU parked"!