I'm trying to speed up an algorithm that I have written in C#. One of the first things I thought of was to make it parallel.
The algorithm has to run over a large number (millions) of 2D segments, and each segment is independent of the others.
Here is the code:`
private void DoMapping(Segment[] image, CancellationToken ct, int numTasks = 3)
{
    long time = Environment.TickCount;
    LaserOutput = new List<Vector3[]>();
    NormalsOutput = new List<Vector3>();
    Task<Tuple<List<Vector3[]>, List<Vector3>>>[] tasks = new Task<Tuple<List<Vector3[]>, List<Vector3>>>[numTasks];
    int perTaskSegments = image.Length / numTasks;
    for (int taskIndex = 0; taskIndex < tasks.Length; taskIndex++)
    {
        int nseg = perTaskSegments * (taskIndex + 1) + (taskIndex == tasks.Length - 1 ? image.Length % tasks.Length : 0);
        int from = perTaskSegments * taskIndex;
        Tuple<int, int, Segment[], CancellationToken> obj = new Tuple<int, int, Segment[], CancellationToken>(from, nseg, image, ct);
        tasks[taskIndex] = Task.Factory.StartNew(DoComputationsAction, obj, CancellationToken.None, TaskCreationOptions.LongRunning, TaskScheduler.Default);
    }
    Task.WaitAll(tasks);
    for (int taskIndex = 0; taskIndex < tasks.Length; taskIndex++)
    {
        LaserOutput.AddRange(tasks[taskIndex].Result.Item1);
        NormalsOutput.AddRange(tasks[taskIndex].Result.Item2);
    }
}

private Tuple<List<Vector3[]>, List<Vector3>> DoComputationsAction(object obj)
{
    Tuple<int, int, Segment[], CancellationToken> parm = obj as Tuple<int, int, Segment[], CancellationToken>;
    List<Vector3[]> tmpLaser = new List<Vector3[]>();
    List<Vector3> tmpNormals = new List<Vector3>();
    bool errorOccured = false;
    for (int segCounter = parm.Item1; segCounter < parm.Item2 && !errorOccured; segCounter++)
    {
        if (parm.Item4.IsCancellationRequested)
            break;
        try
        {
            var res = SplitOverMap(parm.Item3[segCounter], (string error) => {
                errorOccured = true;
                MessageBox.Show(error, "An error occured", MessageBoxButtons.OK, MessageBoxIcon.Error);
                Logger.Log("An error occured while mapping data to 3d.");
            });
            if (res != null)
            {
                tmpLaser.AddRange(res.Item1);
                tmpNormals.AddRange(res.Item2);
            }
        }
        catch (Exception e)
        {
            Logger.Log("An error occured while calculating 3d map. Skipping polyline." + e.Message);
        }
    }
    return new Tuple<List<Vector3[]>, List<Vector3>>(tmpLaser, tmpNormals);
}`
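For comparison, the manual chunking above (a `from` index and an exclusive end index per task) could probably also be written with the TPL's built-in range partitioner. This is only a minimal sketch under stated assumptions: `Work` is a hypothetical stand-in for `SplitOverMap`, plain `int` values replace `Segment` and `Vector3`, and the class and method names are made up for illustration. Unlike the task-per-chunk version, it does not preserve the input order of the results.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

static class RangePartitionSketch
{
    // Hypothetical stand-in for SplitOverMap: one input, one output.
    static int Work(int segment) => segment * 2;

    public static List<int> Run(int[] segments, CancellationToken ct)
    {
        var results = new List<int>(segments.Length);
        var gate = new object();
        var options = new ParallelOptions { CancellationToken = ct };

        Parallel.ForEach(
            // Splits [0, Length) into Tuple<int, int> ranges: Item1 inclusive, Item2 exclusive.
            Partitioner.Create(0, segments.Length),
            options,
            // Each worker gets its own buffer, so the hot loop takes no lock.
            () => new List<int>(),
            (range, state, local) =>
            {
                for (int i = range.Item1; i < range.Item2; i++)
                    local.Add(Work(segments[i]));
                return local;
            },
            // Merge each worker's buffer into the shared list when that worker finishes.
            local => { lock (gate) results.AddRange(local); });

        return results;
    }

    static void Main()
    {
        var input = new int[1000];
        for (int i = 0; i < input.Length; i++) input[i] = i;
        Console.WriteLine(Run(input, CancellationToken.None).Count);
    }
}
```

With this shape the number of workers is chosen by the scheduler rather than by a `numTasks` parameter, and cancellation surfaces as an `OperationCanceledException` instead of a silent `break`.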
Inside SplitOverMap, a query against a spatial data structure (QTree) is performed, followed by some computations.
No locks are taken at any point during the process, and no disk I/O is involved.
Do you have any suggestions on what could be keeping CPU usage at only 40-60%?
I have also tried changing numTasks to 4, 6 and 8, with no major difference.
I suspect the GC, but there is not a whole lot I can do to prevent it from running.
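One way to test the GC hypothesis would be to count collections around the run and check which GC mode the process uses. The following is a rough sketch, not something taken from my actual code: `GcProbe` and `Measure` are hypothetical names, and the allocation loop in `Main` is only a placeholder for the real `DoMapping` call.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Runtime;

static class GcProbe
{
    // Runs the given work and returns how many gen0/gen1/gen2 collections happened during it.
    public static (int Gen0, int Gen1, int Gen2, TimeSpan Elapsed) Measure(Action work)
    {
        int gen0 = GC.CollectionCount(0);
        int gen1 = GC.CollectionCount(1);
        int gen2 = GC.CollectionCount(2);
        var sw = Stopwatch.StartNew();

        work();

        sw.Stop();
        return (GC.CollectionCount(0) - gen0,
                GC.CollectionCount(1) - gen1,
                GC.CollectionCount(2) - gen2,
                sw.Elapsed);
    }

    static void Main()
    {
        // Workstation vs. server GC can matter for allocation-heavy parallel code.
        Console.WriteLine($"Server GC: {GCSettings.IsServerGC}, latency mode: {GCSettings.LatencyMode}");

        // Placeholder workload: many short-lived allocations.
        var result = Measure(() =>
        {
            for (int i = 0; i < 100000; i++)
            {
                var tmp = new List<double[]> { new double[3], new double[3] };
                GC.KeepAlive(tmp);
            }
        });

        Console.WriteLine($"gen0={result.Gen0} gen1={result.Gen1} gen2={result.Gen2} elapsed={result.Elapsed}");
    }
}
```

If the gen0 and gen1 counts are high relative to the elapsed time, allocation pressure from the temporary lists and tuples is a plausible cause of the idle CPU time.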
EDIT: By reducing the memory usage of some of the classes I have managed to improve CPU usage a little; it now runs at around 70%.
In addition, raising the level threshold of the QuadTree gave a substantial performance improvement.