3

I have a function that looks something like the code below. Its purpose is to take triangle facets one at a time from an array of points where each three points is one facet, and tessellate them, replacing the facet with a list of smaller facets whose side lengths don't exceed nodeSize.

Naturally, this function is time consuming for any realistic facet mesh. I'd like to refactor it to use some coarse parallelization. However, Parallel.For doesn't appear to have a way to step through indices in an array at intervals while preserving the index number.

Bearing in mind that the SplitTriangle function inside the loop is computationally not conducive itself to parallelization, how could I refactor this function?

Protected Shared Function SplitTriangles(Points As IEnumerable(Of Point3D), nodeSize As Single) As List(Of Point3D)
    Dim resultList As New List(Of Point3D)

        For i As Integer = 0 To Points.Count - 1 Step 3
            resultList.AddRange(SplitTriangle(Points(i), Points(i + 1), Points(i + 2), nodeSize * 4))
        Next

    Return resultList
End Function
Rob Perkins
  • 3,088
  • 1
  • 30
  • 51
  • http://stackoverflow.com/questions/7142446/parallel-for-step-size – Steve Nov 27 '12 at 21:37
  • Not a duplicate question, user574632. I'm also asking what to do about the fact that List isn't thread-safe. – Rob Perkins Nov 27 '12 at 22:47
  • Are you willing to accept a C# solution to the problem (it doesn't have to be implemented in C#, but I'd rather put it in C# and then convert it to VB.NET, easier for me initially) – casperOne Nov 30 '12 at 18:32
  • Also, you say "an array of points", but pass an `IEnumerable`; these are two separate things. Are you working with an `ICollection`? If so, it provides some more options. It seems you must be, because your call is to `Enumerable.Count`, which means that if it *wasn't* an `ICollection`, you'd iterate through all the points just getting the count, and then iterate through them again. – casperOne Nov 30 '12 at 18:35
  • I am willing to accept a C# solution. "Array of points" is meant in the conversational sense. – Rob Perkins Nov 30 '12 at 18:58
  • The incoming IEnumerable is `List`. – Rob Perkins Nov 30 '12 at 19:05
  • I marked an answer but I'd still be interested in your ideas, @casperOne – Rob Perkins Dec 03 '12 at 23:00

3 Answers3

2

I think the easiest solution here is to first go over the points and split them into an array of 3-point groups. Then you can use Parallel.For on that array.

EDIT: Since you have millions of points and do this all the time, you should do something else.

First, make sure your Points container allows easy random access (use an array or a List). Then do this:

  • Allocate the resultList with the proper size.
  • Divide Points into several parts ('several' can be hard to estimate, you should play with this a little bit). Say your list has 12,000,000 points, so resultList is 4,000,000 element long. And say you decide 4 parts is the optimal split.
    • Each part has to be consecutive (0-3M, 3M-6M, 6M-9M, 9M-12M).
    • Finding the optimal split is not easy, but trivial splits might prove good enough, so don't worry about it for now.
  • Have 4 threads, each processing one part (you can use the Task API, it would make the code clearer, in my opinion, than a Parallel.For in this case.

NOTE about thread safety:

I'm not 100% convinced a List<Point> is thread safe when you use it as a fixed-size array. It should be, but if you want to be 100% sure use an array.

casperOne
  • 73,706
  • 19
  • 184
  • 253
zmbq
  • 38,013
  • 14
  • 101
  • 171
  • My question would then be about how to parallelize the loop that builds the facet array you're talking about, since that is another one of those loops elsewhere in the application I'm writing. – Rob Perkins Nov 27 '12 at 21:31
  • I wouldn't bother with this, unless you have millions of points, and you do this *all the time*. – zmbq Nov 27 '12 at 21:33
  • Could I let the TPL partitioner choose the correct number of parts? – Rob Perkins Nov 28 '12 at 10:27
  • I doubt it. It depends on the amount of cores on your machine, and also on cache sizes. Try a few settings to get the feel of it, chances are you won't need to get into cache size issues. – zmbq Nov 28 '12 at 12:00
  • Hmm. I expected that TPL had an optimizing hill climber for thread allocation. In any case, I'm targeting a "best configuration" which has six cores and i7 or Xeon cache sizes, and testing on a 2-proc VM with four cores per proc. I'm actually free to specify that people should lump it if they don't have a very capable kit. – Rob Perkins Nov 28 '12 at 19:09
  • TPL can't possibly know if the overhead of a new task isn't worth creating a new task. It also has no knowledge of memory usage patterns and how they affect multithreaded performance. – zmbq Nov 28 '12 at 20:00
  • TPL doesn't know, but the ThreadPool still uses optimization for scheduling. At least that's what I saw in the documentation. I don't think I'm at the level where cache pagefaults can be avoided; all my data sets outstrip cache by several orders of magnitude. – Rob Perkins Nov 28 '12 at 22:12
1

Your code will be much simpler if instead of passing an IEnumerable of Points that, by convention, you treat as triangles, you pass the triangles directly.

If you convert the points to triangles, you can write the following with PLINQ:

    Function SplitTriangles(triangles As IEnumerable(Of Triangle3D), nodeSize As Single) As List(Of Triangle3D)
    Dim resultList As New List(Of Point3D)

    Dim results = (From triangle In triangles.AsParallel()
                   From newTriangle In SplitTriangle(triangle.A, triangle.B, triangle.C)
                   Select newTriangle).ToList()

    Return results
End Function

AsParallel doesn't preserver ordering but you can enforce this by adding .AsOrdered after AsParallel

Using Parallel.For requires the use of the overload that accepts local initializers and finalizers, to collect the results of each computation. The final code is much more complex for no benefit in this case.

The problem with IEnumerable(Of Point3) is that there is a very strong relation between the points of each triplet. Parallel.For/Foreach are suited for lists where each item is independent of the others.

You can convert the initial list of points to a list of triangles using an adaptation of Marc Gravell's Partition method here, to group points in triplets and return a triangle instead of an IEnumerable.

A simpler but less generic solution would be to create an iterator that returns Triangles from Points:

 Iterator Function Triangulize(points As IEnumerable(Of Point3D)) As IEnumerable(Of Triangle3D)

    Dim count As Integer = 0

    Dim buffer(3) As Point3D

    For Each point As Point3D In points
        buffer(count) = point
        count += 1
        If count = 3 Then
            Yield New Triangle3D(buffer(0), buffer(1), buffer(2))
            count = 0
        End If
    Next

End Function
Community
  • 1
  • 1
Panagiotis Kanavos
  • 120,703
  • 13
  • 188
  • 236
  • What you've done here is moved the stepping into a custom iterator. In order to use it in places like a WPF I would still need to reverse the `Triangulize`, because they all take collections of `Point` or `Point3D` Is that the best pattern for coarse parallelization in most cases? – Rob Perkins Nov 28 '12 at 21:10
0

You could use the Enumerable.Range() function to generate the indices for you. I'm not overly familiar with VB.NET, so I'm going to write this in C#.

Enumerable.Range(0, Points.Count / 3).AsParallel().ForAll( loopVar => {
    var i = loopVar * 3;
    resultList.AddRange(SplitTriangle(Points(i), Points(i + 1), Points(i + 2), nodeSize * 4))
});

UPDATE I think this version would be thread safe, but you should check to ensure that it is and that the results are in the correct order.

resultList = Enumerable.Range(0, Points.Count / 3).AsParallel().SelectMany( loopVar => {
    var i = loopVar * 3;
    return SplitTriangle(Points(i), Points(i + 1), Points(i + 2), nodeSize * 4);
});
Brian Reischl
  • 7,216
  • 2
  • 35
  • 46