
I have a large, dynamically generated tree, in which I am interested only in the leaves. As the generation of new nodes depends only on the parent node, I want to do as much as possible in parallel, so here is my plan:

  • Start n threads
  • Each thread takes one node from the top of a ConcurrentStack
  • Each thread generates the child nodes and puts them on the ConcurrentStack, unless the node is a leaf (has reached the maximum depth of the tree)
  • Go back to step 2 until the stack is empty and all threads are waiting; then the threads should end.
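The four steps above can be sketched roughly as follows. This is only a sketch in Java (the same primitives exist in .NET as `ConcurrentStack<T>.TryPop` and `Interlocked.Increment`); the class and method names are made up, and a `pending` counter is added because "stack empty and all threads waiting" is otherwise hard to detect reliably:

```java
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.atomic.AtomicLong;

public class TreeWalk {
    static long countLeaves(int branching, int maxDepth, int nThreads) throws InterruptedException {
        ConcurrentLinkedDeque<Integer> stack = new ConcurrentLinkedDeque<>(); // holds node depths
        AtomicLong pending = new AtomicLong(1);  // nodes pushed but not yet fully processed
        AtomicLong leaves = new AtomicLong();
        stack.push(0);                           // the root sits at depth 0

        Thread[] workers = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            workers[i] = new Thread(() -> {
                while (pending.get() > 0) {       // step 4: run until no work is left anywhere
                    Integer depth = stack.poll(); // step 2: take one node from the stack
                    if (depth == null) continue;  // stack momentarily empty -> retry
                    if (depth == maxDepth) {
                        leaves.incrementAndGet(); // a leaf: count it, generate nothing
                    } else {
                        for (int c = 0; c < branching; c++) { // step 3: push the children
                            pending.incrementAndGet();
                            stack.push(depth + 1);
                        }
                    }
                    pending.decrementAndGet();    // this node is fully processed
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) t.join();
        return leaves.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countLeaves(3, 4, 4)); // full tree, branching 3, depth 4: 81 leaves
    }
}
```

Note that idle workers busy-wait on an empty stack here; a real implementation would more likely hand subtrees to a work-stealing pool instead of spinning.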

Well, since I know the depth and branching factor of the tree, I can calculate the number of nodes in the tree, so I want to display a progress bar.
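For a full tree with branching factor b and depth d (root at depth 0), the node count is the geometric series 1 + b + b² + … + b^d = (b^(d+1) − 1)/(b − 1). A small helper (illustrative name) to get the progress bar's total:

```java
public class NodeCount {
    // Full tree with branching factor b and depth d (root = depth 0):
    // total = 1 + b + b^2 + ... + b^d = (b^(d+1) - 1) / (b - 1)
    static long totalNodes(long b, int d) {
        long total = 0, level = 1;
        for (int i = 0; i <= d; i++) { total += level; level *= b; }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalNodes(3, 4)); // prints 121 (81 of which are leaves)
    }
}
```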

To do this, I thought about using a common static variable that I lock, increment, and unlock each time. In the display thread I think I don't need to lock it, as it is not critical to have the latest number and I don't base any calculations on it. But somehow I have a bad feeling about this. Is my plan safe? I am quite new to parallel computing.
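A full lock/increment/unlock cycle isn't needed for a counter: a single atomic increment does the same job with far less overhead (in .NET that is `Interlocked.Increment`). A Java sketch with `AtomicLong`, using made-up names and a dummy workload:

```java
import java.util.concurrent.atomic.AtomicLong;

public class ProgressCounter {
    // shared progress counter: one atomic increment per processed node, no explicit lock
    static final AtomicLong processed = new AtomicLong();

    static long work(int threads, int nodesPerThread) throws InterruptedException {
        processed.set(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < nodesPerThread; n++) {
                    processed.incrementAndGet(); // "one more node done"
                }
            });
            workers[i].start();
        }
        // a display thread may call processed.get() at any time while this runs;
        // a slightly stale value is harmless for a progress bar
        for (Thread t : workers) t.join();
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(work(4, 100_000)); // prints 400000
    }
}
```

Reading the counter from the display thread without a lock is fine here, because an atomic read can never observe a torn or half-written value, only a slightly outdated one.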

New approach: I know how many threads are running, and I would approximate the number of processed nodes by multiplying the number of threads by the count of nodes that one of those threads has processed. The result wouldn't be accurate, but maybe accurate enough, and it would save me from locking a common variable.
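A variation on this idea gives an exact count while still avoiding a shared lock: give every worker its own counter and let the display thread sum them on demand. A hedged Java sketch (names illustrative; in .NET, per-thread slots read with `Interlocked.Read` or `Volatile.Read` would play the same role):

```java
import java.util.concurrent.atomic.AtomicLong;

public class PerThreadCounters {
    // one counter slot per worker: each thread only ever writes its own slot,
    // so there is no write contention; AtomicLong gives the reader visibility
    static long run(int nThreads, int nodesPerThread) throws InterruptedException {
        AtomicLong[] counts = new AtomicLong[nThreads];
        Thread[] workers = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            AtomicLong mine = counts[i] = new AtomicLong();
            workers[i] = new Thread(() -> {
                for (int n = 0; n < nodesPerThread; n++) mine.incrementAndGet();
            });
            workers[i].start();
        }
        for (Thread t : workers) t.join();
        long total = 0;                    // the display thread sums the slots on demand
        for (AtomicLong c : counts) total += c.get();
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(4, 50_000)); // prints 200000 - exact, no shared lock
    }
}
```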

My main focus lies on monitoring the progress and counting the processed nodes, though I appreciate all comments that help to improve my knowledge and skills regarding parallel computation.

Thanks in advance, -M-

  • This plan is unlikely to result in significant parallelism. There are two main problems: one, the overhead of managing the concurrent stack is likely bigger than whatever work you spend computing, and two, it interferes with memory caching. Locking on a common static variable like this will just be another bottleneck in your application. What are you actually trying to do? Do you *need* parallelism? How much work is associated with each node (I assume the answer is "hardly any")? How many nodes are there? – Luaan Aug 25 '15 at 09:27
  • possible duplicate of [Use threads when traversing a tree](http://stackoverflow.com/questions/10032189/use-threads-when-traversing-a-tree) – Richard Schneider Aug 25 '15 at 09:32
  • The setting is: moving between a number of waypoints, and calculating the distance (distance so far + distance between the two last waypoints) and the cost (cost so far * 1/distance between the two last waypoints), all of this with a branching factor around 100-200 (which is why I want to use a depth-first search, to keep the tree in memory as small as possible). – derM - not here for BOT dreams Aug 25 '15 at 09:36
  • @RichardSchneider: The main focus of my question is, how to monitor the progress/how to count the number of processed nodes, though thoughts on the rest are welcome, too. So thank you very much for the link to this interesting q/a! – derM - not here for BOT dreams Aug 25 '15 at 09:42
  • ConcurrentStack is thread safe but not parallel. Only one thread can write to it at the same time. Efficient parallelism requires little inter-thread coordination. – usr Aug 25 '15 at 10:03
  • Can you comment on the structure of the tree? Is it regular? Is it super deep? How much work in seconds will you perform on each node? – usr Aug 25 '15 at 10:04
  • @usr: It is very broad rather than very deep, though it is desirable to get it as deep as possible. The max depth the user may choose is limited by the memory, as the high branching factor makes the tree grow really fast. The work per node is not too much. It is the creation of x nodes (x ~ 200), and each creation involves 2 assignments, 1 division, 1 multiplication and 1 addition. Larger values for x might be possible, depending on the patience of the user. – derM - not here for BOT dreams Aug 25 '15 at 10:11
  • OK, there is a simple strategy to run such trees. Split off new tasks for each node up to a certain max depth (e.g. 3). Then, run the subtree sequentially. There is almost zero coordination cost for this. You also need to enable server GC, or else all those allocations will not scale. The workstation GC does not scale. – usr Aug 25 '15 at 10:25

0 Answers