5

UPDATE: To help clarify what I'm asking I have posted a little java code that gets the idea across.

A while ago I asked a question on how to get an algorithm to break down a set of numbers, the idea was to give it a list of numbers (1,2,3,4,5) and a total(10) and it would figure out all the multiples of each number that would add up to the total('1*10' or '1*1,1*2,1*3,1*4' or '2*5',etc..). It was the first programming exercise I ever did so it took me a while and I got it working but now I want to try to see if I can scale it. The person in the original question said it was scalable but I'm a bit confused at how to do it. The recursive part is the area I'm stuck at scaling the part that combines all the results(the table it is referring to is not scalable but applying caching I am able to make it fast)

I have the following algorithm(pseudo code):

//generates table
for i = 1 to k
    for z = 0 to sum:
        for c = 1 to z / x_i:
            if T[z - c * x_i][i - 1] is true:
                set T[z][i] to true

//uses table to bring all the parts together
function RecursivelyListAllThatWork(k, sum) // Using last k variables, make sum
    /* Base case: If we've assigned all the variables correctly, list this
     * solution.
     */
    if k == 0:
        print what we have so far
        return

    /* Recursive step: Try all coefficients, but only if they work. */
    for c = 0 to sum / x_k:
       if T[sum - c * x_k][k - 1] is true:
           mark the coefficient of x_k to be c
           call RecursivelyListAllThatWork(k - 1, sum - c * x_k)
           unmark the coefficient of x_k

I'm really at a loss at how to thread/multiprocess the RecursivelyListAllThatWork function. I know if I send it a smaller K( which is int of total number of items in list) it will process that subset but I don't know how to do ones that combine results across the subset. For example, if list is [1,2,3,4,5,6,7,8,9,10] and I send it K=3 then only the 1,2,3 get processed which is fine but what about if I need results that include 1 and 10? I have tried to modify the table(variable T) so only the subset I want are there but still doesn't work because, like the solution above, it does a subset but cannot process answers that require a wider range.

I don't need any code just if someone can explain how to conceptually break this recursive step to so other cores/machines can be used.

UPDATE: I still can't seem to figure out how to turn RecursivelyListAllThatWork into a runnable(I know technically how to do it, but I don't understand how to change the RecursivelyListAllThatWork algorithm so it can be ran in parallel. The other parts are just here to make the example work, I only need to implement runnable on RecursivelyListAllThatWork method). Here's the java code:

import java.awt.Point;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;


public class main
{
    public static void main(String[] args)
    {
        System.out.println("starting..");
        int target_sum = 100;
        int[] data = new int[] { 10, 5, 50, 20, 25, 40 };
        List T = tableGeneator(target_sum, data);
        List<Integer> coeff = create_coeff(data.length);
        RecursivelyListAllThatWork(data.length, target_sum, T, coeff, data);
    }

    private static List<Integer> create_coeff(int i) {
        // TODO Auto-generated method stub
        Integer[] integers = new Integer[i];
        Arrays.fill(integers, 0);
        List<Integer> integerList = Arrays.asList(integers);
        return integerList;
    }


    private static void RecursivelyListAllThatWork(int k, int sum, List T, List<Integer> coeff, int[] data) {
        // TODO Auto-generated method stub
        if (k == 0) {
            //# print what we have so far
            for (int i = 0; i < coeff.size(); i++) {
                System.out.println(data[i] + " = " + coeff.get(i));
            }

            System.out.println("*******************");
            return;
        }

        Integer x_k = data[k-1];
        //  Recursive step: Try all coefficients, but only if they work. 
        for (int c = 0; c <= sum/x_k; c++) { //the c variable caps the percent
            if (T.contains(new Point((sum - c * x_k), (k-1))))
            {
                    // mark the coefficient of x_k to be c
                    coeff.set((k-1), c);
                    RecursivelyListAllThatWork((k - 1), (sum - c * x_k), T, coeff, data);
                    // unmark the coefficient of x_k
                    coeff.set((k-1), 0);
            }

        }

    }

    public static List tableGeneator(int target_sum, int[] data) {
        List T = new ArrayList();
        T.add(new Point(0, 0));

        float max_percent = 1;
        int R = (int) (target_sum * max_percent * data.length);
        for (int i = 0; i < data.length; i++)
        {
            for (int s = -R; s < R + 1; s++)
            {
                int max_value = (int) Math.abs((target_sum * max_percent)
                        / data[i]);
                for (int c = 0; c < max_value + 1; c++)
                {
                    if (T.contains(new Point(s - c * data[i], i)))
                    {
                        Point p = new Point(s, i + 1);
                        if (!T.contains(p))
                        {
                            T.add(p);
                        }
                    }
                }
            }
        }
        return T;
    }
} 
Community
  • 1
  • 1
Lostsoul
  • 25,013
  • 48
  • 144
  • 239
  • What is the reason for "max_percent = 1" in tableGenerator method ? – Yves Martin Apr 06 '12 at 21:06
  • @YvesMartin its used to cap the results so nothing exceeds 100%, but ignore the specifics of tableGenerator Method, its only there to provide a table to show you how RecursivelyListAllThatWork works..So given that table, can RecursivelyListAllThatWork be threaded? – Lostsoul Apr 06 '12 at 22:01
  • Are you aware the table generator finds already solution to the problem without any recursion but by trying all combination ? This job is really costly and finally there is no use for RecursivelyListAllThatWork. I have publish a "true" recursive implementation as answer to the original question. You misunderstand what "T" means in the pseudo-code - it is a predicate which answer "true" if the answer is solve. – Yves Martin Apr 07 '12 at 05:49
  • @YvesMartin I thought the T meant if the value was solvable..and then I needed RecursivelyListAllThatWork in order to bring all the results together? – Lostsoul Apr 07 '12 at 18:33
  • OK I understand now what happens. The generated table is used to cut a lot of recursive calls which do not lead to a solution. – Yves Martin Apr 07 '12 at 20:40
  • I am curious about the aim of that algorithm - it is closed to crypto key attack and hash function collision attack. Why items may be negative ? Are these items all prime numbers ? – Yves Martin Apr 09 '12 at 20:03

3 Answers3

8

The general answer to multi-threading is to de-recursivate a recursive implementation thanks to a stack (LIFO or FIFO). When implementing such an algorithm, the number of threads is a fixed parameter for the algorithm (number of cores for instance).

To implement it, the language call stack is replaced by a stack storing last context as a checkpoint when the tested condition ends the recursivity. In your case it is either k=0 or coeff values matchs targeted sum.

After de-recursivation, a first implementation is to run multiple threads to consume the stack BUT the stack access becomes a contention point because it may require synchronization.

A better scalable solution is to dedicate a stack for each thread but an initial production of contexts in the stack is required.

I propose a mix approach with a first thread working recursively for a limited number of k as a maximum recursion depth: 2 for the small data set in example, but I recommend 3 if larger. Then this first part delegates the generated intermediate contexts to a pool of threads which will process remaining k with a non-recursive implementation. This code is not based on the complex algorithm you use but on a rather "basic" implementation:

import java.util.Arrays;
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class MixedParallel
{
    // pre-requisite: sorted values !!
    private static final int[] data = new int[] { 5, 10, 20, 25, 40, 50 };

    // Context to store intermediate computation or a solution
    static class Context {
        int k;
        int sum;
        int[] coeff;
        Context(int k, int sum, int[] coeff) {
            this.k = k;
            this.sum = sum;
            this.coeff = coeff;
        }
    }

    // Thread pool for parallel execution
    private static ExecutorService executor;
    // Queue to collect solutions
    private static Queue<Context> solutions;

    static {
        final int numberOfThreads = 2;
        executor =
            new ThreadPoolExecutor(numberOfThreads, numberOfThreads, 1000, TimeUnit.SECONDS,
                                   new LinkedBlockingDeque<Runnable>());
        // concurrent because of multi-threaded insertions
        solutions = new ConcurrentLinkedQueue<Context>();
    }


    public static void main(String[] args)
    {
        int target_sum = 100;
        // result vector, init to 0
        int[] coeff = new int[data.length];
        Arrays.fill(coeff, 0);
        mixedPartialSum(data.length - 1, target_sum, coeff);

        executor.shutdown();
        // System.out.println("Over. Dumping results");
        while(!solutions.isEmpty()) {
            Context s = solutions.poll();
            printResult(s.coeff);
        }
    }

    private static void printResult(int[] coeff) {
        StringBuffer sb = new StringBuffer();
        for (int i = coeff.length - 1; i >= 0; i--) {
            if (coeff[i] > 0) {
                sb.append(data[i]).append(" * ").append(coeff[i]).append("   ");
            }
        }
        System.out.println(sb.append("from ").append(Thread.currentThread()));
    }

    private static void mixedPartialSum(int k, int sum, int[] coeff) {
        int x_k = data[k];
        for (int c = sum / x_k; c >= 0; c--) {
            coeff[k] = c;
            int[] newcoeff = Arrays.copyOf(coeff, coeff.length);
            if (c * x_k == sum) {
                //printResult(newcoeff);
                solutions.add(new Context(0, 0, newcoeff));
                continue;
            } else if (k > 0) {
                if (data.length - k < 2) {
                    mixedPartialSum(k - 1, sum - c * x_k, newcoeff);
                    // for loop on "c" goes on with previous coeff content
                } else {
                    // no longer recursive. delegate to thread pool
                    executor.submit(new ComputePartialSum(new Context(k - 1, sum - c * x_k, newcoeff)));
                }
            }
        }
    }

    static class ComputePartialSum implements Callable<Void> {
        // queue with contexts to process
        private Queue<Context> contexts;

        ComputePartialSum(Context request) {
            contexts = new ArrayDeque<Context>();
            contexts.add(request);
        }

        public Void call() {
            while(!contexts.isEmpty()) {
                Context current = contexts.poll();
                int x_k = data[current.k];
                for (int c = current.sum / x_k; c >= 0; c--) {
                    current.coeff[current.k] = c;
                    int[] newcoeff = Arrays.copyOf(current.coeff, current.coeff.length);
                    if (c * x_k == current.sum) {
                        //printResult(newcoeff);
                        solutions.add(new Context(0, 0, newcoeff));
                        continue;
                    } else if (current.k > 0) {
                        contexts.add(new Context(current.k - 1, current.sum - c * x_k, newcoeff));
                    }
                }
            }
            return null;
        }
    }
}

You can check which thread has found outputted result and check all are involed: the main thread in recursive mode and the two thread from the pool in context stack mode.

Now this implementation is scalable when data.length is high:

  • the maximum recursion depth is limited to the main thread at a low level
  • each thread from the pool works with its own context stack without contention with others
  • the parameters to tune now are numberOfThreads and maxRecursionDepth

So the answer is yes, your algorithm can be parallelized. Here is a fully recursive implementation based on your code:

import java.awt.Point;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class OriginalParallel
{
    static final int numberOfThreads = 2;
    static final int maxRecursionDepth = 3;

    public static void main(String[] args)
    {
        int target_sum = 100;
        int[] data = new int[] { 50, 40, 25, 20, 10, 5 };
        List T = tableGeneator(target_sum, data);
        int[] coeff = new int[data.length];
        Arrays.fill(coeff, 0);
        RecursivelyListAllThatWork(data.length, target_sum, T, coeff, data);
        executor.shutdown();
    }

    private static void printResult(int[] coeff, int[] data) {
        StringBuffer sb = new StringBuffer();
        for (int i = coeff.length - 1; i >= 0; i--) {
            if (coeff[i] > 0) {
                sb.append(data[i]).append(" * ").append(coeff[i]).append("   ");
            }
        }
        System.out.println(sb.append("from ").append(Thread.currentThread()));
    }

    // Thread pool for parallel execution
    private static ExecutorService executor;
    static {
        executor =
            new ThreadPoolExecutor(numberOfThreads, numberOfThreads, 1000, TimeUnit.SECONDS,
                                   new LinkedBlockingDeque<Runnable>());
    }

    private static void RecursivelyListAllThatWork(int k, int sum, List T, int[] coeff, int[] data) {
        if (k == 0) {
            printResult(coeff, data);
            return;
        }
        Integer x_k = data[k-1];
        //  Recursive step: Try all coefficients, but only if they work. 
        for (int c = 0; c <= sum/x_k; c++) { //the c variable caps the percent
            if (T.contains(new Point((sum - c * x_k), (k-1)))) {
                    // mark the coefficient of x_k to be c
                    coeff[k-1] = c;
                    if (data.length - k != maxRecursionDepth) {
                        RecursivelyListAllThatWork((k - 1), (sum - c * x_k), T, coeff, data);
                    } else {
                        // delegate to thread pool when reaching depth 3
                        int[] newcoeff = Arrays.copyOf(coeff, coeff.length);
                        executor.submit(new RecursiveThread(k - 1, sum - c * x_k, T, newcoeff, data)); 
                    }
                    // unmark the coefficient of x_k
                    coeff[k-1] = 0;
            }
        }
    }

    static class RecursiveThread implements Callable<Void> {
        int k;
        int sum;
        int[] coeff;
        int[] data;
        List T;

        RecursiveThread(int k, int sum, List T, int[] coeff, int[] data) {
            this.k = k;
            this.sum = sum;
            this.T = T;
            this.coeff = coeff;
            this.data = data;
            System.out.println("New job for k=" + k);
        }

        public Void call() {
            RecursivelyListAllThatWork(k, sum, T, coeff, data);
            return null;
        }
    }

    public static List tableGeneator(int target_sum, int[] data) {
        List T = new ArrayList();
        T.add(new Point(0, 0));

        float max_percent = 1;
        int R = (int) (target_sum * max_percent * data.length);
        for (int i = 0; i < data.length; i++) {
            for (int s = -R; s < R + 1; s++) {
                int max_value = (int) Math.abs((target_sum * max_percent) / data[i]);
                for (int c = 0; c < max_value + 1; c++) {
                    if (T.contains(new Point(s - c * data[i], i))) {
                        Point p = new Point(s, i + 1);
                        if (!T.contains(p)) {
                            T.add(p);
                        }
                    }
                }
            }
        }
        return T;
    }
}
Yves Martin
  • 10,217
  • 2
  • 38
  • 77
  • wow..this looks really good. I'm going to play around with it and let you know. The results are actually dumped to another program so I'll just disable the prints for now and just test the speed it compiles everything then I'll use the print to test results. Thank you so much. – Lostsoul Apr 07 '12 at 18:35
  • There is 2 problems(slight) so far, one this doesn't not seem to compile from command line(I copy/paste it into a virtual machine and then run javac on it but I get "Note: MixedParallel.java uses unchecked or unsafe operations.") also when I test this in eclipse executor.shutdown(); does not exist.. executor exists but no option for shutdown. – Lostsoul Apr 07 '12 at 18:37
  • I know my code does not correspond exactly to the algorithm you have implemented but it should help you to implement the derecursion and parallel execution in your own logic - but this new version will require more memory that the existing one – Yves Martin Apr 07 '12 at 21:01
  • Thanks so much for posting. Its def. very interesting and different approach(I'm going to test this with with many variables). So the problem I posted is very close to the logic I'm using but not exactly, so in the old solution, there was a line "s - c * data[i]" where I modified the logic, there a simlair area? is it mixparticalsum? For example, say I want percentages instead or to limit the results(if I have negative values),etc..Not sure where to put my own logic.. I have the logic/code just need to study this a bit to understand how to integrate it. – Lostsoul Apr 07 '12 at 23:39
  • You may remove the "solutions" queue to reduce memory footprint... I really think you can apply the concept of limited recursive depth first and then paralell execution to your algorithm. But queues with contexts will consume memory ! And I think the next problem will be to parallelize the table generator and I have no idea how to do this because each new point may depends on previous points. – Yves Martin Apr 08 '12 at 07:18
  • I'm a bit confused..does your code generate table or use one for lookup? I can't see it from your code? If it helps, the code I posted in simplified but using small arraylist lookups and a few tricks I have been able to get the table to be generated in 20-50 ms per item, so large items do not take long at all. on the memory note, I have been thinking about this all night and I'll try to modify the code so it just dumps results as they are found(instead of storing in memory for so long), do you think that could work? – Lostsoul Apr 08 '12 at 14:41
  • No, my code does not generate table. By the way, you try to support negative items - but what about a small set with (+10, -5) ? In that case, there is an infinite number of solutions. Your table generation limits to [ -target_sum * data.length, +target_sum * data.length ], is it really relevant ? – Yves Martin Apr 09 '12 at 12:41
  • Yes, its possible. What I do is set limits of how many coeff's are possible so if target_sum is 10, then I put a limit(absolute number 10) of 100%, so 10*2 or -5*3 is not possible. I'm curious because I understand your code a bit better now, is the table generator even necessary in your solution or is it better to integrate it within your solution(I created a version of the table that has the same results but each item takes about 20-50ms to add and its not exponential) but nto sure if its good to integrate it? – Lostsoul Apr 09 '12 at 13:52
  • Another weird issue, when I run it sometimes I get different results, like targetsum=10 and data (5,1,3)..first time I ran it I didn't get any results using 5, then reran it and go it. – Lostsoul Apr 09 '12 at 16:07
  • May this issue comes from concurrent issue with System.out which can produce mixed lines from different threads ? It is important to do a single println call to output a result - what I did with a StringBuffer. – Yves Martin Apr 09 '12 at 19:37
  • The code I propose in my answer is just a "basic" example how to parallelize a recursive implementation (the original one come from here http://stackoverflow.com/questions/7265576/can-brute-force-algorithms-scale/10052151#10052151) – Yves Martin Apr 09 '12 at 19:41
  • I have updated my answer with a parallelized version of your original code. – Yves Martin Apr 09 '12 at 21:29
  • Thank you so much Yves. I am learning and your explanations and code have helped me understand alot. I knew how to multi-thread apps but didn't realize I could just wrap my code like that. Is there any downside of using second version(my specific code) versus your first solution? I understand your solution does one process first and then splits the remaining K over threads while my version just keeps working until its done. Would it make sense if I took your MixedParallel and implemented my table into it? or is it already the same as the OriginalParallel? – Lostsoul Apr 10 '12 at 00:39
  • I guess what I'm trying to ask, is your first solution worked well and structurally changed my code but your second solution seems to wrap my code into a callable. Is there a difference in outcome and performance? between the two? – Lostsoul Apr 10 '12 at 00:41
  • my benchmarks(just timing both recursive functions) shows that your mixed approach is significantly faster(mixed = 1620 ms versus Original= 115290 ms for data ={1,2,3,4,5,6,7,8} and sum=100). I'm going to try to integrate the table into your code and see if it can be done even faster(since it can break down results without having to scan for possibilities since the table provides what's possible and what's not) – Lostsoul Apr 10 '12 at 02:41
  • The "MixedParallel" only works with non-negative items and is less scalable than the "OriginalParallel" because of memory required to maintain contexts in stacks. By the way, I have the intuition than some work on "table generator" method (replace for loop by while, keep track of coeff in the same process, add a test for target sum) can get solutions faster without recursion - but then it may be impossible to multi-thread. – Yves Martin Apr 10 '12 at 06:03
  • The tablegenerator is okay to me because I use alot of caching of previous results and keeping arrays as small as possible so it only takes a few ms to do each item. I think 3000 items takes like 10 minutes. My fear is the RecursivelyListAllThatWork doesn't scale very well because the more items in it the more exponential the processing becomes. I'll test it more and the multi-threading surely helps but it seems MixedParallel doesn't grow exponentially as more data is added. – Lostsoul Apr 10 '12 at 15:08
  • Thanks again Yves. I'm still converting my code to follow your threading logic but ran into a problem. I had a few control methods that limited the results for example in the old program I had something like this to limit results to 3 items: `int occurrences = Collections.frequency(Arrays.asList(newcoeff), 0); if (newcoeff.length-occurrences >= 2) { current.k = 0; current.sum = 0; return null;` but I can't seem to figure out where to put it in your code. I tried to put it within `} else if (current.k > 0) {` – Lostsoul Apr 12 '12 at 05:15
  • I have a few other control methods that basically once triggered restart the process so its skipped over, do you know where I can put them in the callable method? – Lostsoul Apr 12 '12 at 05:17
  • Another question, I've converted the code over but your code as well as my application of it, I'm running into major memory problems. As the variables hit over 100 the memory gets full and the program grinds to a halt. I disabled solutions as well as removed `array.copyof`(just to test, when I do this the results are all incorrect) but still I can't seem to get the memory under control in the mixedpararell program. I'm pretty new to Java but is this a memory leak(I can't seem to pinpoint its location)? – Lostsoul Apr 12 '12 at 19:22
  • A memory leak is usually un-intentional. In your case, you need to keep track of operations left to do... You should reconsider your requirements. Your initial multi-loop code does not require memory but it is slow and not easily multithreadable... – Yves Martin Apr 12 '12 at 20:53
  • I have basically been refactoring my entire code to follow the logic you gave me in your code(mixed). I think I figured out how to reduce the results basically add the conditions to the if statement(I added a method that checked various factors and if one of them is met then it prevents further results from being added). I'm still trying to figure out if its a memory leak or not, I don't see memory usage going down yet the context queue gets very large with more data. I'm going to experiment with using an external queue and see if it fixes the memory problem. I'll keep you posted. – Lostsoul Apr 13 '12 at 03:01
1

1) Instead of

if k == 0:
    print what we have so far
    return

you can check to see how many coefficients are non-zero; if that count is greater than a certain threshold (3 in your example), then just don't print it. (Hint: this would be closely related to the

mark the coefficient of x_k to be c

line.)

2) Recursive functions are generally exponential in nature, which means that as you scale higher, the runtime will grow sharply larger.

With that in mind, you can apply multithreading to both calculating the table and the recursive function.

When considering the table, think about which parts of the loop affect each other and must be done in sequence; the converse, of course, is finding which parts don't affect each other and can be run in parallel.

As for the recursive function, your best bet would probably be to apply the multithreading to the branching part.

hehewaffles
  • 582
  • 5
  • 17
  • Thank you for helping. In regards to part one, I had a similar logic before but was not able to get it to work successfully. I added a counter to count everytime a x_k was changed to C and then in if K ==0 I added or count = 2. This did not appear to limit the results at all(although it did impact the results because the results with only 1 variable were missing. In regards to point 2, I got the table working fast(but I cannot scale it because its dependent on previous steps) so I'm working on the recursive function. What does 'branching part' mean? is that the for loop? – Lostsoul Mar 27 '12 at 03:09
  • Anytime the recursive function calls itself, it creates another branch, so yes, the for loop is the branching part. – hehewaffles Mar 27 '12 at 15:31
  • Thanks I was wrong, your logic was correct in limiting the results. I failed when I tried because I didn't realize my counter was not getting reset in the recursion. My solution is not perfect but I think I have the foundation to play around with. Thanks for clearing it up. – Lostsoul Mar 27 '12 at 19:54
  • I updated the question to focus only on the threading because I still don't understand how to break it down. If I must thread the branch than fine but I'm sort of aiming at scaling the recursive function itself and maybe sending it different data or something so each function works on different parts(but I've hit a problem in my logic). – Lostsoul Mar 27 '12 at 19:56
1

They key to making this multithreaded is just to make sure that you don't have unnecessary global data structures, like your "marks" on the coefficients.

Let's say you have K numbers n[0] ... n[K-1] in your table and the sum you want to reach is S. I assume below that the array n[] is sorted from smallest to largest number.

A simple enumeration algorithm is here. i is index to the list of numbers, s is the current sum already built, and cs is a list of coefficients for the numbers 0 .. i - 1:

function enumerate(i, s, cs):
  if (s == S):
     output_solution(cs)
  else if (i == K):
     return // dead end
  else if ((S - s) < n[i]):
     return // no solution can be found
  else:
     for c in 0 .. floor((S - s) / n[i]): // note: floor(...) > 0
        enumerate(i + 1, s + c * n[i], append(cs, c))

To run the process:

 enumerate(0, 0, make_empty_list())

Now here are no global data structures anymore, except the table n[] (constant data), and 'enumerate' also does not return anything, so you can change the recursive call to run in its own thread at your will. E.g. you can spawn a new thread to a recursive enumerate() call unless you have too many threads running already, in which case you wait.

Antti Huima
  • 25,136
  • 3
  • 52
  • 71