Start multiple threads and wait only for one to finish to obtain results

Question

Assume I have a function double someRandomFunction(int n) that takes an integer and returns a double. It is random in the sense that it tries random things to come up with a solution, so even when you run it with the same arguments, it sometimes takes 10 seconds to finish and other times 40 seconds.

The double someRandomFunction(int n) function itself is a wrapper around a black-box function. So someRandomFunction takes a while to complete, but I have no control over the main loop of the black box. Hence I can't really check a flag variable within the thread, because the heavy computation happens inside the black-box function.

I would like to start 10 threads calling that function, and I am interested only in the result of whichever thread finishes first. I don't care which one it is; I only need one result from these threads.

I found the following code:

  std::vector<boost::future<double>> futures;
  for (...) {
    auto fut = boost::async([i]() { return someRandomFunction(2); });
    futures.push_back(std::move(fut));
  }

  for (...) {
    auto res = boost::wait_for_any(futures.begin(), futures.end());
    std::this_thread::yield();
    std::cout << res->get() << std::endl;
  }

This is the closest to what I am looking for, but I still can't see how to make my program terminate the other threads as soon as one thread returns a solution.

I would like to wait for one to finish and then carry on with the result of that one thread to continue my program execution (i.e., I don't want to terminate my program after I obtain that single result, but I would like to use it for the remaining program execution.).

Again, I want to start up 10 threads calling the someRandomFunction and then wait for one thread to finish first, get the result of that thread and stop all the other threads even though they didn't finish their work.

The simplest solution would probably be to have a global flag (e.g. a `std::atomic`) that is set when you get your first result. Each thread has to regularly check this flag and terminate if it is set. — Max Langhof, Dec 11 '18 at 15:05
Maybe add a std::atomic flag and set it to true once the first thread is done; check this flag in all threads and std::terminate if necessary. edit: Ninja'd ;) — Fubert, Dec 11 '18 at 15:06
Thanks, but why should I check the atomic flag within the threads? I don't need them to know that someone finished, I just need to get a result from a single one and then "kill" the others. I am really then looking for a way to get a result from a sequence of threads and kill the others. I assume I can kill the threads from the main thread? — Phrixus, Dec 11 '18 at 15:10
@John you want to check the flag inside your thread function regularly so it can terminate itself. — Fubert, Dec 11 '18 at 15:11
I had to clarify that the function is quite like a black box in the sense that the time-consuming computation happens within black-box libraries which I don't have control over. So adding flag checking is not feasible. — Phrixus, Dec 11 '18 at 15:15
@John Because it is clean. Killing threads may seem easy but I doubt you get many guarantees regarding _any_ data that they interacted with. Including any futures or similar. I would not consider killing threads from outside a solution, even if you solemnly swear that they affect no outside data (how do you know the implementation does not?). — Max Langhof, Dec 11 '18 at 15:15
@John Even if it's a blackbox, you can wrap that function in your own 'launcher' where all additional logic is handled. — Fubert, Dec 11 '18 at 15:16
@Fubert I see the issue though - if the main work is one long-running library call, adding a flag check before and after won't help. — Max Langhof, Dec 11 '18 at 15:16
Exactly @MaxLanghof. I can add checking but the check will happen either before the start of the heavy computation or after. — Phrixus, Dec 11 '18 at 15:18
@John Are you sure that calling the functions in parallel is safe? They are evidently not pure if multiple calls with the same argument can return different results, so they must modify some state. Does the library synchronize these modifications? Even if it does, you really won't get a guarantee that killing a thread in the middle of that black box calculation keeps everything intact (unless you have the library sources). I can easily imagine it leading to a deadlock. — Max Langhof, Dec 11 '18 at 15:20
Yes, they are. I create local copies of the data that I pass to the library so they are eventually playing with these data in parallel but isolated as each thread has a copy of the original data (i.e each thread is not manipulating the original data). — Phrixus, Dec 11 '18 at 15:21
@John I think you're out of luck then. Afaik there is no (portable, standard c++) way to non-cooperatively kill a single thread. See [this question](https://stackoverflow.com/a/12207835/10729041) also. Edit: Maybe you can spawn multiple processes instead of threads, idk — Fubert, Dec 11 '18 at 15:23
@John You could always abandon the threads and just let them finish, which would be wasteful but possible. — Fubert, Dec 11 '18 at 15:31
Just another thought regarding multiple processes: Maybe you can outsource the computation to another process (which handles all the multi-threading) and then terminate that process once the first result is in. You need some way to communicate (return-values/sockets/etc.), but you could safely spawn/terminate this process on demand. — Fubert, Dec 11 '18 at 16:06
Yes, spawning a single sub-process is the way to go. It's a bit of work, but I don't see an easier solution. — TonyK, Dec 11 '18 at 16:36
Thank you, everyone, for your time. I managed to solve it. See comment under the accepted answer. — Phrixus, Dec 12 '18 at 09:32
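
The multi-process idea raised in the comments above could be sketched roughly as follows. This is only an illustration under stated assumptions: it is POSIX-only (fork, pipe, kill), it forks one child per attempt rather than a single helper process, and `slow_blackbox` and `first_result_from_processes` are hypothetical names standing in for the real library call and its launcher.

```cpp
#include <cassert>
#include <csignal>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for the uninterruptible black box.
double slow_blackbox(int n) {
    usleep((n + 1) * 50 * 1000);  // pretend to work for (n + 1) * 50 ms
    return n * 2.0;
}

// Fork `n_procs` children, return the first result written to the pipe,
// then kill and reap every child. Returns -1.0 if nothing could be read.
double first_result_from_processes(int n_procs) {
    assert(n_procs > 0);
    int fds[2];
    if (pipe(fds) != 0) return -1.0;

    std::vector<pid_t> children;
    for (int i = 0; i < n_procs; ++i) {
        pid_t pid = fork();
        if (pid == 0) {  // child: compute, report, exit without flushing stdio
            close(fds[0]);
            double r = slow_blackbox(i);
            // writes no larger than PIPE_BUF are atomic, so results never interleave
            ssize_t w = write(fds[1], &r, sizeof r);
            _exit(w == static_cast<ssize_t>(sizeof r) ? 0 : 1);
        }
        if (pid > 0) children.push_back(pid);
    }
    close(fds[1]);  // parent only reads

    double result = -1.0;
    // blocks until the first child has written its result (or all have exited)
    ssize_t got = read(fds[0], &result, sizeof result);
    close(fds[0]);

    // killing a process is safe in a way killing a thread is not:
    // the OS reclaims everything the child owned
    for (pid_t pid : children) kill(pid, SIGKILL);
    for (pid_t pid : children) waitpid(pid, nullptr, 0);
    return got == static_cast<ssize_t>(sizeof result) ? result : -1.0;
}
```

Calling `first_result_from_processes(10)` from `main` would return whichever child reported first. The trade-off is that inputs and results have to cross a process boundary, which is why a pipe is used here.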

Ted Lyngmo · Accepted Answer · 2018-12-11T17:10:53.117

If the data structure supplied to the black-box has some obvious start and end values, one way to make it finish early could be to change the end value while it's computing. It could of course cause all sorts of trouble if you've misunderstood how the black-box must work with the data, but if you are reasonably sure, it can work.

main spawns 100 outer threads that each spawn one inner thread that calls the blackbox. The inner thread receives the blackbox result and notifies all waiting threads that it's done. The outer thread waits for any inner thread to get done and then modifies the data for its own blackbox to trick it into finishing early.

No polling (except for the spurious wakeup loops) and no detached threads.

#include <iostream>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <vector>
#include <chrono>

// a work package for one black-box
struct data_for_back_box {
    int start_here;
    int end_here;
};

double blackbox(data_for_back_box* data) {
    // time consuming work here:
    for(auto v=data->start_here; v<data->end_here; ++v) {
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
    // just a debug
    if(data->end_here==0) std::cout << "I was tricked into exiting early\n";
    return data->end_here;
}

// synchronizing stuff and result
std::condition_variable cv;
std::mutex mtx;
bool done=false;
double result;

// a wrapper around the real blackbox
void inner(data_for_back_box* data) {
    double r = blackbox(data);

    // take the lock before reading `done`, which is shared between threads
    std::unique_lock<std::mutex> lock(mtx);
    if(done) return; // someone has already finished, skip this result

    // notify everyone that we're done
    result = r;
    done=true;
    cv.notify_all();
}

// context setup and wait for any inner wrapper
// to signal "done"
void outer(int n) {
    data_for_back_box data{0, 100+n*n};
    std::thread work(inner, &data);
    {
        std::unique_lock<std::mutex> lock(mtx);
        while( !done ) cv.wait(lock);
    }
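    // corrupt data for blackbox (it reads end_here concurrently, so this
    // relies on the blackbox tolerating an unsynchronized change):
    data.end_here = 0;
    // wait for this thread's blackbox to finish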
    work.join();
}

int main() {
    std::vector<std::thread> ths;

    // spawn 100 worker threads
    for(int i=0; i<100; ++i) {
        ths.emplace_back(outer, i);
    }

    double saved_result;
    {
        std::unique_lock<std::mutex> lock(mtx);
        while( !done ) cv.wait(lock);
        saved_result = result;
    } // release lock

    // join all threads
    std::cout << "got result, joining:\n";
    for(auto& th : ths) {
        th.join();
    }

    std::cout << "result: " << saved_result << "\n";
}

Just out of curiosity: Did you actually get this to work with the real black-box? :-) — Ted Lyngmo, Dec 12 '18 at 09:23
Thanks Ted. Yes, I managed to do it. Your solution is very close to the one I came up with. The black-box provided a way to tell it to stop not only with time-out condition but with a boolean condition. So I have a mutex flag that any thread can set to true if done, and then the condition to stop for all the threads' black box function is this flag to become true. — Phrixus, Dec 12 '18 at 09:29
@John Oh, nice, then it's actually clean and not risky at all. Great! — Ted Lyngmo, Dec 12 '18 at 09:31
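
The variant described in the comments above, where the black box itself accepts a stop condition, might look roughly like this sketch. The real library's stop hook is not shown in the thread, so `blackbox_with_stop`, `first_result`, and the fake workload are hypothetical stand-ins, and a `std::atomic<bool>` is assumed in place of the mutex-protected flag.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <future>
#include <thread>
#include <vector>

// Hypothetical stand-in for a black box that polls a stop condition.
// It "works" for (n + 1) * 20 ms unless `stop` becomes true first.
double blackbox_with_stop(int n, const std::atomic<bool>& stop) {
    for (int step = 0; step < (n + 1) * 2; ++step) {
        if (stop.load()) return -1.0;  // asked to give up early
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
    return n * 1.5;
}

// Start `n_threads` workers, return the first result produced, and ask the
// remaining workers to stop. All threads are joined before returning.
double first_result(int n_threads) {
    assert(n_threads > 0);
    std::atomic<bool> stop{false};
    std::promise<double> winner;
    std::future<double> fut = winner.get_future();

    std::vector<std::thread> workers;
    for (int i = 0; i < n_threads; ++i) {
        workers.emplace_back([i, &stop, &winner] {
            double r = blackbox_with_stop(i, stop);
            // exchange() returns the previous value, so only the first
            // finisher sees `false` and publishes its result
            if (!stop.exchange(true)) winner.set_value(r);
        });
    }

    double result = fut.get();         // blocks until the first worker finishes
    for (auto& t : workers) t.join();  // the rest exit once they observe `stop`
    return result;
}
```

Calling `first_result(10)` from `main` would give the behaviour asked for in the question. No thread is killed or detached, so the remaining workers only stop as quickly as the black box polls its flag.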
