I want to use std::async() to speed up a loop in my code. To do that, I wrapped the loop body in a function and want to make multiple async() calls so that several chunks of the loop run in parallel.

This snippet gives an idea of the function:

void foo(int start, int end, std::vector<int>& output) {
    // do something
    // write results to corresponding locations in output vector
}    

My problem is that when I create the tasks in a loop, async() does not seem to call the function passed to it immediately. So when I modify a variable passed to foo() and make the next call, the argument seen by the previous call may change as well.

Here is a brief example:

#include <future>
#include <vector>

int main(void) {
    int start = 0;
    int step = 10;
    auto output = std::vector<int>(100);                // an output vector of length 100
    auto ft_list = std::vector<std::future<void>>(10);  // one future per job, 10 jobs in parallel
    
    // create 10 jobs
    for (auto& ft : ft_list) {
        ft = std::async(std::launch::async, [&] { foo(start, start + step, output); });
        start += step;
    }
    // the block above does not execute as I intended:
    // it sometimes creates multiple calls to foo() with the same value of start
    // I guess this is because async sometimes calls foo() after start has been modified
    
    for (auto& ft : ft_list) {
        ft.wait();
    }
    
    return 0;
}

I tried some dirty hacks to make sure the variable is modified only after the previous job has started (by passing a reference to a flag variable). However, I feel there should be a better way. What is the proper way to parallelize this loop using async()?

TonyZYT
  • You definitely need to protect your `output` vector with a mutex. – AndyG Sep 29 '20 at 19:01
  • Also, consider what happens when you capture `start` and `stop` by *reference* instead of by value – AndyG Sep 29 '20 at 19:01
  • @AndyG As long as the threads are writing to different locations the output vector does not need a mutex. – Galik Sep 29 '20 at 19:05
  • @Galik: True. I retract my "definitely" and hereby replace it with "might". – AndyG Sep 29 '20 at 19:06
  • Should `start - step` be `start + step`? Also you may want to address potential over-shoot if the step does not divide equally into the number of elements. – Galik Sep 29 '20 at 19:08
  • Oh thanks. It should be start + step. I only pasted a simple demo here, so I left out the part that deals with uneven division. – TonyZYT Sep 29 '20 at 19:18
  • Please remember that threads/async are not a golden hammer. Using them introduces some overhead. If this overhead is comparable to the task being performed, you will not see any gain in time and may even observe performance degradation. Anyway, this is proof that wildcard capture is bad for lambdas: https://godbolt.org/z/hTGGzjPEj – Marek R Jun 23 '22 at 12:50

1 Answer

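I don't think you need to wrap the function call in a lambda. You can pass foo and its arguments to async directly. std::async copies its arguments at the point of the call, so each task keeps its own start value, and std::ref is what lets output be passed by reference. Like so -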

std::async(std::launch::async, foo, start, start + step, std::ref(output));
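
For reference, here is a minimal self-contained sketch of that direct-call form. The body of foo (writing i * i into each slot) and the helper name run_direct are placeholders of mine, not taken from the question, and the sketch assumes that step divides the vector length evenly.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>  // std::ref
#include <future>
#include <vector>

// Placeholder body (assumption): each task writes i * i into its own slice [start, end).
void foo(int start, int end, std::vector<int>& output) {
    for (int i = start; i < end; ++i) {
        output[static_cast<std::size_t>(i)] = i * i;
    }
}

// Hypothetical helper: splits [0, n) into chunks of `step` and runs foo on each chunk.
std::vector<int> run_direct(int n, int step) {
    assert(n >= 0 && step > 0 && n % step == 0);  // assumption: step divides n evenly
    std::vector<int> output(static_cast<std::size_t>(n));
    std::vector<std::future<void>> ft_list;

    for (int start = 0; start < n; start += step) {
        // start and start + step are copied when std::async is called, so later
        // changes to the loop variable cannot affect a task that was already created.
        // std::ref makes output arrive in foo as a reference rather than a copy.
        ft_list.push_back(std::async(std::launch::async, foo,
                                     start, start + step, std::ref(output)));
    }
    for (auto& ft : ft_list) {
        ft.get();  // waits for the task and rethrows any exception it threw
    }
    return output;
}
```

Calling run_direct(100, 10) corresponds to the 10 jobs over a 100-element vector from the question. Because every task writes to a disjoint slice, no mutex is used here.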

If you do need a lambda, change it so that the start and start + step values are passed to it as arguments, instead of being captured by reference and read only when the task actually runs. Something like this -

std::async(std::launch::async,
           [&output] (int foo_start, int foo_end) {
               foo(foo_start, foo_end, output);
           },
           start, start + step);
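
A fuller sketch of the lambda form, under some assumptions of mine: the body of foo is a placeholder that stamps each slot with the start value its task received (so the result shows which task wrote where), run_with_lambda is a hypothetical helper name, and the last chunk is clamped with std::min to cover the overshoot case raised in the comments.

```cpp
#include <algorithm>  // std::min
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Placeholder body (assumption): mark every slot in [start, end) with the task's start value.
void foo(int start, int end, std::vector<int>& output) {
    for (int i = start; i < end; ++i) {
        output[static_cast<std::size_t>(i)] = start;
    }
}

// Hypothetical helper: runs foo over [0, n) in chunks of at most `step` elements.
std::vector<int> run_with_lambda(int n, int step) {
    assert(n >= 0 && step > 0);
    std::vector<int> output(static_cast<std::size_t>(n), -1);
    std::vector<std::future<void>> ft_list;

    for (int start = 0; start < n; start += step) {
        const int end = std::min(start + step, n);  // clamp the final chunk to n
        // Only output is captured (by reference). The range bounds travel as
        // arguments, which std::async copies at the time of this call.
        ft_list.push_back(std::async(
            std::launch::async,
            [&output](int foo_start, int foo_end) { foo(foo_start, foo_end, output); },
            start, end));
    }
    for (auto& ft : ft_list) {
        ft.get();  // waits for the task and rethrows any exception it threw
    }
    return output;
}
```

If two tasks had seen the same start value, as in the original capture-by-reference version, some slots would carry the wrong stamp or stay at -1.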
CodePro_NotYet
  • Should be: `std::async(std::launch::async, foo, start, start + step, std::ref(output));`. Anyway `output` is a problem since this is shared state and some synchronization is required. – Marek R Jun 23 '22 at 12:30
  • Synchronization will not be required in output if all threads are working on separate parts, which appears to be the case from the arguments OP is giving to foo. – CodePro_NotYet Jun 24 '22 at 13:35