1

I'm trying to implement the simple threadpool with the as low overhead as possible (preferably at C++14).

The general idea is to "pack" tasks to lambda without arguments (but with capture list), and provide a single public function to add tasks to threadpool.

Let's consider the following simple (and ugly) example, illustrating the problem (it's not the threadpool itself, but the simplified code snippet with helper class VerboseStr).

#include <iostream>
#include <string>
#include <functional>
#include <thread>

//  Small helper macros
#define T_OUT(str) std::cout << Ident() \
  << "[" << std::to_string(id_) << "] " \
  << str << std::endl; \

#define T_OUT_FROMTO(str) std::cout << Ident() \
  << "[" << std::to_string(i.id_) << "->" << std::to_string(id_) << "] " \
  << str << std::endl; \

#define T_SLEEP(ms) std::this_thread::sleep_for(std::chrono::milliseconds(ms));

//  Just a verbose class, holding std::string inside
/////////////////////////////////////////////////////////
class VerboseStr
{
  std::string val_;
  int id_;

  static int GetId() { static int id = 0; return ++id; }

  //  Several spaces to ident lines
  std::string Ident() const { return std::string(id_, ' '); }
public:
  VerboseStr() : id_(GetId())
  {
    T_OUT("Default constructor called");
  };

  ~VerboseStr()
  {
    val_ = "~Destroyed!";
    T_OUT("Destructor called");
  }

  VerboseStr(const std::string& i) : val_(i), id_(GetId())
  {
    T_OUT("Create constructor called");
  };

  VerboseStr(const VerboseStr& i) : val_(i.val_), id_(GetId())
  {
    val_ = i.val_;
    T_OUT_FROMTO("Copy constructor called");
  };

  VerboseStr(VerboseStr&& i) noexcept : val_(std::move(i.val_)), id_(GetId())
  {
    T_OUT_FROMTO("Move constructor called");
  };

  VerboseStr& operator=(const VerboseStr& i)
  { 
    val_ = i.val_;
    T_OUT_FROMTO("Copy operator= called");
    return *this;
  }

  VerboseStr& operator=(VerboseStr&& i) noexcept
  { 
    val_ = std::move(i.val_);
    T_OUT_FROMTO("Move operator= called");
    return *this;
  }

  const std::string ToStr() const { return std::string("[") + std::to_string(id_) + "] " + val_; }
  void SetStr(const std::string& val) { val_ = val; }
};
/////////////////////////////////////////////////////////

//  Capturing args by VALUES in lambda
template<typename Fn, typename... Args>
void RunAsync_V(Fn&& func, Args&&... args)
{
  auto t = std::thread([func_ = std::forward<Fn>(func), args...]()
  {    
    T_SLEEP(1000);  //  "Guarantees" async execution
    func_(args...);
  });
  t.detach();
}

void DealWithVal(VerboseStr str)
{
  std::cout << "Str copy: " << str.ToStr() << std::endl;
}

void DealWithRef(VerboseStr& str)
{
  std::cout << "Str before change: " << str.ToStr() << std::endl;
  str.SetStr("Changed");
  std::cout << "Str after change: " << str.ToStr() << std::endl;
}

// It's "OK", but leads to 2 calls of copy constructor
//  Replacing 'str' with 'std::move(str)' leads to no changes
void Test1()
{
  VerboseStr str("First example");

  RunAsync_V(&DealWithVal, str);
}

//  It's OK
void Test2()
{
  VerboseStr str("Second example");

  RunAsync_V(&DealWithRef, std::ref(str));

  //  Waiting for thread to complete...
  T_SLEEP(1500);

  //  Output the changed value of str
  std::cout << "Checking str finally: " << str.ToStr() << std::endl;
}

int main()
{
  Test1();
//  Test2();

  T_SLEEP(3000);  //  Give a time to finish
}

As said in comment above, the problem is in Test1() function.

It is obvious that in the context of Test1(), the only possible way to asynchronously call a function DealWithVal is to "move" str to a lambda body.

When Test1() called from main(), the output is following:

 [1] Create constructor called
  [1->2] Copy constructor called
   [2->3] Move constructor called
  [2] Destructor called
 [1] Destructor called
    [3->4] Copy constructor called
Str copy: [4] First example
    [4] Destructor called
   [3] Destructor called

As we can see, there are 2 calls of copy constructor.

I cannot realize how to implement this, considering that the passing by value (without moving) and by reference (take a look at Test2()) should be available as well.

Please help sort out the problem. Thanks in advance.

Ukridge
  • 25
  • 3
  • Why those MACRO? :/ – Jarod42 Dec 01 '20 at 10:19
  • @Jarod42 , sorry, yes, I should have created functions instead. Although in the context of the issue under discussion, this probably does not matter... – Ukridge Dec 01 '20 at 10:28
  • Are you looking for [perfectly-capturing-a-perfect-forwarder-universal-reference-in-a-lambda](https://stackoverflow.com/questions/31410209/perfectly-capturing-a-perfect-forwarder-universal-reference-in-a-lambda/31410880#31410880) – Jarod42 Dec 01 '20 at 10:36
  • @Jarod42 , thanks, it's very interesting. But can it be used with parameter pack in capture clause? – Ukridge Dec 01 '20 at 13:29

2 Answers2

1

You might capture argument as std::tuple, as follow:

//  Passing args by VALUE, Capturing moved args
template<typename Fn, typename... Args>
void RunAsync_V2(Fn&& func, Args... args)
{
    auto t = std::thread([func_ = std::forward<Fn>(func), tup = std::tuple<Args...>(std::move(args)...)]() mutable
    {
        T_SLEEP(1000);

        std::apply([&](auto&&...args){func_(std::move(args)...);}, std::move(tup));
    });
    t.detach();
}

Demo

No copies, only move is done:

[1] Create constructor called
  [1->2] Copy constructor called
   [2->3] Move constructor called
    [3->4] Move constructor called
   [3] Destructor called
  [2] Destructor called
 [1] Destructor called
     [4->5] Move constructor called
Str copy: [5] First example
     [5] Destructor called
    [4] Destructor called
Jarod42
  • 203,559
  • 14
  • 181
  • 302
  • It seems that in this case we lose opportunity to pass parameters by value. I've noticed that you have changed the Test1() code. But if we return to my version (without std::move and without sleeping in Test1() scope) - the test fails, because std::is_lvalue_reference::value is true. – Ukridge Dec 02 '20 at 07:51
  • 1
    Not clear for me lifetime you expect from lvalue/rvalue. – Jarod42 Dec 02 '20 at 08:29
  • for example, in Test1() (initial version) the `str` would be destroyed right after call to `RunAsync_V`, so passing `str` by reference doesn't make sence at all. – Ukridge Dec 02 '20 at 08:52
  • 1
    Ok, changed so lambda owns its captures. – Jarod42 Dec 02 '20 at 09:51
1

Your two copies are coming from:

auto t = std::thread([func_ = std::forward<Fn>(func), args...]()
                                                   // ^^^^^^^ here
{    
  T_SLEEP(1000);  //  "Guarantees" async execution
  func_(args...);
     // ^^^^^^^ and here

You copy into the lambda, which is good, and which is all you can do since you were passed an lvalue reference. Then you copy the args from the lambda into the by-value function. But by design, you own the args now and the lambda has no further purpose, so you should move them from the lambda. i.e.:

auto t = std::thread([func_ = std::forward<Fn>(func), args...]() mutable
                                                              // ^^^^^^^
{    
  T_SLEEP(1000);  //  "Guarantees" async execution
  func_(std::move(args)...);
     // ^^^^^^^^^^^^^^^

This reduces you to the one necessary copy.


The other answer implicitly wraps passed lvalues in reference_wrappers to get zero copies. The caller is required to maintain the value for the async lifetime, but there is no explicit record of that at the call-site. It runs counter to expectations from analogous functionality (e.g. std::thread applies a decay_copy and requires the caller to wrap in a reference wrapper if that's what they want).

Jeff Garrett
  • 5,863
  • 1
  • 13
  • 12
  • Gattet, thank you for parcipiation! Your version gave me new thoughts. Please, take look at the following - [link](http://coliru.stacked-crooked.com/a/516b5e90c19bd5ae). At this time, I'm trying to simplify solution - the RunAsync_V2, which accepts parameters by VALUE itself (which, in particular, forces caller to use std::ref when appropriate, but it's OK for my case). But I cannot realize, how to "move" parameter pack in lambda capture in C++14. The description is in comments in `Test1()` – Ukridge Dec 02 '20 at 08:42
  • Sorry, I typed your name incorrectly in the previous comment :( PS. If "moving" parameter pack is not possible in C++14 at all - I'll prefer to provide several versions of `RunAcync` in my threadpool, allowing to pass 1,2,3... parameters. But of course, it's always better to have more general solution. – Ukridge Dec 02 '20 at 08:57
  • 1
    *Ideally*, what you want to write would have `RunAsync` take forwarding references `Args&&... args`, so as to keep references as long as possible, and you would initialize the lambda captures by-value with forwarding `[func_ = std::forward(func), ...args_ = std::forward(args)]() mutable` resulting in a move into the lambda when possible, and copy if not. Then in the lambda you'd move into the call `std::move(func_)(std::move(args_)...)` (or even better `std::invoke`). You **can** write this, but in C++20. :( – Jeff Garrett Dec 05 '20 at 23:29
  • 1
    The variadic lambda initializer is what is not available in C++14. You can however write basically the same thing by smuggling through a tuple: `[func_ = std::forward(func), args_ = std::tuple(std::forward(args)...)]() mutable`. This is very close to the updated answer from Jarod42 (which has Run taking by value, and moving into the lambda, resulting in an additional move relative to taking by reference and forwarding into the lambda -- a minor difference). – Jeff Garrett Dec 05 '20 at 23:29
  • 1
    Then of course, inside the lambda, you need something like `std::apply` which is in C++17: `apply(std::move(func_), std::move(args_))` (I don't think I'd bother trying to emulate `std::invoke` before C++17, but you could depending on your needs). I'm not sure why Jarod42 wraps the call inside `apply` with a lambda. For implementing `apply`, you can copy the example implementation from cppreference,for example (replace `std::invoke` with a call expression). Happy coding! – Jeff Garrett Dec 05 '20 at 23:29