Moving functors into std::function's while avoiding copies

Question

I am trying to move a functor into a lambda inside an object, like this:

#include <functional>
#include <iostream>

#include "boost/stacktrace.hpp"

#define fwd(o) std::forward<decltype(o)>(o)

struct CopyCounter {
  CopyCounter() noexcept = default;
  CopyCounter(const CopyCounter &) noexcept {
    std::cout << "Copied at" << boost::stacktrace::stacktrace() << std::endl;
    counter++;
  }
  CopyCounter(CopyCounter &&) noexcept = default;

  CopyCounter &operator=(CopyCounter &&) noexcept = default;
  CopyCounter &operator=(const CopyCounter &) noexcept {
    std::cout << "Copied at " << boost::stacktrace::stacktrace() << std::endl;
    counter++;
    return *this;
  }

  inline static size_t counter = 0;
};

struct Argument : CopyCounter {};

struct Functor : CopyCounter {
  int operator()(Argument) { return 42; }
};

template <class Result>
class Invoker {
  std::function<void()> invoke_;
  Result* result_ = nullptr;

  template <class Functor, class... Args>
  Invoker(Functor&& f, Args&&... args) {
    if constexpr (std::is_same_v<Result, void>) {
      invoke_ = [this, f = fwd(f), ... args = fwd(args)]() mutable {
        f(fwd(args)...);
      };
    } else {
      invoke_ = [this, f = fwd(f), ...args = fwd(args)]() mutable { 
        result_ = new Result(f(fwd(args)...));
      };
    }
  }
  template <class Functor, class... Args>
  friend auto make_invoker(Functor&& f, Args&&... args);

public:
  ~Invoker() {
    if (result_) delete result_;
  }
};

template <class Functor, class... Args>
auto make_invoker(Functor&& f, Args&&... args) {
  return Invoker<decltype(f(args...))>(fwd(f), fwd(args)...);
}

int main() {
  Functor f;
  Argument a;
  auto i = make_invoker(std::move(f), std::move(a));
  assert(CopyCounter::counter == 0);
  return 0;
}

Somewhat surprisingly, the last assert fails on libc++, but not libstdc++. The stacktrace hints at the two copies that are performed:

Copied at  0# CopyCounter at /usr/include/boost/stacktrace/stacktrace.hpp:?
 1# 0x00000000004C812E at ./src/csc_cpp/move_functors.cpp:38
 2# std::__1::__function::__value_func<void ()>::swap(std::__1::__function::__value_func<void ()>&) at /usr/lib/llvm-10/bin/../include/c++/v1/functional:?
 3# ~__value_func at /usr/lib/llvm-10/bin/../include/c++/v1/functional:1825
 4# __libc_start_main in /lib/x86_64-linux-gnu/libc.so.6
 5# _start in ./bin/./src/csc_cpp/move_functors

Copied at  0# CopyCounter at /usr/include/boost/stacktrace/stacktrace.hpp:?
 1# std::__1::__function::__value_func<void ()>::swap(std::__1::__function::__value_func<void ()>&) at /usr/lib/llvm-10/bin/../include/c++/v1/functional:?
 2# ~__value_func at /usr/lib/llvm-10/bin/../include/c++/v1/functional:1825
 3# __libc_start_main in /lib/x86_64-linux-gnu/libc.so.6
 4# _start in ./bin/./src/csc_cpp/move_functors

It seems like inside the library the functor and the argument get copied in swap, during the move-assignment of invoke_. There are two questions:

Why is this the desired behaviour and what can be the motivation behind this design solution?
What is a good way to update the code to reach the same semantics as in libstdc++?

Note that I've sorta-separated Boost.Stacktrace into [an independent library](https://github.com/eyalroz/stacktrace), no Boost required. Also, if you were working in C++23, you could use [`std::stacktrace](https://en.cppreference.com/w/cpp/utility/basic_stacktrace). — einpoklum, May 21 '22 at 20:00
Why aren't you using an initializer list in the constructor of Invoker? — einpoklum, May 21 '22 at 20:05
Why roll your own functor class in the first place if you are already using lambdas? — Karl Knechtel, May 21 '22 at 20:17

score 4 · Answer 1 · answered May 22 '22 at 01:14

libstdc++ and libc++ use different small-object optimization strategies.

In libc++, the callable is stored locally if the following hold:

the callable fits into the local buffer;
the callable is nothrow copy constructible; and
the allocator is nothrow copy constructible.

(Note: allocator support was removed from the standard in C++17. However, practically speaking, it can't be removed from standard library implementations, at least not for a very long time, without breaking existing code.)

In libstdc++, the callable is stored locally if the following hold:

the callable fits into the local buffer;
the callable is of trivially copyable type.

(I am glossing over the issue of alignment, as it's not particularly relevant here.)

libstdc++'s choice of when to use the small-object optimization implies that wherever it does apply, the callable can be copied simply by copying the array that provides storage for it. But the Functor and Argument types in your code are not trivially copyable, so the lambda closure type is not, either. Storage for the lambda is allocated out-of-line and the owned callable is move-constructed from the lambda. It never needs to be copied or moved thereafter.

On the other hand, libc++ stores your callable in the small object buffer. During the assignment to invoke_, it must swap a std::function containing the closure object in the small object buffer with a default-constructed std::function. That means the callable must be moved from one std::function instance's small object buffer to the other instance's small object buffer. libc++ accomplishes this move by copying the callable and then destroying the source of the copy.

Why does libc++ not use the move constructor? Apparently, it's a bug. See LLVM bug 33125, which is about the move constructor and not the swap function, but the same principles apply. The move constructor of std::function uses the copy constructor, not the move constructor, when the source has stored the callable in the small object buffer. The reason for this is that the type erasure interface that is used by libc++ to know how to handle the callable while forgetting its type is a class called __base that has virtual functions for copying the callable and virtual functions for destroying the callable, but doesn't have any for moving it---and apparently, adding move support would break ABI, so it can't be done until further notice.

Note that this is only a bug in the sense that it has suboptimal performance. It is not a bug in the sense of failing to conform to the standard. Both libstdc++ and libc++ have valid implementation strategies. The standard does not say how many times the callable is allowed to be copied.

The best thing to do would be to update your code so that its correctness doesn't depend on the number of times the callable is copied. If you really can't bear the cost of the copy but need to build with libc++, there are some other strategies such as:

padding the callable (by capturing a dummy object of sufficient size) so that it doesn't fit into the small object buffer;
making the callable's copy constructor noexcept(false) (by capturing an object with a noexcept(false) copy constructor) so that it won't go into the small object bufer; or
storing in the std::function a reference-semantic wrapper class that owns the actual callable a std::shared_ptr to the actual callable type and an operator() that forwards to the actual callable object.

score 2 · Answer 2 · answered May 21 '22 at 20:13

2

A partial answer:

If you initialize invoke_ using an initialization list:

class Invoker {
  std::function<void()> invoke_;
  int result_;

public:
  template <class Functor, class... Args>
  Invoker(Functor&& f, Args&&... args) :
    invoke_([this, f = fwd(f), ...args = fwd(args)]() mutable { 
      result_ = f(fwd(args)...);
    }) { }
};

the assertion does not fail. So, I'm guessing the swap is between (fields of) the default-initialized invoke_ and the functor you're assigning to invoke_.

answered May 21 '22 at 20:13

einpoklum

118,144
57
340
684

Thank you, this works well if you do not use an `if constexpr` in the body around this assignment (which is, sadly, my case). Maybe one can still go around this with some sfinae – Mike Land May 22 '22 at 05:41
@MikeLand: I'm not sure I follow... "the body" of what? "this assignment" you mean the assignment to the result? What condition do you want to impose on the assignment? – einpoklum May 22 '22 at 08:09
I've updated the question to reflect what I mean – Mike Land May 22 '22 at 10:20
@MikeLand I would suggest delegating the initialization to another constructor and using tag dispatch to distinguish between the `void` and non-`void` cases. – Brian Bi May 22 '22 at 18:34
@MikeLand: I believe it is a bit inappropriate to change the question after the fact, invalidating my answer. That said - you could do as Brian suggested; or you could perhaps put the `if constexpr` inside the lambda's body. – einpoklum May 22 '22 at 19:47

Moving functors into std::function's while avoiding copies

2 Answers2

Linked