11

std::experimental::apply has the following signature:

template <class F, class Tuple>
constexpr decltype(auto) apply(F&& f, Tuple&& t);

It basically invokes f by expanding t's elements as the arguments.


I would like something that does the exact same thing, but with multiple tuples at the same time:

template <class F, class... Tuples>
constexpr decltype(auto) multi_apply(F&& f, Tuples&&... ts);

Example usage:

std::tuple t0{1, 2, 3};
std::tuple t1{4, 5, 6};
auto sum = [](auto... xs){ return (0 + ... + xs); };

assert(multi_apply(sum, t0, t1) == 1 + 2 + 3 + 4 + 5 + 6);

I can think of various naive way of implementing multi_apply:

  • Use std::tuple_cat and then calling std::experimental::apply.

  • Use recursion to bind every tuple's arguments to a series of lambdas that eventually call the original function.

But what I'm asking is: how can I implement multi_apply without resorting to std::tuple_cat or recursion?

What I ideally would like to do is: generate an std::index_sequence for every tuple and match every tuple with its own index sequence in the same variadic expansion. Is this possible? Example:

// pseudocode-ish
template <class F, std::size_t... Idxs, class... Tuples>
constexpr decltype(auto) multi_apply_helper(
    F&& f, std::index_sequence<Idxs>... seqs,  Tuples&&... ts)
{
    return f(std::get<Idxs>(ts)...);
} 
Vittorio Romeo
  • 90,666
  • 33
  • 258
  • 416
  • 4
    What's the problem with `std::tuple_cat`? – bennofs Jan 28 '17 at 16:52
  • 1
    You could build your own versions of tuple_size and get that work recursively, then the implementation of multi_apply is essentially the same as apply. – Marc Glisse Jan 28 '17 at 17:38
  • are you trying to optimize compile duration? are your tuples of varying sizes? – m.s. Jan 28 '17 at 18:02
  • 2
    If this is about what I think it's about, `tuple_cat` is just fine as long as you use it on tuples of references. – T.C. Jan 28 '17 at 22:40
  • 1
    This question isn't very useful unless we know what you think is wrong with tuple_cat, as a valid answer could just literally reimplememt tuple_cat with a different name. Also you've put multiple questions into a single question, which is frowned upon. – xaxxon Jan 28 '17 at 23:27

2 Answers2

12

Here's my take on it. It doesn't use recursion and it expands those tuples in the same pack expansion, but it requires a bit of preparation:

  • We build a tuple of references to the tuples passed in, rvalue references for rvalue arguments, lvalue references for lvalue arguments, in order to have proper forwarding in the final call (exactly what std::forward_as_tuple does, as T.C. noted in the comments). The tuple is built and passed around as an rvalue, so reference collapsing ensures correct value categories for each argument in the final call to f.
  • We build two flattened index sequences, both of size equal to the sum of all tuple sizes:
    • The outer indices select the tuple, so they repeat the same value (the tuple's index in the tuple pack) a number of times equal to the size of each tuple.
    • The inner ones select the element in each tuple, so they increase from 0 to one less than the tuple size for each tuple.

Once we have that in place, we just expand both index sequences in the call to f.

#include <tuple>
#include <array>
#include <cstddef>
#include <utility>
#include <type_traits>
#include <iostream>

template<std::size_t S, class... Ts> constexpr auto make_indices()
{
   constexpr std::size_t sizes[] = {std::tuple_size_v<std::remove_reference_t<Ts>>...};
   using arr_t = std::array<std::size_t, S>;
   std::pair<arr_t, arr_t> ret{};
   for(std::size_t c = 0, i = 0; i < sizeof...(Ts); ++i)
      for(std::size_t j = 0; j < sizes[i]; ++j, ++c)
      {
         ret.first[c] = i;
         ret.second[c] = j;
      }
   return ret;
}

template<class F, class... Tuples, std::size_t... OuterIs, std::size_t... InnerIs> 
constexpr decltype(auto) multi_apply_imp_2(std::index_sequence<OuterIs...>, std::index_sequence<InnerIs...>, 
                                           F&& f, std::tuple<Tuples...>&& t)
{
   return std::forward<F>(f)(std::get<InnerIs>(std::get<OuterIs>(std::move(t)))...);
}

template<class F, class... Tuples, std::size_t... Is> 
constexpr decltype(auto) multi_apply_imp_1(std::index_sequence<Is...>, 
                                           F&& f, std::tuple<Tuples...>&& t)
{
   constexpr auto indices = make_indices<sizeof...(Is), Tuples...>();
   return multi_apply_imp_2(std::index_sequence<indices.first[Is]...>{}, std::index_sequence<indices.second[Is]...>{},
      std::forward<F>(f), std::move(t));
}

template<class F, class... Tuples> 
constexpr decltype(auto) multi_apply(F&& f, Tuples&&... ts)
{
   constexpr std::size_t flat_s = (0U + ... + std::tuple_size_v<std::remove_reference_t<Tuples>>);
   if constexpr(flat_s != 0)
      return multi_apply_imp_1(std::make_index_sequence<flat_s>{}, 
         std::forward<F>(f), std::forward_as_tuple(std::forward<Tuples>(ts)...));
   else
      return std::forward<F>(f)();
}

int main()
{
   auto t0 = std::make_tuple(1, 2);
   auto t1 = std::make_tuple(3, 6, 4, 5);
   auto sum = [](auto... xs) { return (0 + ... + xs); };

   std::cout << multi_apply(sum, t0, t1, std::make_tuple(7)) << '\n';
}

It compiles on the trunk versions of Clang and GCC in C++1z mode. In terms of generated code, GCC with -O2 optimizes the call to multi_apply to a constant 28.


Replacing std::array with a built-in array inside make_indices by using arr_t = std::size_t[S]; makes it compile on Clang 3.9.1 (that version of libc++ lacks constexpr on std::array's operator[]).

Further replacing std::tuple_size_v with std::tuple_size<X>::value and removing the if constexpr test in multi_apply makes it compile on GCC 6.3.0. (The test handles the cases when no tuples are passed in or all tuples passed in are empty.)

Further replacing the uses of fold expressions with calls like

sum_array({std::tuple_size_v<std::remove_reference_t<Tuples>>...})

where sum_array can be something simple like

template<class T, std::size_t S> constexpr T sum_array(const T (& a)[S], std::size_t i = 0)
{
   return i < S ? a[i] + sum_array(a, i + 1) : 0;
}

makes it compile on the latest MSVC 2017 RC (MSVC actually has std::tuple_size_v, but it needs the other changes). The generated code is still great: after replacing the body of the sum lambda with sum_array({xs...}), the resulting code is a direct call to sum_array with the array built in-place directly from the elements of all tuples, so the multi_apply machinery doesn't introduce any run time overhead.


std::apply is defined in terms of INVOKE, so, to keep things consistent, the final call to f should be

std::invoke(std::forward<F>(f), std::get<InnerIs>(std::get<OuterIs>(std::move(t)))...)

Implementations may provide a noexcept-specifier on std::apply (at least, libc++ does; libstdc++ and MSVC currently don't) so that may be worth considering too.

bogdan
  • 9,229
  • 2
  • 33
  • 48
  • `std::tuple{std::forward(ts)...}` this is just `forward_as_tuple`, no? – T.C. Jan 28 '17 at 23:06
  • 2
    @T.C. Absolutely right. Just look at my spankin' new reinvented wheel... it's just like the other one, but it's *mine*. – bogdan Jan 28 '17 at 23:24
  • Great answer. Exactly what I hoped to see! :) – Vittorio Romeo Jan 29 '17 at 12:07
  • @VittorioRomeo Cheers! I've added some implementation notes at the end. – bogdan Jan 29 '17 at 12:52
  • @bogdan: unfortunately [clang++3.9.1 does not seem to like the `make_indices`](http://melpon.org/wandbox/permlink/dWWS2lTH8CcVCznY) function. It does work with clang SVN though. – Vittorio Romeo Jan 29 '17 at 14:18
  • 1
    @VittorioRomeo It's due to that version of libc++ lacking `constexpr` on `std::array`'s `operator[]`. You can replace `std::array` with a built-in array inside `make_indices` by `using arr_t = std::size_t[S];`. Actually, I think I'll add this to the answer. – bogdan Jan 29 '17 at 14:48
  • @VittorioRomeo Done, with additional solutions for GCC 6.3.0 and MSVC 2017 RC. – bogdan Jan 29 '17 at 16:18
2

Alternative version:

template <class F, std::size_t... Is, class ... Ts>
constexpr decltype(auto) multiple_apply_impl(F&& f, std::index_sequence<Is...>, Ts&&... ts)
{
    constexpr auto p = [](){
        constexpr auto total_size = sizeof...(Is);
        std::array<std::size_t, total_size> outer{};
        std::array<std::size_t, total_size> inner{};
        std::size_t global_index = 0;
        std::size_t outer_value = 0;

        [[maybe_unused]] auto l = [&](std::size_t size)
        {
            for (std::size_t i = 0; i != size; ++i) {
                outer[global_index] = outer_value;
                inner[global_index] = i;
                ++global_index;
            }
            ++outer_value;
        };
        (l(std::tuple_size<std::decay_t<Ts>>::value), ...);

        return make_pair(outer, inner);
    }();
    [[maybe_unused]] constexpr auto outer = p.first;
    [[maybe_unused]] constexpr auto inner = p.second;

    using std::get;
    return std::invoke(std::forward<F>(f),
                       get<inner[Is]>(get<outer[Is]>(std::forward_as_tuple(std::forward<Ts>(ts)...)))...);
}

template <class F, class ... Ts>
constexpr decltype(auto) multiple_apply(F&& f, Ts&&... ts)
{
    constexpr auto total_size = (std::size_t{0} + ... + std::tuple_size<std::decay_t<Ts>>::value);

    return multiple_apply_impl(std::forward<F>(f),
                               std::make_index_sequence<total_size>(),
                               std::forward<Ts>(ts)...);
}

Demo

Jarod42
  • 203,559
  • 14
  • 181
  • 302