Consider the following function accept that takes a "universal reference" of type T&& and forwards it to a parse<T>() function object with one overload for lvalues and one for rvalues:

template<class T>
void accept(T&& arg)
{
    parse<T>()(std::forward<T>(arg), 0); // copy or move, depending on rvaluedness of arg
}

template<class T>
class parse
{
public:
    // parse will modify a local copy or move of its input parameter
    void operator()(T const& arg, int n) const { /* optimized for lvalues */ }
    void operator()(T&& arg     , int n) const { /* optimized for rvalues */ }
};

Since perfectly forwarding an rvalue may move from it, leaving the source object in a valid but unspecified state, it is impossible to perfectly forward the same argument again within the same scope. Below is my attempt to have as few copies as possible in a hypothetical split() function that takes an int representing the number of passes that have to be made over the input data:

template<class T>
void split(T&& arg, int n)
{
    for (auto i = 0; i < n - 1; ++i)
        parse<T>()(arg , i);                 // copy n-1 times
    parse<T>()(std::forward<T>(arg), n - 1); // possibly move the n-th time
}

Question: is this the recommended way to apply perfect forwarding for multiple passes over the same data? If not, what is a more idiomatic way to minimize the number of copies?

TemplateRex
  • What do you need the instance of `T` for, and does `parse` *consume* it? Can you for example `move()` the instance back on return from `parse()`? (my point being, if `parse()` consumes it, then you can only forward to it once, else, forward and move back, or better yet, always deal with the const reference...) – Nim Nov 25 '13 at 14:51
  • @Nim no, `parse()` will modify `arg` and then discard the result, maybe writing some output or return an integer value, but not `arg` itself. – TemplateRex Nov 25 '13 at 14:53
  • mmm, interesting, you modify `arg` which is passed via *const reference*? – Nim Nov 25 '13 at 14:54
  • The question is misleading, you don't want *perfect forwarding* multiple times – David Rodríguez - dribeas Nov 25 '13 at 14:56
  • @Nim no, it will make a local copy of course, and modify the copy. With the `T&&` overload, I can modify on `std::move(arg)` rather than on a local copy. – TemplateRex Nov 25 '13 at 14:56
  • @DavidRodríguez-dribeas sorry, modified to better cover the topic. – TemplateRex Nov 25 '13 at 14:57
  • @TemplateRex, then what you have is the best you are going to get; on the last call, a copy is elided because of the forward - and you can only elide this copy... – Nim Nov 25 '13 at 15:00
  • Difficult to make any further suggestions unless we get a feel for what `T` is, you could for example extract out the state you modify and see if you can make copying of this cheap, but it's hard to say... – Nim Nov 25 '13 at 15:03
  • @Nim for my current use cases, `T` is a `std::array`, so no optimized move constructor, but the `parse()` function is generic so I want to be able to pass say a `std::vector` as well. – TemplateRex Nov 25 '13 at 15:06
  • There is nothing you can do but copy. You're trying to make many values out of a single value. What other techniques are there for that? – Xeo Nov 25 '13 at 15:08
  • @Xeo I'm optimizing a piece of working code and haven't actually got much experience with perfect forwarding, so I wanted to make sure (as [STL put it](http://view.officeapps.live.com/op/view.aspx?src=http%3a%2f%2fvideo.ch9.ms%2fsessions%2fgonat%2f2013%2fSTLGN13Compiler.pptx): "Write code only when you know how it'll behave [...] especially true for rvalue references. If you don't know, ask an expert.") – TemplateRex Nov 25 '13 at 15:26
  • If you pass a non-`const` lvalue to `split()`, then `T` will deduce to an lvalue reference type and the compiler will refuse to overload both `operator()(T const&)` and `operator()(T&&)`, since `T&&` collapses to `&` when `T` is an lvalue reference ([Example at Coliru](http://coliru.stacked-crooked.com/a/e6c46b1a219b2943)). I think you want [`parse<typename std::decay<T>::type>` instead of `parse<T>`](http://coliru.stacked-crooked.com/a/b39100c36a53dc8e). – Casey Nov 25 '13 at 16:03
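
The collision Casey describes can be checked at compile time. This is only a minimal sketch: `overloads_collide` is a hypothetical helper, not part of the original code, and it compares the two parameter types that `parse<T>` would declare.

```cpp
#include <type_traits>

// Hypothetical helper (not in the original post): true when the parameter
// types of the two parse<T>::operator() overloads are the same type.
template<class T>
constexpr bool overloads_collide()
{
    return std::is_same<T const&, T&&>::value;
}

// T is a plain object type: int const& and int&& are distinct, so overloading works.
static_assert(!overloads_collide<int>(), "distinct overloads expected");

// T is an lvalue reference: the const is ignored and both types collapse to int&.
static_assert(overloads_collide<int&>(), "both parameters collapse to int&");

// std::decay strips the reference again, which is what the suggested fix relies on.
static_assert(std::is_same<std::decay<int&>::type, int>::value, "decay yields int");
```

If the assertions hold, the file compiles silently; the second one is the case that makes `parse<T&>` ill-formed with the primary template.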

2 Answers

Question: is this the recommended way to apply perfect forwarding for multiple passes over the same data?

Yes, this is the recommended way to apply perfect forwarding (or move) when you need to pass the data multiple times. Only (potentially) move from it on your last access. Indeed, this scenario was foreseen in the original move paper, and is the very reason that "named" variables declared with type rvalue-reference are not implicitly moved from. From N1377:

Even though named rvalue references can bind to an rvalue, they are treated as lvalues when used. For example:

struct A {};

void h(const A&);
void h(A&&);

void g(const A&);
void g(A&&);

void f(A&& a)
{
    g(a);  // calls g(const A&)
    h(a);  // calls h(const A&)
}

Although an rvalue can bind to the "a" parameter of f(), once bound, a is now treated as an lvalue. In particular, calls to the overloaded functions g() and h() resolve to the const A& (lvalue) overloads. Treating "a" as an rvalue within f would lead to error prone code: First the "move version" of g() would be called, which would likely pilfer "a", and then the pilfered "a" would be sent to the move overload of h().

If you want h(a) to move in the above example, you have to do so explicitly:

    h(std::move(a));  // calls h(A&&);
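
To see the rule from the paper in action, here is a small sketch in which each overload reports which one was chosen. The string return values and the `check_f` helper are assumptions added for illustration; they are not in N1377.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct A {};

// Each overload reports which one overload resolution selected.
std::string g(const A&) { return "g(const A&)"; }
std::string g(A&&)      { return "g(A&&)"; }

// a can only bind to an rvalue, but the name a is itself an lvalue expression.
std::pair<std::string, std::string> f(A&& a)
{
    std::string first  = g(a);            // lvalue: selects g(const A&)
    std::string second = g(std::move(a)); // explicit cast to rvalue: selects g(A&&)
    return std::make_pair(first, second);
}

// Hypothetical self-check of the two calls above.
void check_f()
{
    std::pair<std::string, std::string> r = f(A{});
    assert(r.first  == "g(const A&)");
    assert(r.second == "g(A&&)");
}
```

Calling `check_f()` from `main` should pass both assertions: the named parameter is treated as an lvalue until it is cast explicitly.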

As Casey points out in the comments, you have an overloading problem when passing in lvalues:

#include <utility>
#include <type_traits>

template<class T>
class parse
{
    static_assert(!std::is_lvalue_reference<T>::value,
                               "parse: T can not be an lvalue-reference type");
public:
    // parse will modify a local copy or move of its input parameter
    void operator()(T const& arg, int n) const { /* optimized for lvalues */ }
    void operator()(T&& arg     , int n) const { /* optimized for rvalues */ }
};

template<class T>
void split(T&& arg, int n)
{
    typedef typename std::decay<T>::type Td;
    for (auto i = 0; i < n - 1; ++i)
        parse<Td>()(arg , i);                 // copy n-1 times
    parse<Td>()(std::forward<T>(arg), n - 1); // possibly move the n-th time
}

Above I've fixed it as Casey suggests, by instantiating parse<T> only on non-reference types using std::decay. I've also added a static_assert to ensure that the client does not accidentally make this mistake. The static_assert isn't strictly necessary because you will get a compile-time error regardless. However, the static_assert can offer a more readable error message.
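
To check the "copy n-1 times, possibly move the n-th time" behaviour, one can count the operations. The sketch below assumes a hypothetical `Tracked` payload type and gives the two `operator()` overloads trivial bodies that copy or move into a local; neither is part of the original code.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical payload type that counts how often it is copied or moved.
struct Tracked
{
    static int copies;
    static int moves;
    Tracked() {}
    Tracked(Tracked const&) { ++copies; }
    Tracked(Tracked&&)      { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves  = 0;

template<class T>
class parse
{
    static_assert(!std::is_lvalue_reference<T>::value,
                  "parse: T can not be an lvalue-reference type");
public:
    // Assumed bodies: work on a local copy, or on a local moved-from arg.
    void operator()(T const& arg, int) const { T local(arg); (void)local; }
    void operator()(T&& arg, int) const      { T local(std::move(arg)); (void)local; }
};

template<class T>
void split(T&& arg, int n)
{
    typedef typename std::decay<T>::type Td;
    for (int i = 0; i < n - 1; ++i)
        parse<Td>()(arg, i);                  // arg is an lvalue here: copy
    parse<Td>()(std::forward<T>(arg), n - 1); // rvalue only if the caller passed one
}

// Hypothetical self-check: three passes over an rvalue, then over an lvalue.
void check_split()
{
    Tracked::copies = Tracked::moves = 0;
    split(Tracked(), 3);
    assert(Tracked::copies == 2 && Tracked::moves == 1);

    Tracked::copies = Tracked::moves = 0;
    Tracked t;
    split(t, 3);
    assert(Tracked::copies == 3 && Tracked::moves == 0);
}
```

With an rvalue argument the last pass moves; with an lvalue argument every pass copies, and the caller's object is left untouched.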

That is not the only way to fix the problem though. Another way, which would allow the client to instantiate parse with an lvalue reference type, is to partially specialize parse:

template<class T>
class parse<T&>
{
public:
    // parse will modify a local copy or move of its input parameter
    void operator()(T const& arg, int n) const { /* optimized for lvalues */ }
};

Now the client doesn't need to do the decay dance:

template<class T>
void split(T&& arg, int n)
{
    for (auto i = 0; i < n - 1; ++i)
        parse<T>()(arg , i);                 // copy n-1 times
    parse<T>()(std::forward<T>(arg), n - 1); // possibly move the n-th time
}

And you can apply special logic under parse<T&> if necessary.
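
As a rough illustration of how the specialization gets selected, the sketch below makes each `operator()` return a tag saying which code path ran. The `path` enum, the return values, and the `check_paths` helper are assumptions for demonstration only.

```cpp
#include <cassert>
#include <utility>

// Hypothetical tag describing which operator() was executed.
enum class path { lvalue_overload, rvalue_overload, reference_specialization };

// Primary template: used when T is not an lvalue reference.
template<class T>
class parse
{
public:
    path operator()(T const&, int) const { return path::lvalue_overload; }
    path operator()(T&&, int) const      { return path::rvalue_overload; }
};

// Partial specialization: used when split() deduces T as an lvalue reference.
template<class T>
class parse<T&>
{
public:
    path operator()(T const&, int) const { return path::reference_specialization; }
};

// Same shape as split() above, but returns the path taken by the final call.
template<class T>
path split(T&& arg, int n)
{
    for (int i = 0; i < n - 1; ++i)
        parse<T>()(arg, i);
    return parse<T>()(std::forward<T>(arg), n - 1);
}

// Hypothetical self-check covering an lvalue and an rvalue argument.
void check_paths()
{
    int x = 0;
    assert(split(x, 2) == path::reference_specialization); // T deduced as int&
    assert(split(1, 2) == path::rvalue_overload);          // T deduced as int
}
```

An lvalue argument lands in `parse<T&>`, so any lvalue-only logic can live there without the caller touching `std::decay`.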

Howard Hinnant
  • +1 and accepted, of course. The 2002 paper is really clear, thanks for linking to that. Just to make sure: this solution will not work for passing the same parameter twice to the same forwarding function because of the undefined order of evaluation? I.e. `foo(arg, std::forward<T>(arg));` does not have a deterministic last evaluated argument, right? So there one would always incur an extra copy. – TemplateRex Nov 25 '13 at 18:25
  • @TemplateRex: If you are passing to something that is accepting by reference (lvalue or rvalue, as in your example), **and** if you know what order the passed-to function will move its arguments, then you can move/forward the last one that the passed-to function will move. The forward/move doesn't actually do anything but cast to rvalue. Things only get moved (if passed by reference) when the passed-to function moves the argument. If you are passing to something that is accepting by value, then you **do** have a problem with the unspecified order of evaluation of argument binding. – Howard Hinnant Nov 25 '13 at 19:05
  • OK, got it, very illuminating. – TemplateRex Nov 25 '13 at 19:09
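
The point that `std::move`/`std::forward` is only a cast can be sketched with two hypothetical functions, `inspect` and `consume`, both taking an rvalue reference. Only the one that actually move-constructs from its parameter pilfers the argument.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical callee: binds to an rvalue but only reads from it, so nothing is moved.
std::size_t inspect(std::vector<int>&& v) { return v.size(); }

// Hypothetical callee: actually moves from its parameter.
std::size_t consume(std::vector<int>&& v)
{
    std::vector<int> local(std::move(v)); // the move happens here, inside the callee
    return local.size();
}

// Hypothetical self-check of the behaviour described in the comment above.
void check_cast_only()
{
    std::vector<int> data(3, 42);
    assert(inspect(std::move(data)) == 3);
    assert(data.size() == 3);               // untouched: std::move was only a cast
    assert(consume(std::move(data)) == 3);
    // From here on, data is in a valid but unspecified state.
}
```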

(I know, it is an old thread)

As stated in the comments, the data is a large array or vector of uint64_t. Avoiding the final copy through parameter passing saves only a single copy; a better optimization would probably be to restructure the many copy operations to

  • read once
  • write many times (for each intended pass)

in one step instead of many independent copies.
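
A minimal sketch of the read-once, write-many idea, assuming the data is a `std::vector<std::uint64_t>` and the number of passes is known at compile time. `replicate` and `check_replicate` are hypothetical names; the `resize` calls zero-fill the destinations first, and whether the fused loop beats separate `memcpy` calls has to be measured.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: produces Copies copies of src in a single pass over src.
template<std::size_t Copies>
std::array<std::vector<std::uint64_t>, Copies>
replicate(std::vector<std::uint64_t> const& src)
{
    std::array<std::vector<std::uint64_t>, Copies> out;
    for (auto& dst : out)
        dst.resize(src.size());               // allocate (and zero-fill) each destination

    for (std::size_t i = 0; i < src.size(); ++i) {
        std::uint64_t const value = src[i];   // read the element once
        for (auto& dst : out)
            dst[i] = value;                   // write it to every copy
    }
    return out;
}

// Hypothetical self-check: every copy must equal the source.
void check_replicate()
{
    std::vector<std::uint64_t> const src{1, 2, 3, 4};
    auto copies = replicate<3>(src);
    for (auto const& c : copies)
        assert(c == src);
}
```

Each pass can then take ownership of its own copy, for example by moving `copies[i]` into the function that modifies it.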

A starting point could be the question "faster alternative to memcpy?", which has answers that include memcpy-like code. You would have to duplicate the line of code that writes to the destination so that it writes several copies of the data instead.

You can also combine memset, which is optimized for writing the same value to memory over and over again, and memcpy, which is optimized for reading and writing blocks of memory once for each block. You could look into optimized source code here: https://github.com/KNNSpeed/AVX-Memmove

The best code will be specific to the architecture and processor used. So you would have to test and compare your achieved speed.

Sebastian