C++ array operator overhead

Question

I remember reading a while back some code that allowed the compiler to do some work and simplify an expression like this one:

// edit: yes the parameters where meant to be passed by reference
//       and maintain constness sorry !
template< typename T >
std::vector<T> operator+( const std::vector<T>& a, const std::vector<T>& b )
{
    assert( a.size() == b.size() );
    std::vector<T> o; o.reserve( a.size() );
    for( std::vector<T>::size_type i = 0; i < a.size(); ++i )
        o[i] = a[i] + b[i];
    return o;
}

// same for operator* but a[i] * b[i] instead

std::vector<double> a, b, c, d, e;

// do some initialization of the vectors

e = a * b + c * d

Where normally a new vector would be created and allocated for each operator instead the compiler would instead only create one copy and do all the operations onto it.

What is this technique?

And the question is ...? Or are you just looking for [expression templates](http://en.wikipedia.org/wiki/Expression_templates)? — Angew is no longer proud of SO, Feb 07 '13 at 16:11
[This question](http://stackoverflow.com/questions/10725299/understanding-the-benefits-of-move-semantics-vs-template-metaprogramming) might have been the place where you first read about this. — us2012, Feb 07 '13 at 16:16
Given that your functions take the arguments by value, the cost of `operator[]` is going to be much lower than the overhead of the function call itself. — David Rodríguez - dribeas, Feb 07 '13 at 16:26

score 3 · Accepted Answer · edited May 23 '17 at 10:24

3

As @Agnew mentioned very early, the technique you're describing is expression templates.

This is typically done with the mathematical concept of a vector¹, not a std::vector. The broad strokes are:

Don't have math operations on your vectors return the result. Instead, have them return a proxy object that represents the operation that eventually needs to be done. a * b could return a "multiplication proxy" object that just holds const references to the two vectors that should be multiplied.
Write math operations for these proxies too, allowing them to be chained together, so a * b + c * d becomes (TempMulProxy) + (TempMulProxy) becomes (TempAddProxy), all without doing any of the math or copying any vectors.
Write an assignment operator that takes your proxy object for the right-side object. This operator can see the entire expression a * b + c * d and do that operation efficiently on your vector, while knowing the destination. All without creating multiple temporary vector objects.

¹ or matrix or quaternion, etc...*

edited May 23 '17 at 10:24

Community

1
1

answered Feb 07 '13 at 16:52

Drew Dormann

59,987
13
123
180

The "mathematical concept of a vector*" is irritating (pointers in math?). Maybe you shouldn't use an asterisk here, but a superscript number. – leemes Feb 07 '13 at 17:38
Thank you and to Angew this is what I was looking for, also thanks for actually adding a question to the original post for me :P at least someone understood what I meant. – johndoe Feb 07 '13 at 18:05
@leemes Interesting. I considered both but as mathematical concepts go, superscripted numbers have an additional meaning while an asterisk is just an asterisk. Sorry you got irritated. – Drew Dormann Feb 08 '13 at 16:20
What the last two lines of the class VecExpression mean on the link you provided (expression templates): operator E&() { return static_cast(*this); } ? – ChrisPeterson Dec 02 '14 at 16:07
Ah! sorry, didn't realize it is the conversion operator! – ChrisPeterson Dec 02 '14 at 16:14

leemes · Answer 2 · 2013-02-07T16:46:23.693

I don't see a question here. However, my crystal ball tells me that you want to know the better method of two methods you came up with in order to perform component-wise arithmetic operations on vectors like a * b + c * d where a, b, c, d are vectors (std::vector<T>) having the same size:

For each operation to be done, loop over the elements, perform the calculation and return a resulting vector. Put these operations together in a formula on vectors.
For each element in the input vectors, calculate the whole expression and write it into one single final resulting vector.

There are two things to consider:

Performance: Here, the second option is ahead, since the processor will not allocate unnecessary temporary vectors.
Re-usability: Clearly, it's nice to implement algorithmic operations for vectors and re-use them by simply expressing your target formula on vectors.

However, there is a nice option to implement the second option which looks very pretty:

std::vector<int> a, b, c, d, e;
// fill a, b, c, d with data

auto expression = [](int a, int b, int c, int d){ return a * b + c * d; };

assert (a.size() == b.size() && b.size() == c.size() && c.size() == d.size());
e.reserve(a.size());

for(auto _a = a.begin(), _b = b.begin(), _c = c.begin(), _d = d.begin(), _e = e.begin();
    _a != a.end();
    ++_a, ++_b, ++_c, ++_d, ++_e)
{
    *_e = expression(*_a, *_b, *_c, *_d);
}

This way, you can separate the expression from the logic to evaluate it:

void componentWise4(std::function<int(int,int,int,int)> f,
                    const std::vector<int> & a,
                    const std::vector<int> & b,
                    const std::vector<int> & c,
                    const std::vector<int> & d,
                    std::vector<int> & result)
{
    assert (a.size() == b.size() && b.size() == c.size() && c.size() == d.size());
    result.reserve(a.size());

    for(auto _a = a.begin(), _b = b.begin(), _c = c.begin(), _d = d.begin(), _result = result.begin();
        _a != a.end();
        ++_a, ++_b, ++_c, ++_d, ++_result)
    {
        *_result = expression(*_a, *_b, *_c, *_d);
    }
}

Which is then called like that:

std::vector<int> a, b, c, d, e;
// fill a, b, c, d with data

componentWise4([](int a, int b, int c, int d){ return a * b + c * d; },
               a, b, c, d, e);

I'm sure this "expression evaluator" can be extended using C++11 new feature "variadic templates" to support arbitrary numbers of arguments within the expression as well as even different types. I couldn't manage to get it working (the variadic template thing), you can try to finish my attempt here: http://ideone.com/w88kuG (I'm new to variadic templates, so I don't know the syntax).

This is a solution to the problem but not the one I was looking for as it requires you to create a new function for each operation you want to do on multiple vectors. It also decreases readability in a sense as it isn't as simplistic or minimal as simply "a * b + c * d". — johndoe, Feb 07 '13 at 17:34

score 1 · Answer 3 · answered Feb 07 '13 at 17:56

What do you want is in „The C++ Programming Language”. Third Edition by Bjarne Stroustrup in 22.4.7 Temporaries, Copying, and Loops [num.matrix]. It is always a good idea to read the book.

If you dont have it, basically we have two option:

First: we write a set of function for direct calculation of some of the most expected combination ( For example mul_add_and_assign(&U,&M,&V,&W)to calcule U =M*V+W) and led the user to select self what function is he most convenient.

Second: we can introduce some auxiliary classes (for example VxV, VplusV, etc.) that only keep a reference to the arguments of each operation and define an operator conversion to vector. Now we create overloads of the operators + and * that take two vectors by reference and just return an object of the corresponding type. We can create classes of the type VxVplusVxV to calculate more complex operations. Now we can overload operator= to assing VxVplusVxV to a vector. And in this last overload we made all calculation, using the references to the arguments keeped in the auxiliary classes objects, with no or minimal temporary vectors created.

Again yes this is a solution but it would require me to write code for every possible addition / multiplication of any infinite number of arrays I could ever define. — johndoe, Feb 07 '13 at 18:04

C++ array operator overhead

3 Answers3