Has anybody written a C++ STL-compliant algorithm that combines std::transform and std::accumulate into a single-pass algorithm supporting the unary, binary and perhaps even (n-ary!) variants, say std::transformed_accumulate? I want this because I have found this pattern highly reusable, for example in linear algebra, e.g. in (l1-)norm calculations. The l1-norm calculates the sum of the absolute values of the elements.

- If I am not wrong, you are asking for mapreduce. – Manoj R Nov 19 '12 at 11:51
5 Answers
Uhm... My bet is that you can do that by embedding your transformation into the binary functor: transform the element, then accumulate after the transformation.
struct times2accumulator {
    int operator()( int oldvalue, int newvalue ) const {
        return oldvalue + 2*newvalue;
    }
};
int r = std::accumulate( v.begin(), v.end(), 0, times2accumulator() );
That functor would be equivalent to:
struct times2 {
    int operator()( int x ) {
        return 2*x;
    }
};
std::vector<int> tmp; tmp.reserve( v.size() );
std::transform( v.begin(), v.end(), std::back_inserter(tmp), times2() );
int r = std::accumulate( tmp.begin(), tmp.end(), 0 );
Of course this could be made generic; just pass the transformation functor to a generic base functor:
template <typename Transform>
struct transform_accumulator_t {
    Transform t;
    transform_accumulator_t( Transform t ) : t(t) {}
    int operator()( int oldvalue, int newvalue ) const {
        return oldvalue + t(newvalue);
    }
};
// syntactic sugar:
template <typename T>
transform_accumulator_t<T> transform_accumulator( T t ) {
    return transform_accumulator_t<T>(t);
}
int r = std::accumulate(v.begin(), v.end(), 0, transform_accumulator(times2()));
And you could also generalize on the type in the container... or even create a more generic transform_accumulator that takes both an accumulator and a transformation functor and applies them in order. Actual implementation left as an exercise for the reader.
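For illustration, one possible sketch of that fully generic version (the name transform_accumulator2 and the commented usage below are made up here, not part of the answer above):

template <typename Accumulate, typename Transform>
struct transform_accumulator2_t {
    Accumulate acc;
    Transform t;
    transform_accumulator2_t( Accumulate acc, Transform t ) : acc(acc), t(t) {}
    template <typename T, typename U>
    T operator()( T oldvalue, U newvalue ) const {
        return acc( oldvalue, t(newvalue) );   // transform first, then accumulate
    }
};
// helper that deduces the template arguments:
template <typename Accumulate, typename Transform>
transform_accumulator2_t<Accumulate, Transform>
transform_accumulator2( Accumulate acc, Transform t ) {
    return transform_accumulator2_t<Accumulate, Transform>( acc, t );
}
// hypothetical usage, e.g. the l1-norm of a std::vector<double> v:
//   double l1 = std::accumulate( v.begin(), v.end(), 0.0,
//       transform_accumulator2( std::plus<double>(),
//                               [](double x) { return std::abs(x); } ) );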

- Clever... Can we perhaps simplify this using C++11 lambda expressions? – Nordlöw May 14 '12 at 23:28
Although it may not exactly fit the original intent, std::inner_product is basically your binary version. You pass it an initial value, two ranges, and two functors, and it applies them as:
T acc = initial_value;
while (begin1 != end1) {
    acc = binary_op1(acc, binary_op2(*begin1, *begin2));
    ++begin1;
    ++begin2;
}
return acc;
So, for your L1 you'd do something on this general order:
norm = std::inner_product(input1.begin(), input1.end(),
                          input2.begin(), input2.end(),
                          std::plus<int>(), std::abs);
Only that doesn't quite work -- right now, it's trying to pass std::abs where you really need a binary function that combines the two inputs, but I'm not sure how the two inputs are really supposed to be combined.
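For reference, a compilable sketch of the binary case using inner_product's actual parameter order (first1, last1, first2, init, op1, op2), here computing an l1 distance between two ranges; the combining lambda is just one illustrative choice, not from the answer:

#include <cmath>
#include <functional>
#include <numeric>
#include <vector>

int main() {
    std::vector<double> a{1, -2, 3};
    std::vector<double> b{4, 5, -6};
    // op2 combines corresponding elements, op1 folds the results into the accumulator
    double l1_dist = std::inner_product(
        a.begin(), a.end(), b.begin(), 0.0,
        std::plus<double>(),
        [](double x, double y) { return std::abs(x - y); });
    // l1_dist == |1-4| + |-2-5| + |3-(-6)| == 19
    return 0;
}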
std::partial_sum is fairly close to your unary version, except that along with accumulating a result, it (attempts to) write out each intermediate result, not just the final result. To just get the final result, you'd have to write (and pass an instance of) a kind of do-nothing iterator that just holds a single value:
template<class T, class Dist=size_t, class Ptr = T*, class Ref = T&>
class unique_it : public std::iterator<std::random_access_iterator_tag, T, Dist, Ptr, Ref> {
    T &value;
public:
    unique_it(T &v) : value(v) {}
    T &operator*() { return value; }
    unique_it &operator++() { return *this; }
    unique_it &operator+(size_t) { return *this; }
    unique_it &operator++(int) { return *this; }
};
template <class T>
unique_it<T> make_res(T &v) { return unique_it<T>(v); }
With this, your L1 normalization would look something like this:
#include <cmath>
#include <iostream>
#include <numeric>

// uses unique_it / make_res defined above
int main(){
    double result = 0.0;
    double inputs[] = {1, -2, 3, -4, 5, -6};
    std::partial_sum(
        inputs, inputs+6,
        make_res(result),
        [](double acc, double v) {return acc + std::abs(v);});
    std::cout << result << "\t";
    return 0;
}

- Thanks even more for relevant references to other reusable STL algorithms. – Nordlöw May 15 '12 at 11:10
- If lambdas are available should I prefer using, say, `[](T x, T y) { return x+y; }` over `std::plus`? – Nordlöw May 17 '12 at 21:56
- @Nordlöw: I mostly go with whatever's simplest. In this case, that seems to be `plus`. The exception would be if you're using a *lot* of lambdas anyway, in which case mixing in a few pre-built functors could lead to confusion. – Jerry Coffin May 17 '12 at 21:59
As of C++17 there is also std::transform_reduce, which has the added benefit of being parallelizable.
https://en.cppreference.com/w/cpp/algorithm/transform_reduce
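For instance, a minimal sketch of the l1-norm from the question using std::transform_reduce (the vector v and the use of std::abs are just illustrative):

#include <cmath>
#include <functional>
#include <numeric>
#include <vector>

int main() {
    std::vector<double> v{1, -2, 3, -4, 5, -6};
    // transform each element with the unary op, then reduce with the binary op
    double l1 = std::transform_reduce(
        v.begin(), v.end(),
        0.0,                                     // initial value
        std::plus<>(),                           // reduction (binary) operation
        [](double x) { return std::abs(x); });   // transformation (unary) operation
    // l1 == 21
    return 0;
}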

If you want to use some parallelism, I made a quick version using OpenMP:
template <class T,
          class InputIterator,
          class MapFunction,
          class ReductionFunction>
T MapReduce_n(InputIterator in,
              unsigned int size,
              T baseval,
              MapFunction mapper,
              ReductionFunction reducer)
{
    T val = baseval;

    #pragma omp parallel
    {
        T map_val = baseval;

        #pragma omp for nowait
        for (auto i = 0U; i < size; ++i)
        {
            map_val = reducer(map_val, mapper(*(in + i)));
        }

        #pragma omp critical
        val = reducer(val, map_val);
    }

    return val;
}
It is fast, but there is certainly room for optimisation, especially around for (auto i = 0U; i < size; ++i), I think. (But I couldn't figure out how to make an iterator-only version with OpenMP; any help would be appreciated!)
On a quick test with a 1,000,000-element array, with the computation iterated 1000 times to get a mean value, I made some comparisons.
Version 1:
for (auto i = 0U; i < size; ++i)
    val += std::pow(in[i][0], 2) + std::pow(in[i][1], 2);
Score when compiled with:
- g++: 30 seconds
- g++ -O3: 2.6 seconds
Version 2:
This version is the most optimized for this computation, I think. (It gives the best result.)
#pragma omp parallel reduction( + : val )
{
    double map_val = 0.0;

    #pragma omp for
    for (int i=0; i < size; ++i)
    {
        map_val += std::pow(in[i][0], 2) + std::pow(in[i][1], 2);
    }

    val += map_val;
}
- g++ -O3: 0.2 seconds (it's the best one)
Version 3:
This version uses the MapReduce_n function template I showed earlier:
double val = MapReduce_n(in, size, 0.0,
    [] (fftw_complex val)
    {
        return std::pow(val[0], 2.0) + std::pow(val[1], 2.0);
    },
    std::plus<double>());
- g++ -O3: 0.4 seconds, so there is a slight overhead for not using the OpenMP reduction directly. However, the OpenMP reduction doesn't allow custom operators, so at some point you (sadly) have to trade speed for genericity.

- Nice! One thing though... is `std::pow(x, 2.0)` really faster than `x*x`?! – Nordlöw Aug 19 '13 at 22:55
- According to this, one should be inlined to the other: http://stackoverflow.com/questions/6321170/is-there-any-advantage-to-using-powx-2-instead-of-xx-with-x-double. However I prefer to write `pow` because it looks more like the formula, and because there are a lot of other `pow` calls in the software I took this source code from, so the "looks" are consistent. – Jean-Michaël Celerier Aug 20 '13 at 03:30
I am surprised no one said how to do this with Boost.Range:
accumulate(v | transformed((int(*)(int))&std::abs), 0);
where v is a Single Pass Range (i.e., any STL container). The abs overload has to be specified, otherwise this would be as elegant as Haskell.
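For completeness, a minimal compilable sketch of this approach (assuming Boost is available; the vector v and its values are illustrative):

#include <boost/range/adaptor/transformed.hpp>
#include <boost/range/numeric.hpp>
#include <cstdlib>
#include <vector>

int main() {
    using boost::adaptors::transformed;
    std::vector<int> v{1, -2, 3, -4};
    // the cast selects the int overload of std::abs, as noted above
    int l1 = boost::accumulate(v | transformed((int(*)(int))&std::abs), 0);
    // l1 == 10
    return 0;
}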