2

I want to replace a for loop with std::transform. Since I have little experience with algorithms and lambda functions I wonder if this is the correct way

Original Code

for (size_t i=0; i < dataPhase.size(); ++i)
{
    dataPhase[i] = fmod(dataPhase[i], pi*1.00001);
}

std::transform with a lambda

std::transform(dataPhase.begin(), dataPhase.end(), dataPhase.begin(), 
               [](double v){return fmod(v, pi*1.00001); }
);

do I need a capture here?

What could I do to replace a for loop in such cases, where the index is used, as in this code:

const int halfsize = int(length/2);
for (size_t i=0; i < length; ++i)
{
    axis[i] = int(i) - halfsize;
}

EDIT: I would like to expand the question (if allowed).

Is it possible to replace the for loop in this case with something different

for(std::vector<complex<double> >::size_type i = 0; i != data.size(); i++) {
    dataAmplitude[i] = abs(data[i]);
    dataPhase[i]     = arg(data[i]);
}

Here not the original vector is modified, but its value used for two different vectors.

Matthias Pospiech
  • 3,130
  • 18
  • 55
  • 76
  • if the results you get, it is probably correct. What do you observe? Algorithms work on iterators not indices. There is no point in converting your second snippet to use a algorithm because the loop with index is more readable in that case – 463035818_is_not_an_ai Jul 12 '17 at 08:09
  • `std::transform` can also be used for types that do not have a clear index (like a linked list and so). As long as the first two parameters satisfy the "input iterator" category and the third parameter satisfies the "output iterator" requirement. Hence there is no (conceptual) index in the function. – JHBonarius Jul 12 '17 at 08:56
  • @Matthias Pospiech before trying to remove all your for loops, maybe you can have a look at my remark #1.... Most of the time this will prevent you to use OpenMP. – Picaud Vincent Jul 12 '17 at 21:06

3 Answers3

4

Part 1)

You do not need a capture here because you are only using parameters (v) and globals (pi) in the lambda code.

A capture is only needed if the lambda has to access variables from the current scope (i.e. declared in your function). You can capture by reference (&) or by value (=).

Here is an example where a 'capture by reference' is needed because of 'result' being modified from within the lambda (but it also captures the 'searchValue'):

size_t count(const std::vector<char>& values, const char searchValue)
{
 size_t result = 0;
 std::for_each(values.begin(), values.end(), [&](const char& v) {
  if (v == searchValue)
   ++result;
 });
 return result;
}

(In real world please use std::count_if() or even std::count())

The compiler creates an unnamed functor (see this question) for each capturing lamda. The constructor of the function takes the parameters and stores it as member variables. So a 'capture by value' always uses the value the element had at the time the lambda was defined.

Here is an example of a code the compiler could generate for the lambda we created earlier:

class UnnamedLambda
{
public:
 UnnamedLambda(size_t& result_, const char& searchValue_)
  : result(result_), searchValue (searchValue_)
 {}

 void operator()(const char& v)
 {
  // here is the code from the lambda expression
  if (v == searchValue)
   ++result;
 }

private:
 size_t& result;
 const char& searchValue;
};

and our function could be rewritten to:

size_t count(const std::vector<char>& values, const char searchValue)
{
 size_t result = 0;
 UnnamedLambda unnamedLambda(result, searchValue);
 for(auto it = values.begin(); it != values.end(); ++it)
  unnamedLambda(*it);
 return result;
}

Part 2)

If you need the index just continue using a for loop.

std::transform allows processing single elements and therefore does not provide an index. There are some other algorithms like std::accumulate which work on an intermediate result but I do not know anything that provides an index.

Andreas H.
  • 1,757
  • 12
  • 24
0

Here are some examples of lambda captures:

[] Capture nothing

[&] Capture any referenced variable by reference

[=] Capture any referenced variable by making a copy

[=, &foo] Capture any referenced variable by making a copy, but capture variable foo by reference

[bar] Capture bar by making a copy; don't copy anything else

[this] Capture the this pointer of the enclosing class

Hence if pi in your example is a local variable (not a macro or a global variable) you can not let [] but use [pi] to capture by copy (which is sane for a double):

std::transform(dataPhase.begin(), dataPhase.end(), dataPhase.begin(), 
           [pi](double v){return fmod(v, pi*1.00001); }
);

For your second example there is no built-in std::transform providing the index. I think the better solution is to keep your for loop.

If you really want to use lambda (as an exercise) you can use:

const int halfsize = int(length/2);

auto lambda=[&axis,halfsize](const int i){axis[i] = i - halfsize;};

for (size_t i=0; i < length; ++i)
{
    lambda(i);
}

or

const int halfsize = int(length/2);

auto lambda=[halfsize](const int i){return i - halfsize;};  

for (size_t i=0; i < length; ++i)
{
   axis[i] = lambda(i);
}

It only depends on how you want to design your code.

Remark 1: it seems that you want to avoid "basic" for loops, however they are not necessary evil especially if you want to use OpenMP to gain some performances (simd or multi-threading). For instance

 #pragma omp simd
 for(auto& v_i :v) {   // or worst std::transform
    v_i = fmod(v_i, pi*1.00001);
 }

is not supported and will not compile.

However

 #pragma omp simd
 for (size_t i=0; i < dataPhase.size(); ++i)
 {
     dataPhase[i] = fmod(dataPhase[i], pi*1.00001);
 }    

can be compiled with g++ -fopenmp ... with potential perf gain if simd can be used. Concerning multi-threading, one can argue that there is a coming support for parallel execution of the STL algorithms, but this is only for C++17.

Remark 2: not in C++ but in D language you have a foreach instruction that let you optionally include the index:

foreach (e; [4, 5, 6]) { writeln(e); }
// 4 5 6

but

foreach (i, e; [4, 5, 6]) { writeln(i, ":", e); }
// 0:4 1:5 2:6
Picaud Vincent
  • 10,518
  • 5
  • 31
  • 70
  • I do not understand why I should capture pi. It is a constant. And it us used but not returned. And since it is a consant can not even be modified. If I wanted to modify pi it would make sence to capture it. – Matthias Pospiech Jul 12 '17 at 09:04
  • If you do not use [pi] you will get a compiler error (for gcc, something like "error: ‘pi’ is not captured). Even if not modified this [pi] defines the "context" for your lambda. For instance if you want to copy your lambda to another place you also have to copy **pi**. This context is necessary to perform the **closure** operation. – Picaud Vincent Jul 12 '17 at 09:30
  • If pi is a global constant (declared outside a function ar class) you do not have to capture it. A (capturing) lambda is like a method call. Only if this method has to access any of the *local* variables you have to capture them. A not-capturing lambda works like a global function and can even be used in places where global functions are expected (in current C++). – Andreas H. Jul 12 '17 at 10:29
  • @AndreasH. you are right, I assumed a *local* variable. Thank you for the clarification. – Picaud Vincent Jul 12 '17 at 10:35
0

All of your examples can be transformed into a use of std::transform, with some additional objects doing legwork (I use boost here because it is prior art for most of the classes needed)

// 1
for (size_t i=0; i < dataPhase.size(); ++i)
{
    dataPhase[i] = fmod(dataPhase[i], pi*1.00001);
}

// 2
const int halfsize = int(length/2);
for (size_t i=0; i < length; ++i)
{
    axis[i] = int(i) - halfsize;
}

// 3
for(std::vector<complex<double> >::size_type i = 0; i != data.size(); i++) {
    dataAmplitude[i] = abs(data[i]);
    dataPhase[i]     = arg(data[i]);
}

As you correctly note, 1 becomes

std::transform(dataPhase.begin(), dataPhase.end(), dataPhase.begin(), 
    [](double v){return fmod(v, pi*1.00001); } );

2 needs a sequence of numbers, so I use boost::integer_range

const int halfsize = int(length/2);
 // provides indexes 0, 1, ... , length - 1
boost::integer_range<int> interval = boost::irange(0, length);
std::transform(interval.begin(), interval.end(), axis.begin(), 
    [halfsize](int i) {return i - halfsize;}); 

3 involves a pair (2-tuple) of outputs, so I use boost::zip_iterator as a destination

std::transform(data.begin(), data.end(), 
    // turns a pair of iterators into an iterator of pairs
    boost::make_zip_iterator(dataAmplitude.begin(), dataPhase.begin()), 
    [](complex<double> d) { return boost::make_tuple(abs(d), arg(d)); });
Caleth
  • 52,200
  • 2
  • 44
  • 75
  • Ok. I am interested in replacing for loops for two reasons: readability and speed for very high array (and future multithreading support of standard algorihms). 1) looks like a reasonable replacement, however 2 and especially 3) lack readability and introduce unnecessary complexity. Therefore very interesting, but nothing I consider to implement. – Matthias Pospiech Jul 12 '17 at 14:23
  • For 2, you could instead use `boost::irange(-halfsize, halfsize)` and `std::copy`, but I expressed these all using `std::transform` – Caleth Jul 12 '17 at 14:23
  • @MatthiasPospiech I'm not sure how you are deciding "unnecessary complexity", except as "introduces 3rd party library". Readability is subjective, you can get used to the "functional" style – Caleth Jul 12 '17 at 14:36