So the 'new (old) big thing' is "No Raw Loops" in C++. I'm trying to write code that way, but it seems very inefficient. Yes, there are STL algoritms which can do about anything, but they don't seem very efficient.
I for instance have a situation where I want a pointer to a node in an array of nodes that has the highest score. Determining that score is a costly floating-point operation. So I implemented the STL algorithm version and compared it with the raw loop:
#include <cfloat>
#include <iostream>
#include <array>
#include <algorithm>
#include <numeric>
static int counter;
class Node {
public:
auto Score() const -> double {
std::cout << "complex calculation\n";
counter++;
return 1;
}
};
int main()
{
std::array<Node, 10> nodes;
counter = 0;
Node const* nodePtr = std::max_element(std::cbegin(nodes), std::cend(nodes),
[](Node const& node1, Node const& node2) {
return node1.Score() < node2.Score();
});
std::cout << "algorithm count " << counter << std::endl;
counter = 0;
double maxScore = -FLT_MAX;
for (const auto& node : nodes) {
auto score = node.Score();
if (score > maxScore) {
maxScore = score;
nodePtr = &node;
}
}
std::cout << "raw loop count " << counter << std::endl;
}
Evaluating this, for the STL version, the costly Score function is evaluated 18 times, while the raw loop only uses 10 evaluations...
Am I doing it wrong, or are raw loops just not that bad?
edit:
After the suggestion by user58697
that cout and the static counter would prevent compiler optimization, I changed the code:
#include <cfloat>
#include <cmath>
#include <iostream>
#include <array>
#include <algorithm>
#include <numeric>
#include <random>
#include <chrono>
template <typename T>
class Random {
private:
std::default_random_engine generator;
std::uniform_real_distribution<T> distribution;
public:
Random()
: generator()
, distribution(0.0, 1.0)
{}
auto operator()() {
return distribution(generator);
};
};
static Random<double> myRandom;
class Timer {
private:
std::chrono::high_resolution_clock::time_point startTime{};
public:
void Start() noexcept {
startTime = std::chrono::high_resolution_clock::now();
}
[[nodiscard]] auto ElapsedMs() const noexcept {
return std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::high_resolution_clock::now() - startTime).count();
}
};
static Timer timer;
class Node {
private:
double val;
public:
Node() noexcept : val(myRandom()) {}
[[nodiscard]] auto Score() const noexcept {
auto score = std::sqrt(std::log(10.0 / val));
score = std::sin(score) / std::cos(score);
score = std::sqrt(std::sqrt(std::sqrt(std::sqrt(std::sqrt(score)))));
score = std::pow(score, 1000);
return score;
}
};
int main()
{
std::array<Node, 100000> nodes; // yeah, yeah... overloading the stack, I know
for (auto i = 0; i < 2; i++) {
timer.Start();
Node const* nodePtr = &*std::max_element(std::cbegin(nodes), std::cend(nodes),
[](Node const& node1, Node const& node2) {
return node1.Score() < node2.Score();
});
std::cout << "algorithm elapsed time " << timer.ElapsedMs() << std::endl;
timer.Start();
double maxScore = -FLT_MAX;
for (const auto& node : nodes) {
auto score = node.Score();
if (score > maxScore) {
maxScore = score;
nodePtr = &node;
}
}
std::cout << "raw loop count " << timer.ElapsedMs() << std::endl;
}
}
I run the loop twice to eliminate startup behavior... results of second loop (compiled with g++ 9.1 -O3):
algorithm elapsed time 16
raw loop count 8 (<== I see I forgot to change "count" to "time" :P)
So that's not it.
edit: Seeing this question got an upvote, people are still looking at it. Since this question was asked, C++20 was released. C++20's ranges library has a special feature which could have helped here, called Projection.
I.e. in this case you could use std::ranges::max_element
or even std::ranges::max
(which was missing from the old std algorithms) like
Node const* node = &*std::ranges::max_element(nodes, {}, &Node::Score);
...
Node const& node = std::ranges::max(nodes, {}, &Node::Score);
However, projection is not the solution here, due to the implementation choice, which doesn't use caching. The Proj
projection function is called again and again for every argument of the Comp
comparator function.
(The internal function call looks something like
return std::invoke(__comp, std::invoke(__proj, __a), std::invoke(__proj, __b)) ? __b : __a;
)