In the constructor I have piece of code which does the following:
code snippet 1
for (unsigned i=0; i<n; ++i) {
auto xIdIt = _originIdSet.find(xIds[i]);
auto yIdIt = _originIdSet.find(yIds[i]);
if (xIdIt == _originIdSet.end()) { _originIdSet.insert(xIds[i]); }
if (yIdIt == _originIdSet.end()) { _originIdSet.insert(yIds[i]); }
}
_originIdSet
is of the type std::unordered_set<uint32_t>
xIds
and yIds
is of the type std::vector<uint32_t>
xIds and yIds contain lots of duplicate entries, e.g.
xIds = {1,2,1,2,3,5,....}
yIds = {2,3,4,1,4,1,....}
xIds[i]
never equals yIds[i]
I'm compiling with gcc 5.30 as follows: g++ -g -Wall -m64 -O3 -std=c++11
I've been profiling how much time this piece of code (i.e. code snippet 1) takes when n
equals 10k^2 and I've found that if I change the code to:
code snippet 2
for (unsigned i=0; i<n; ++i) {
// We cannot insert duplicates in a set so why check anyway
_originIdSet.insert(xIds[i]);
_originIdSet.insert(yIds[i]);
}
It will be around 5 seconds slower (total run code-snippet 1 takes around 15 seconds).
I'm wondering what is the underlying reason for the performance decrease. My first guess would be that this is caused due branch optimization (excellent explanation here), however I believe it doesn't make sense that in this situation branch optimization would only be applied if an if/else conditions is used. Hopefully somebody can clarify what is happening here.