I have always thought that preincrement is more efficient than postincrement in C++. But while reading the book Game Engine Architecture (2nd ed.) recently, I found a section that says postincrement is preferred over preincrement in a for loop because, and I quote, "preincrement introduces a data dependency into your code -- the CPU must wait for the increment operation to be completed before its value can be used in the expression." Is this true? (It really overturned my understanding of this issue.)
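For context, my belief comes from how the two operators are usually written for class types such as iterators. Here is a minimal sketch, assuming a hypothetical Counter type and a checkCounter helper (both are my own illustration, not from the book):

```cpp
#include <cassert>

// Hypothetical iterator-like type, used only for illustration.
struct Counter {
    int value = 0;

    // Preincrement: modify in place and return *this by reference (no copy).
    Counter& operator++() {
        ++value;
        return *this;
    }

    // Postincrement: save a copy of the old state, increment, return the copy.
    Counter operator++(int) {
        Counter old = *this;
        ++(*this);
        return old;
    }
};

// Small usage check: both forms increment the object;
// only the value handed back to the caller differs.
inline void checkCounter() {
    Counter c;
    Counter before = c++;   // 'before' holds the old state
    assert(before.value == 0 && c.value == 1);
    Counter& same = ++c;    // 'same' refers to c itself
    assert(&same == &c && c.value == 2);
}
```

The postfix version has to keep the old state around, which is where the extra copy comes from. For built-in types such as int or raw pointers, I would expect a compiler to emit the same code for both forms when the result is not used.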
Here is the quote from the section in case you are interested:
5.3.2.1 Preincrement versus Postincrement
Notice in the above example that we are using C++’s postincrement operator, p++, rather than the preincrement operator, ++p. This is a subtle but sometimes important optimization. The preincrement operator increments the contents of the variable before its (now modified) value is used in the expression. The postincrement operator increments the contents of the variable after it has been used. This means that writing ++p introduces a data dependency into your code -- the CPU must wait for the increment operation to be completed before its value can be used in the expression. On a deeply pipelined CPU, this introduces a stall. On the other hand, with p++ there is no data dependency. The value of the variable can be used immediately, and the increment operation can happen later or in parallel with its use. Either way, no stall is introduced into the pipeline.
Of course, within the “update” expression of a for loop (for(init_expr; test_expr; update_expr) { ... }), there should be no difference between pre- and postincrement. This is because any good compiler will recognize that the value of the variable isn’t used in update_expr. But in cases where the value is used, postincrement is superior because it doesn’t introduce a stall in the CPU’s pipeline. Therefore, it’s good to get in the habit of always using postincrement, unless you absolutely need the semantics of preincrement.
Edit: Added "the above example" that the quote refers to:
#include <list>

void processArray(int container[], int numElements)
{
    int* pBegin = &container[0];
    int* pEnd = &container[numElements];
    for (int* p = pBegin; p != pEnd; p++)
    {
        int element = *p;
        // process element...
    }
}
void processList(std::list<int>& container)
{
    std::list<int>::iterator pBegin = container.begin();
    std::list<int>::iterator pEnd = container.end();
    std::list<int>::iterator p;
    for (p = pBegin; p != pEnd; p++)
    {
        int element = *p;
        // process element...
    }
}
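To make the "cases where the value is used" part concrete, here is a small sketch of my own (not from the book). The names sumPost, sumPre and checkSums are hypothetical; the two loops compute the same sum, one through *p++ and one through *++p:

```cpp
#include <cassert>
#include <cstddef>

// Postincrement: *p++ dereferences the OLD value of p, then p advances.
// This is the form the book describes as having no data dependency.
int sumPost(const int* data, std::size_t n) {
    const int* p = data;
    const int* end = data + n;
    int sum = 0;
    while (p != end) {
        sum += *p++;
    }
    return sum;
}

// Preincrement: *++p dereferences the NEW value of p, so the load
// address depends on the result of the increment.
int sumPre(const int* data, std::size_t n) {
    if (n == 0) return 0;
    const int* p = data;
    const int* last = data + (n - 1);
    int sum = *p;              // first element is read before any increment
    while (p != last) {
        sum += *++p;
    }
    return sum;
}

// Small usage check: both loops visit the same elements.
inline void checkSums() {
    const int values[] = {1, 2, 3, 4};
    assert(sumPost(values, 4) == sumPre(values, 4));
}
```

Since both functions return the same result, comparing the assembly a compiler generates for them (for example with optimizations enabled) would be one way to see whether the stall the book describes actually appears in practice.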