It totally depends on you, your code, and your compiler. Imagine you have:
#include <vector>
int frob (int a, int b) {
    return a + b;
}
int main () {
    std::vector<int> results(20), lhs(20), rhs(20);
    for (int i=0; i<20; ++i) {
        results[i] = frob(lhs[i], rhs[i]);
    }
}
Now, if your compiler optimizes for size, it might leave this as is. But if it
optimizes for performance, it may (or may not; compilers rely on rough heuristics
to decide) inline the call to frob and transform this to:
int main () {
    std::vector<int> results(20), lhs(20), rhs(20);
    for (int i=0; i<20; ++i) {
        results[i] = lhs[i] + rhs[i];
    }
}
If it optimizes even more, it might unroll the loop:
int main () {
    std::vector<int> results(20), lhs(20), rhs(20);
    for (int i=0; i<20; i+=4) {
        results[i] = lhs[i] + rhs[i];
        results[i+1] = lhs[i+1] + rhs[i+1];
        results[i+2] = lhs[i+2] + rhs[i+2];
        results[i+3] = lhs[i+3] + rhs[i+3];
    }
}
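The unrolled loop above needs no extra care only because 20 is a multiple of 4. As a rough sketch of the general case, the hand-written helper below processes four elements per iteration and finishes the leftovers with a scalar loop. The name `add_unrolled` and its signature are made up for illustration; a real compiler does this internally and exposes no such function.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, for illustration only: a loop unrolled by four,
// plus a cleanup loop for sizes that are not a multiple of four.
// Assumes lhs and rhs each hold at least results.size() elements.
void add_unrolled(std::vector<int>& results,
                  const std::vector<int>& lhs,
                  const std::vector<int>& rhs) {
    const std::size_t n = results.size();
    std::size_t i = 0;
    // Main body: four additions per iteration.
    for (; i + 4 <= n; i += 4) {
        results[i]     = lhs[i]     + rhs[i];
        results[i + 1] = lhs[i + 1] + rhs[i + 1];
        results[i + 2] = lhs[i + 2] + rhs[i + 2];
        results[i + 3] = lhs[i + 3] + rhs[i + 3];
    }
    // Cleanup: at most three elements remain.
    for (; i < n; ++i) {
        results[i] = lhs[i] + rhs[i];
    }
}
```

With 20 elements the cleanup loop never runs, which is why the version above can omit it.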
Unrolling increased the code size. But if the compiler now also decides to do a bit of
auto-vectorization, it might transform the loop again into something not unlike this:
int main () {
    std::vector<int> results(20), lhs(20), rhs(20);
    for (int i=0; i<20; i+=4) {
        // vec4_add: placeholder for a single SIMD add of four ints
        vec4_add (&results[i], &lhs[i], &rhs[i]);
    }
}
Size decreased.
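`vec4_add` is only a placeholder for whatever vector instruction the compiler picks. To make it concrete, here is one possible hand-written equivalent. It assumes an x86 target with SSE2 and uses the intrinsics from `<emmintrin.h>`, falling back to a plain loop elsewhere. The choice of SSE2 is my assumption; what the compiler actually emits depends on the target and flags.

```cpp
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics (x86 only)
#endif

// Illustrative stand-in for vec4_add: out[k] = a[k] + b[k] for k = 0..3.
// The three pointers must each refer to at least four ints.
void vec4_add(int* out, const int* a, const int* b) {
#if defined(__SSE2__)
    // Unaligned loads, so the pointers need no special alignment.
    __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));
    // One instruction adds four 32-bit integers lane by lane.
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), _mm_add_epi32(va, vb));
#else
    // Portable fallback with the same result.
    for (int k = 0; k < 4; ++k) {
        out[k] = a[k] + b[k];
    }
#endif
}
```

Four scalar additions collapse into one vector addition, which is where the size reduction comes from.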
Next, the compiler, smart as always, unrolls again and eliminates the loop entirely:
int main () {
    std::vector<int> results(20), lhs(20), rhs(20);
    vec4_add (&results[0], &lhs[0], &rhs[0]);
    vec4_add (&results[4], &lhs[4], &rhs[4]);
    vec4_add (&results[8], &lhs[8], &rhs[8]);
    vec4_add (&results[12], &lhs[12], &rhs[12]);
    vec4_add (&results[16], &lhs[16], &rhs[16]);
}
Another optimization g++ will perform, if it can prove enough about the code, is to
replace a vector with an ordinary array:
int main () {
    int results[20] = {0}, lhs[20] = {0}, rhs[20] = {0};
    vec4_add (&results[0], &lhs[0], &rhs[0]);
    vec4_add (&results[4], &lhs[4], &rhs[4]);
    vec4_add (&results[8], &lhs[8], &rhs[8]);
    vec4_add (&results[12], &lhs[12], &rhs[12]);
    vec4_add (&results[16], &lhs[16], &rhs[16]);
}
It sees that everything is constant, and folds:
int main () {
    int results[20] = {0}; // because every lhs[i] + rhs[i] == 0
}
It concludes that results is actually unused, and finally spits out:
int main() {
}
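That last step is only allowed because nothing ever reads `results`, so removing the work changes no observable behavior. As a contrasting sketch, the variant below takes its inputs from the caller and returns a value derived from every element, so the additions cannot simply vanish. `sum_of_frobs` is a made-up name for illustration and is not part of the example above.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

int frob(int a, int b) {
    return a + b;
}

// Hypothetical variant: the inputs are not known at compile time and the
// result is returned, so the computation is observable.
// Assumes rhs holds at least lhs.size() elements.
int sum_of_frobs(const std::vector<int>& lhs, const std::vector<int>& rhs) {
    std::vector<int> results(lhs.size());
    for (std::size_t i = 0; i < lhs.size(); ++i) {
        results[i] = frob(lhs[i], rhs[i]);
    }
    // Reading every element keeps the loop's work alive.
    return std::accumulate(results.begin(), results.end(), 0);
}
```

The compiler is still free to inline `frob`, unroll, vectorize, or even skip the temporary vector here; it just cannot reduce the function to nothing.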