I found that sometimes it's faster to divide one loop into two or more
for (i=0; i<AMT; i++) {
a[i] += c[i];
b[i] += d[i];
}
||
\/
for (i=0; i<AMT; i++) {
//a[i] += c[i];
b[i] += d[i];
}
for (i=0; i<AMT; i++) {
a[i] += c[i];
//b[i] += d[i];
}
On my desktop, win7, AMD Phenom(tm) x6 1055T, the two-loop version runs faster with around 1/3 time less time.
But if I am dealing with assignment,
for (i=0; i<AMT; i++) {
b[i] = rand()%100;
c[i] = rand()%100;
}
dividing the assignment of b and c into two loops is no faster than in one loop.
I think that there are some rules the OS use to determine if certain codes can be run by multiple processors.
I want to ask if my guess is right, and if I'm right, what are such rules or occasions that multiple processors will be automatically (without thread programming) used to speed up my programmes?