1

I noticed that the below code

    boolean hasFoundSurplusChangedSign = false;
    int h = 1;
    for(int k=0; k<size; k++){
        if (k==0){
             mVCBArray[k]=mVBArray[k]; 
        }else{
             mVCBArray[k]=mVCBArray[k-1]+mVBArray[k];
        }
        mMVTArray[k]= Math.min(mVCBArray[k],mVCAArray[k]);
        mSArray[k]= mVCBArray[k]-mVCAArray[k];
        if (!hasFoundSurplusChangedSign && k>0){
            if (Integer.signum(mSArray[k]) * Integer.signum(mSArray[k-1]) > 0){
                h = k+1;
            }else{
                hasFoundSurplusChangedSign = true;
            }
        }
    }

runs faster than this one :

    boolean hasFoundSurplusChangedSign = false;
    int h = 1;
    for(int k=0; k<size; k++){
        if (k==0){
             mVCBArray[k]=mVBArray[k]; 
        }else{
             mVCBArray[k]=mVCBArray[k-1]+mVBArray[k];
        }
        mMVTArray[k]= Math.min(mVCBArray[k],mVCAArray[k]);
        mSArray[k]= mVCBArray[k]-mVCAArray[k];
    }
    for(int k=0; k<size; k++){
        if (!hasFoundSurplusChangedSign && k>0){
            if (Integer.signum(mSArray[k]) * Integer.signum(mSArray[k-1]) > 0){
                h = k+1;
            }else{
                hasFoundSurplusChangedSign = true;
            }
        }
    }

all the Arrays are int arrays. The size of each array is constant and equal to 1000. the for loops iterate roughly 100 times (ie size = 100 roughly).

So, the first code runs in average in 6 microsecond while the second code runs in 3.5 microsecond. It seems that splitting the loop into two smaller loops improve the performance of my code here. Why?

Is it the compiler that compile differently the two versions of my code? I read somewhere that it could be because the processor cannot put the whole loop code in its cache and so needs to swap between different cache zone, while in the second case, it could, so it goes faster. I am not sure to understand this argument. Does that sounds possible to you? Any other ideas?

Thanks for your much needed help on this one.

Rominus
  • 103
  • 4
  • 1
    I suggest you ignore the first 10,000 iterations and run your test for at least 2 seconds before coming to any conclusions on performance from a benchmark. – Peter Lawrey Nov 15 '13 at 01:45
  • Duplicate - [Why is one loop so much slower than two loops](http://stackoverflow.com/q/8547778/823393) – OldCurmudgeon Nov 15 '13 at 01:49
  • Those average were calculated on 100,00 iteration ignoring more than the first 1,000,000 iteration. The code was runned for more than 5 minutes before measuring the performance. I may be wrong, but I don t think it is the reason of what I am seeing. – Rominus Nov 15 '13 at 01:52
  • 1
    This is classic cache thrash - alternating access between two arrays confuses all attempts at intelligence that the cache system is capable of. Walking one array from beginning to end and then the other from beginning to end is easily recognised and cached appropriately. – OldCurmudgeon Nov 15 '13 at 01:56
  • CPU timers are not accurate enough to for your results to be meaningful. Do as Peter says and run the tests for longer, and I'd run the test several times, with the order of the two samples reversed in some tests. – david25272 Nov 15 '13 at 02:07

0 Answers0