0

I am running a very big for loop with 10 million iterations. When I do this in one go it takes 14 secs while when I break it into 20 iterations of 500k small iterations it takes only 6 secs. I am not able to understand why is there such behavior. Is there any problem with my code? Thanks!

Code

public class Benchmark {

static int max = 10000000;
static int start = 0;
static int end = 0;
static boolean dnc = false;

public static void main(String[] args) {
    TimeIt timer = new TimeIt();
    timer.printTime("bmTimer dnc false", () -> bmTimer());
    dnc = true;
    sum = 0;
    timer.printTime("bmTimer dnc true", () -> bmTimer());
}

private static void bmTimer() {
    if (dnc) {
        int factor = 500000;
        for (int i = 0; i < max; i += factor) {
            end = start + factor;
            bm(start, end);
            start = end + 1;
        }
    } else {
        bm(0, max);
    }
}

static int sum = 0;

private static void bm(int start, int end) {
    try {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        List<Future<String>> futures = new ArrayList<>();
        for (int j = start; j < end; j++) {
            futures.add(executor.submit(new Callable<String>() {
                @Override
                public String call() throws Exception {
                    int i = 10;
                    int j = 9;
                    return (i - j) + "";
                }
            }));
        }
        for (Future<String> future : futures) {
            sum += Integer.parseInt(future.get());
        }
        System.out.println(sum);
        executor.shutdown();
        executor.awaitTermination(1, TimeUnit.DAYS);
    } catch (Exception e) {
        e.printStackTrace();
    }
}

Output

10000000
Method bmTimer dnc false took : 14.39s
500000
1000000
1500000
2000000
2500000
3000000
3500000
4000000
4500000
5000000
5500000
6000000
6500000
7000000
7500000
8000000
8500000
9000000
9500000
10000000
Method bmTimer dnc true took : 5.856s
Amit Kumar
  • 2,685
  • 2
  • 37
  • 72
  • 1
    it because your pc has multiple procesor cores and every core can handle it's own operation so this means things are faster if you split it over multiple processor cores. – Joel Harkes Apr 12 '18 at 10:22
  • 2
    First of all, obligatory link to [How do I write a correct micro-benchmark in Java](https://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java). Most of the "why performance" questions are simply based on invalid benchmarks. – lexicore Apr 12 '18 at 10:23
  • I would say the difference in time and the fact that changing the order of operation does not change anything in the timing does give this 'benchmark' a bit more validity. It's probably still skewed but I am as well interested in the reasoning for the difference. – Ben Apr 12 '18 at 10:26
  • 1
    I just ran with Executors.newFixedThreadPool(1) and still there is a big difference, for 10000000 iterations timings are 12 sec and 5 sec respectively. – Amit Kumar Apr 12 '18 at 10:30
  • Could you try constructing array list with capacity, i.e. `List> futures = new ArrayList<>(end-start+1);` – Sergey Kalinichenko Apr 12 '18 at 10:33
  • @dasblinkenlight tried it and added it as comment to your answer - no actual big change visible. – Ben Apr 12 '18 at 10:36
  • So, tried around a bit. Let the Kernel warm up, did 100 test runs, `-verbose:gc` on and so on and.... @lexicore was right, seems to be an issue with the benchmark. In the end numbers were really comparable with one being a tad faster than the other here and there. They averaged out to 1.6s and 1.46s on my machine which does seem minor enough to just be some bad luck on GC. – Ben Apr 12 '18 at 11:14
  • @Ben, Thanks for investigation. What is the hardware you tried with? Did you try increasing max such that time taken by any one say dnc=true is more than 10sec? – Amit Kumar Apr 12 '18 at 11:19

0 Answers0