
I'm trying to understand how to take advantage of multiple threads. I wrote a simple program that increments the value of i, say, 400,000 times in two ways: single-threaded (0 to 400,000) and multi-threaded (in my case, 4 threads each counting from 0 to 100,000), with the number of threads equal to Runtime.getRuntime().availableProcessors().

I'm surprised by the results I measured: the single-threaded version is decidedly faster, sometimes 3 times faster. Here is my code:

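import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class Main {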
    public static int LOOPS = 100000;
    private static ExecutorService executor=null;

    public static void main(String[] args) throws InterruptedException, ExecutionException {

        int procNb = Runtime.getRuntime().availableProcessors();
        long startTime;
        long endTime;

        executor = Executors.newFixedThreadPool(procNb);
        ArrayList<Calculation> c = new ArrayList<Calculation>();

        for (int i=0;i<procNb;i++){
            c.add(new Calculation());
        }

        // Make parallel computations (4 in my case)
        startTime = System.currentTimeMillis();
        queryAll(c);
        endTime = System.currentTimeMillis();

        System.out.println("Computation time using " + procNb + " threads : " + (endTime - startTime) + "ms");

        startTime = System.currentTimeMillis();
        for (int i =0;i<procNb*LOOPS;i++)
        {

        }
        endTime = System.currentTimeMillis();
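        System.out.println("Computation time using main thread : " + (endTime - startTime) + "ms");

        // Shut the pool down so its non-daemon worker threads do not keep the JVM alive
        executor.shutdown();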
    }

    public static List<Integer> queryAll(List<Calculation> queries) throws InterruptedException, ExecutionException {
        List<Future<Integer>> futures = executor.invokeAll(queries);
        List<Integer> aggregatedResults = new ArrayList<Integer>();
        for (Future<Integer> future : futures) {
            aggregatedResults.add(future.get());
        }
        return aggregatedResults;
    }

}

class Calculation implements Callable<Integer> {

    @Override
    public Integer call() {
        int i;
        for (i=0;i<Main.LOOPS;i++){
        }
        return i;
    }
}

Console:

Computation time using 4 threads : 10ms
Computation time using main thread : 3ms

Could anyone explain this?

ThomasM
  • Don't you think you are doing too much in the multi-threaded version? Creating futures, adding futures to a list? Also, it's not a given that multi-threading will always be better than a single thread. – SMA Jan 14 '15 at 11:10
  • I guess creating multiple threads takes longer than incrementing value. – peterremec Jan 14 '15 at 11:14
  • Of course multi-threading has an overhead. You need a problem big enough to get multi-threading advantages. It also depends on the platform, hardware (multi-core) and implementation used (Java 8 Streams can make heavy use of multiple cores). – PeterMmm Jan 14 '15 at 11:14
  • Also, in order to get a speed-up from using multiple threads, the calculation of one thread must not depend on or be blocked by the results of another thread's calculations. – Simon Jan 14 '15 at 11:19

1 Answer


An addition probably takes one CPU cycle, so if your CPU runs at 3 GHz, that's about 0.3 nanoseconds. Do it 400k times and that becomes roughly 120k nanoseconds, or about 0.1 milliseconds. So your measurement is affected more by the overhead of starting threads, thread switching, JIT compilation etc. than by the operation you are trying to measure.
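The back-of-the-envelope estimate above can be written out as a small calculation. This is only a sketch: the class and method names are made up for illustration, and the one-operation-per-cycle assumption is a simplification, since real CPUs pipeline and reorder instructions.

```java
class CostEstimate {

    // Hypothetical helper: estimated time in milliseconds for `operations`
    // simple operations, ASSUMING exactly one operation per clock cycle.
    static double estimateMillis(long operations, double clockGHz) {
        double nanosPerOp = 1.0 / clockGHz;           // one cycle at f GHz lasts 1/f ns
        return operations * nanosPerOp / 1_000_000.0; // ns -> ms
    }

    public static void main(String[] args) {
        // 400k additions at 3 GHz, the figures used in the estimate above
        System.out.printf("%.3f ms%n", estimateMillis(400_000, 3.0));
    }
}
```

With these inputs the estimate is on the order of 0.1 ms, far below the 3 ms and 10 ms reported in the question, which suggests the timings mostly reflect overhead rather than the loop itself.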

You also need to account for compiler optimisations: if you place your empty loop in a method and run that method many times, you will notice that it runs in 0 ms after some time, because the compiler determines that the loop does nothing and optimises it away completely.

I suggest you use a specialised library for micro-benchmarking, such as JMH.
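A minimal sketch of that experiment, assuming a loop of the same shape as the one in the question. The class and method names are invented for illustration, and the timings depend on the JVM and machine, so no particular output is claimed.

```java
class WarmupDemo {

    // Same shape as the loop in the question: the body does nothing.
    static int emptyLoop(int loops) {
        int i;
        for (i = 0; i < loops; i++) {
        }
        return i;
    }

    // Times `rounds` consecutive calls and returns each duration in nanoseconds.
    static long[] timeRounds(int rounds, int loops) {
        long[] nanos = new long[rounds];
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            emptyLoop(loops);
            nanos[r] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) {
        long[] nanos = timeRounds(20, 400_000);
        for (int r = 0; r < nanos.length; r++) {
            System.out.println("round " + r + ": " + nanos[r] + " ns");
        }
    }
}
```

On a JIT-compiling JVM the later rounds typically take much less time than the first ones, which is why a single timed run of a cold loop says little about the loop itself.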

See also: How do I write a correct micro-benchmark in Java?
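Short of a full JMH setup, the points above (warm up first, use the result, give each thread enough work) can be sketched by hand. Everything here is an assumption made for illustration: the class name, the bit-counting workload and the problem size are not from the question, and a hand-rolled timing like this is still much less reliable than a proper benchmark harness.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelSumSketch {

    // Hypothetical workload: the returned sum depends on every iteration,
    // so the JIT cannot simply discard the loop.
    static long work(long from, long to) {
        long sum = 0;
        for (long i = from; i < to; i++) {
            sum += Long.bitCount(i * 0x9E3779B97F4A7C15L);
        }
        return sum;
    }

    // Splits [0, n) into `threads` contiguous chunks and adds up the partial sums.
    static long parallel(long n, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            long chunk = n / threads;
            for (int t = 0; t < threads; t++) {
                final long from = t * chunk;
                final long to = (t == threads - 1) ? n : from + chunk;
                tasks.add(() -> work(from, to));
            }
            long total = 0;
            for (Future<Long> f : pool.invokeAll(tasks)) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown(); // let the worker threads exit
        }
    }

    public static void main(String[] args) {
        int threads = Runtime.getRuntime().availableProcessors();
        long n = 200_000_000L; // assumed size, large enough to dwarf thread overhead

        // Warm-up so both code paths are JIT-compiled before timing
        work(0, 1_000_000);
        parallel(1_000_000, threads);

        long t0 = System.nanoTime();
        long sequential = work(0, n);
        long t1 = System.nanoTime();
        long inParallel = parallel(n, threads);
        long t2 = System.nanoTime();

        System.out.println("results equal: " + (sequential == inParallel));
        System.out.println("main thread: " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println(threads + " threads: " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

Printing the results also keeps them observable, so neither computation can be treated as dead code. Whether the threaded version wins still depends on the number of cores and on how much work each task gets.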

assylias
  • Thanks for your answers. Even when I multiplied the number of loops by ten, the result was the same. I think it was also really important to emphasize that my "for" loop was empty, so the compiler optimizes it away... And my benchmark was biased! – ThomasM Jan 14 '15 at 14:52