I wrote a program in Java that prints one million numbers (0 to 999,999) in a for loop.

for (int i = 0; i < 1000000; i++) {
    System.out.println(i);
}

It took around 7.5 seconds.

I then wrote a custom class that implements the Runnable interface. It takes two limits as parameters and prints every value between them (both inclusive).

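public class ThreadCustom implements Runnable {

    private final int start;   // first value to print (inclusive)
    private final int end;     // last value to print (inclusive)
    private final String name; // label for the worker; not used in run()

    ThreadCustom(int start, int end, String name) {
        this.start = start;
        this.end = end;
        this.name = name;
    }

    @Override
    public void run() {
        // Note: end is inclusive, so adjacent chunks must not share a boundary value.
        for (int i = start; i <= end; i++) {
            System.out.println(i);
        }
    }
}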

I created 10 objects of my custom thread class and assigned each one a chunk of 100k numbers to print, so in the end all one million numbers get printed (not in order, of course). This version takes around 9.5 seconds.
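The question does not show the code that starts the ten threads, so the following is only a guess at what such a driver might look like. The class name ChunkedPrintTiming and the chunks helper are made up for illustration, and the loop from ThreadCustom.run() is inlined as a lambda so the sketch compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

class ChunkedPrintTiming {

    // Splits [0, total) into `threads` equal, non-overlapping ranges with inclusive ends.
    // Assumption: total is evenly divisible by threads.
    static int[][] chunks(int total, int threads) {
        int size = total / threads;
        int[][] ranges = new int[threads][2];
        for (int t = 0; t < threads; t++) {
            ranges[t][0] = t * size;
            ranges[t][1] = (t + 1) * size - 1; // inclusive, matching i <= end
        }
        return ranges;
    }

    public static void main(String[] args) throws InterruptedException {
        List<Thread> workers = new ArrayList<>();
        long startNanos = System.nanoTime();
        for (int[] range : chunks(1_000_000, 10)) {
            final int from = range[0];
            final int to = range[1];
            // Same loop body as ThreadCustom.run(), inlined for self-containment.
            Thread worker = new Thread(() -> {
                for (int i = from; i <= to; i++) {
                    System.out.println(i);
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            worker.join(); // wait for every chunk before stopping the clock
        }
        long elapsedMs = (System.nanoTime() - startNanos) / 1_000_000;
        // Reported on stderr so it does not get mixed into the printed numbers.
        System.err.println("elapsed ms: " + elapsedMs);
    }
}
```

Joining all threads before reading the clock matters here: without the join calls, the measurement would only cover the time needed to start the threads.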

What's the reason for this 2-second delay? Is it because of the time slicing and context switching that takes place between threads? I am executing a single Java process that spawns 10 threads. Am I thinking in the right direction?

Update: I commented out System.out.println to see how both versions perform when the loop only iterates.

Printed time without threads

2019-04-14 22:18:07.111   // start
2019-04-14 22:18:07.116 // end

Using ThreadCustom class:

2019-04-14 22:26:42.339
2019-04-14 22:26:42.341
Gray
Danyal Sandeelo
  • My first guess would be locking of several threads at a time trying to perform the output. However, it's not really a proper benchmark. – daniu Apr 14 '19 at 17:06
  • `System.out.println` is thread-safe, so all of your threads are constantly in contention. – Jacob G. Apr 14 '19 at 17:08
  • @JacobG. so it causes delay because it allows one thread to use it at a time? – Danyal Sandeelo Apr 14 '19 at 17:09
  • @DanyalSandeelo I'd say yes, as only one thread can be printing at a given time. I'd assume that the synchronization overhead is what's causing the delay. – Jacob G. Apr 14 '19 at 17:12
  • @JacobG. I see, I am aware of the fact that threads are beneficial when the operations are IO intensive not CPU intensive. In the given scenario, the process is one so at a time one even if I do not print only one thread would have the time slice to perform iteration ..right? – Danyal Sandeelo Apr 14 '19 at 17:14
  • You are thinking in terms of a computer with one CPU. Those are almost non-existent today. Of course you may not have 10 available cores in your computer, but up to the number of free cores, your statement is wrong. And in IO intensive tasks, you must make sure that different IO resources are accessed. – RealSkeptic Apr 14 '19 at 17:17
  • @RealSkeptic this java process would execute on 1 core, no? be it any one among free. – Danyal Sandeelo Apr 14 '19 at 17:20
  • Each thread will run on its own core, unless there are not enough cores available, in which case there will be time slicing and context switching between the existing cores. – RealSkeptic Apr 14 '19 at 17:54
  • @RealSkeptic yes, I get the idea now. Thanks! – Danyal Sandeelo Apr 15 '19 at 08:03
  • And see https://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java ... I am tempted to close this one of as DUP to that. – GhostCat Apr 15 '19 at 08:08
  • @GhostCat writing benchmarks is fine but let's keep this open since it more of conceptual understanding of the way threads get processed by OS on multicore processors. – Danyal Sandeelo Apr 15 '19 at 08:20
  • You are welcome always glad to help, in whatever way helps ;-) – GhostCat Apr 15 '19 at 08:39
  • map reduce is multi tasking on different nodes. That's definitely more powerful but needs resources. Multicore processors would do some good part of the job. – Danyal Sandeelo Apr 17 '19 at 06:24

2 Answers

The extra time is spent in two ways: 1) the overhead of setting up each thread's execution context, and 2) the likely scenario that you are spawning more threads than your processor has logical cores.

Since the amount of processing required to increment a loop counter and print an integer is minimal, that overhead outweighs any gain, so in the majority of cases the parallel version performs worse.

If, however, each iteration did substantial work, such as counting the distinct pixel colors in an image, you would see a significant performance advantage from using multiple threads.

I wrote a program in Java to print [1 million] in a for loop... I created 10 objects of my custom thread class, ... but it takes around 9.5 seconds. What's the reason for this 2 seconds delay?

Threads are only faster if they can work independently. When printing numbers to System.out, all of the threads contend for the same resource: System.out is a synchronized PrintStream, so only one thread can print at a time. This means that most of the time is wasted waiting for another thread to release the lock on System.out. Any additional "delay" in the threaded program is most likely due to that lock contention and the context switching between the threads.
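One way to see the effect of the shared lock is to let each thread format its whole chunk privately and hand it to System.out in a single call. The sketch below is an illustration only (BufferedChunkPrinter and format are hypothetical names, and the 10 x 100,000 split is taken from the question). It is not guaranteed to be faster overall, because all output still goes through one stream, but each thread should now compete for the stream roughly once per chunk instead of once per number.

```java
import java.util.ArrayList;
import java.util.List;

class BufferedChunkPrinter {

    // Formats every number in [start, end] (inclusive) into one string.
    // No shared state is touched here, so threads do not block each other.
    static String format(int start, int end) {
        StringBuilder sb = new StringBuilder();
        for (int i = start; i <= end; i++) {
            sb.append(i).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        int threads = 10;
        int chunk = 100_000;
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int start = t * chunk;
            final int end = start + chunk - 1;
            Thread worker = new Thread(() -> {
                String block = format(start, end); // independent work
                System.out.print(block);           // single call on the shared stream
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
    }
}
```

The trade-off is memory: each thread holds its entire chunk as a string before printing, which is fine for 100,000 short lines but would need a bounded buffer for much larger ranges.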

To test thread speed properly, you need to run an independent, CPU-bound task in each thread. Calculating Math.sqrt(...) many times is a better example. On my newer MacBook, a single thread does 1 billion (with a b) Math.sqrt(...) calls in ~8.1 seconds, while 10 threads each doing 100 million finish in ~1.1 seconds in parallel. But wait, you might say, 10 * 1.1 > 8 seconds of total CPU. I have only 4 cores, so with 10 threads running, threads are constantly being swapped on and off the CPUs. 4 threads doing 250 million each take 2.1 seconds, and 4 * 2.1 = 8.4 seconds of total CPU, which is a lot closer to the 8.1 seconds of the single-threaded run.
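The answer does not include its benchmark code, so the following is only a rough sketch of that kind of experiment. The class and method names are invented, the iteration count in main is scaled down to 100 million, and a fixed thread pool from java.util.concurrent is used in place of whatever the author actually ran. Each task sums its square roots and returns the sum, so the work has a result that is actually used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SqrtBenchmarkSketch {

    // Sums Math.sqrt(i) for i in [from, to). Pure CPU work, no shared state.
    static double sumSqrt(long from, long to) {
        double sum = 0;
        for (long i = from; i < to; i++) {
            sum += Math.sqrt(i);
        }
        return sum;
    }

    // Splits [0, total) across `threads` tasks and adds up the partial sums.
    static double parallelSumSqrt(long total, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            long chunk = total / threads;
            List<Future<Double>> parts = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final long from = t * chunk;
                // The last task also takes the remainder when total is not divisible.
                final long to = (t == threads - 1) ? total : from + chunk;
                Callable<Double> task = () -> sumSqrt(from, to);
                parts.add(pool.submit(task));
            }
            double sum = 0;
            for (Future<Double> part : parts) {
                sum += part.get(); // blocks until that task has finished
            }
            return sum;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        long total = 100_000_000L; // scaled down from the 1 billion in the text
        int cores = Runtime.getRuntime().availableProcessors();

        long t0 = System.nanoTime();
        double single = sumSqrt(0, total);
        long singleMs = (System.nanoTime() - t0) / 1_000_000;

        long t1 = System.nanoTime();
        double parallel = parallelSumSqrt(total, cores);
        long parallelMs = (System.nanoTime() - t1) / 1_000_000;

        System.out.println("1 thread:  " + singleMs + " ms, sum=" + single);
        System.out.println(cores + " threads: " + parallelMs + " ms, sum=" + parallel);
    }
}
```

Sizing the pool to availableProcessors() mirrors the 4-threads-on-4-cores case from the text; passing a larger thread count should reproduce the oversubscribed case. The two sums can differ in the last few digits because floating-point addition happens in a different order.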

Lastly, Java performance testing is really hard. I bet that if you ran your two programs a number of times, you would see different results. A program that finishes quickly is not a good judge of speed, or at best gives a very rough approximation. Also, you need to be careful, or else the HotSpot JIT compiler might optimize your loops away at runtime, so make sure the loops do actual work whose result is used.
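As a hand-rolled illustration of those two precautions (repeat the measurement, and make sure the result is used), here is a small sketch. The names RepeatedTimingSketch, measure, sink, and countTo are all hypothetical, and the warm-up and run counts are arbitrary. It is no substitute for a real benchmark harness such as JMH; it only makes the variation between runs visible.

```java
import java.util.Arrays;
import java.util.function.LongSupplier;

class RepeatedTimingSketch {

    // Every result is stored here, which makes it harder for the JIT
    // to treat the measured work as dead code.
    static volatile long sink;

    // Runs the task `warmups` times untimed, then `runs` times timed.
    // Returns the timed durations in nanoseconds, sorted ascending.
    static long[] measure(LongSupplier task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            sink = task.getAsLong();
        }
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long t0 = System.nanoTime();
            sink = task.getAsLong();
            nanos[i] = System.nanoTime() - t0;
        }
        Arrays.sort(nanos);
        return nanos;
    }

    // Stand-in workload: a loop whose result depends on every iteration.
    static long countTo(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] nanos = measure(() -> countTo(1_000_000), 5, 11);
        System.out.println("fastest: " + nanos[0] + " ns");
        System.out.println("median:  " + nanos[nanos.length / 2] + " ns");
        System.out.println("slowest: " + nanos[nanos.length - 1] + " ns");
    }
}
```

Reporting the fastest, median, and slowest run instead of a single number makes it obvious how much one run can differ from the next.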

Gray
  • Thanks for the explanation. I removed System.out.println and the stats have changed. Thread one is finishing quickly in all the runs that I checked. So basically, the threads perform execution separately on multi processors. – Danyal Sandeelo Apr 15 '19 at 08:03