I have started learning multi-core programming and developing parallel algorithms. This can be done with ease by the use of multithreading in Java. So, I created two text files with 10 lines of content as follows:

This is the first line in file 1
This is the second line in file 1
This is the third line in file 1 
This is the fourth line in file 1
This is the fifth line in file 1
This is the sixth line in file 1
This is the seventh line in file 1
This is the eighth line in file 1
This is the ninth line in file 1
This is the tenth line in file 1

The other text file is identical, except that "file 1" is replaced by "file 2". I wrote two programs to read and print the contents of both files, one without threads and one with threads. The single-threaded program is as follows:

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;

public class SimpleThread {

    static void printFile(BufferedReader br) throws Exception
    {
        for(String line; (line = br.readLine())!=null; )
            System.out.println(line);
    }

    public static void main(String args[]) throws Exception
    {
        double startTime = System.nanoTime();
        BufferedReader br1 = new BufferedReader(new FileReader(new File("test1.txt")));
        BufferedReader br2 = new BufferedReader(new FileReader(new File("test2.txt")));
        SimpleThread.printFile(br1);
        SimpleThread.printFile(br2);
        System.out.println(System.nanoTime() - startTime + "ns");
    }
}

The program using multithreading is as follows:

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;

public class Threading extends Thread{

    BufferedReader br;

    public Threading(String fileName)
    {
        try{
            br = new BufferedReader(new FileReader(new File(fileName)));
            start();
        }
        catch(Exception e)
        {
            System.out.println(e.getMessage());
        }
    }

    private void printFile(BufferedReader br) throws Exception
    {
        for(String line; (line = br.readLine())!=null; )
            System.out.println(line);
    }

    public void run()
    {
        try{
            printFile(br);
        }
        catch(Exception e)
        {
            System.out.println(e.getMessage());
        }
    }

    public static void main(String args[]) throws Exception
    {
        double startTime = System.nanoTime();
        Threading t1 = new Threading("test1.txt");
        Threading t2 = new Threading("test2.txt");
        System.out.println(System.nanoTime() - startTime + "ns");
    }
}

Now, when I compare the execution time of both the programs, I see that the program which is single threaded takes 1544589.0ns and the multithreaded program takes 410522.0ns.

I was curious to know the factor by which the speed was increased. I found it to be 0.23 approximately.

After revising the code that uses multiple threads, I found that a single threaded program executes faster and this has increased my confusion to a greater extent.

Here is the revised code:

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;

public class Threading extends Thread{

    BufferedReader br;

    public Threading(String fileName)
    {
        try{
            br = new BufferedReader(new FileReader(new File(fileName)));
            start();
        }
        catch(Exception e)
        {
            System.out.println(e.getMessage());
        }
    }

    private void printFile(BufferedReader br) throws Exception
    {
        for(String line; (line = br.readLine())!=null; )
            System.out.println(line);
    }

    public void run()
    {
        try{
            printFile(br);
        }
        catch(Exception e)
        {
            System.out.println(e.getMessage());
        }
    }

    public static void main(String args[]) throws Exception
    {
        double startTime = System.nanoTime();
        Threading t1 = new Threading("test1.txt");
        Threading t2 = new Threading("test2.txt");
        t1.join(); //waiting for t1 to finish
        t2.join(); //waiting for t2 to finish
        System.out.println(System.nanoTime() - startTime + "ns");
    }
}

And now the execution times are:
Single-threaded - 1459052.0ns
Multithreaded - 1768651.0ns

Why is the system behaving in an unnatural way?

Now, my questions are:

  1. Will increasing the number of threads reduce the execution time?
  2. When should one use multithreading in writing programs?
  3. Can the same concept be carried over from files to databases, where every thread reads a portion of the database based on a category (say news, sports, politics, etc.) and the results are finally combined? Is this feasible?
  4. Should multithreading be used only for CPU-bound programs?
Java Enthusiast
  • Your multithreaded program doesn't wait until the worker threads finish their jobs before reporting runtime. – user2357112 Apr 04 '15 at 04:38
  • There are so many pros and cons depending on the situation that I think you should read a book on multi-threading. – Prashant Apr 04 '15 at 05:02

2 Answers

I was curious to know the factor by which the speed was increased. I found it to be 0.23 approximately.

That is because your multi-threaded test is invalid. It doesn't actually measure the time taken by the threads. Instead it is just measuring the time to launch the threads.

The other test is invalid too. You are not taking account of JVM warmup effects ... and the amount of work that the test is doing is not enough to be indicative.

Another issue is that the time taken to read a file (on Linux for example) depends on whether the OS has already cached it. So if you run one of your tests and then run it again, you could well find that it is significantly faster the second time!
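To make the warm-up point concrete, here is a minimal sketch of a hand-rolled timing loop. It is only an illustration, not a proper benchmark harness: the class name `TimingSketch`, the helper `medianNanos`, the workload and the iteration counts are all assumptions of mine, and it does nothing about OS file caching.

```java
import java.util.Arrays;

class TimingSketch {

    // Written to by the workload so the JIT is less likely to discard the loop.
    static volatile long sink;

    /** Runs the task `warmups` times untimed, then `runs` times timed; returns the median in ns. */
    static long medianNanos(Runnable task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            task.run();                      // untimed: lets the JVM load classes and JIT-compile
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();  // nanoTime() returns a long; no need for a double
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2];            // the median is less sensitive to outliers than one run
    }

    public static void main(String[] args) {
        // Hypothetical workload, chosen only so there is something to measure.
        Runnable work = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i % 7;
            }
            sink = sum;
        };
        System.out.println(medianNanos(work, 20, 11) + " ns (median of 11 timed runs)");
    }
}
```

Even with a loop like this, a 20-line file read is probably too small to say anything about threading.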

And now the execution time are : Single Threaded - 1459052.0ns Multithreaded - 1768651.0ns

Why is the system behaving in an unnatural way?

This is actually what I would expect to happen ... for that version of the benchmark. It seems that the overhead of creating the two threads exceeds any (hypothetical) speedup due to reading using two threads.
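If you want a rough feel for that overhead on your own machine, a sketch like the following times how long it takes just to start and join threads that do nothing. The class and method names (`ThreadOverheadSketch`, `startAndJoinNanos`) are made up for this illustration, and the number it prints will vary from run to run.

```java
class ThreadOverheadSketch {

    /** Returns the nanoseconds needed to start and join `count` threads that do no work. */
    static long startAndJoinNanos(int count) {
        Thread[] threads = new Thread[count];
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            threads[i] = new Thread(() -> { /* intentionally empty */ });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();                    // wait, so that the whole thread lifetime is measured
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        startAndJoinNanos(2);                // one untimed warm-up round
        System.out.println("2 empty threads: " + startAndJoinNanos(2) + " ns");
    }
}
```

Whatever this reports is a cost the threaded version pays before it has read a single line.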


Your questions:

Q1. Will increasing the number of threads, reduce the execution time?

It may do. It depends on how many cores you have, whether the threads are CPU-bound or I/O-bound, whether there is contention over data structures or resources, et cetera.
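One way to see the dependence on core count is to run the same CPU-bound tasks on pools of different sizes. The sketch below assumes a made-up workload (`countPrimes`) and made-up names (`PoolSizeSketch`, `runNanos`); the timings it prints are not a rigorous benchmark.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PoolSizeSketch {

    /** Hypothetical CPU-bound task: counts the primes below `limit` by trial division. */
    static int countPrimes(int limit) {
        int count = 0;
        for (int n = 2; n < limit; n++) {
            boolean prime = true;
            for (int d = 2; (long) d * d <= n; d++) {
                if (n % d == 0) { prime = false; break; }
            }
            if (prime) count++;
        }
        return count;
    }

    /** Runs `tasks` copies of the task on a pool of `threads` threads; returns elapsed ns. */
    static long runNanos(int threads, int tasks, int limit) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Callable<Integer> task = () -> countPrimes(limit);
            List<Future<Integer>> futures = new ArrayList<>();
            long start = System.nanoTime();
            for (int i = 0; i < tasks; i++) {
                futures.add(pool.submit(task));
            }
            for (Future<Integer> f : futures) {
                f.get();                     // block until every task has finished
            }
            return System.nanoTime() - start;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("available processors: " + cores);
        System.out.println("1 thread:      " + runNanos(1, 8, 200_000) + " ns");
        System.out.println(cores + " thread(s):  " + runNanos(cores, 8, 200_000) + " ns");
        System.out.println((4 * cores) + " thread(s): " + runNanos(4 * cores, 8, 200_000) + " ns");
    }
}
```

For CPU-bound work like this, I would not expect threads beyond the number of cores to help much, but measure rather than assume.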

Q2. When should one use multi-threading in writing programs?

When performance is a concern, AND the problem can reasonably be divided into subtasks that can be performed in parallel. Also, for small problems, the overheads of setting up the threads can exceed any possible performance gains.
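As an illustration of "divide into subtasks, but not for small problems", here is a sketch that sums an array in chunks and falls back to a plain loop below a cut-off. The names (`ChunkedSumSketch`, `SEQUENTIAL_THRESHOLD`) and the cut-off value are assumptions; a sensible threshold has to be found by measurement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ChunkedSumSketch {

    // Hypothetical cut-off: below this size the thread set-up cost is assumed to dominate.
    static final int SEQUENTIAL_THRESHOLD = 100_000;

    static long sumRange(int[] data, int from, int to) {
        long sum = 0;
        for (int i = from; i < to; i++) {
            sum += data[i];
        }
        return sum;
    }

    static long sum(int[] data, int threads) {
        if (data.length < SEQUENTIAL_THRESHOLD || threads <= 1) {
            return sumRange(data, 0, data.length);       // small problem: no threads at all
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunk = (data.length + threads - 1) / threads;   // ceiling division
            List<Future<Long>> parts = new ArrayList<>();
            for (int from = 0; from < data.length; from += chunk) {
                final int lo = from;
                final int hi = Math.min(from + chunk, data.length);
                Callable<Long> task = () -> sumRange(data, lo, hi);  // independent subtask
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get();                     // combine the partial results
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        java.util.Arrays.fill(data, 1);
        System.out.println(sum(data, Runtime.getRuntime().availableProcessors()));
    }
}
```

The subtasks here share nothing but read-only input, which is what makes the split "reasonable"; tasks that contend for shared state are a different story.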

Q3. Can the same concept of file be ported to databases, where every thread reads a portion of the database based on the category say information on news, sports, politics, etc. will be read by the corresponding threads and finally the results will be clubbed together. Is this feasible?

Maybe.

However, your (invalid) tests are probably giving you a misleading idea of the benefits of multi-threading. In reality, anything involving reading or writing to a conventional (spinning) disc is limited by the fact that the drive effectively has a single read/write "head" and can only do one hardware-level read or write operation at a time. There are various tricks that an operating system or a database system can do to give the impression of faster performance, but if the application pushes hard enough, you hit that wall.

In short, there is only a limited amount of speedup that is theoretically possible.
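For what it is worth, the structure you describe (one task per category, results combined at the end) could be sketched as below. Everything here is hypothetical: `fetchCategory` is a stand-in that returns canned strings instead of running a real query, and a real version would typically need a separate database connection per thread. Whether it runs any faster is exactly the open question above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class CategoryFanOutSketch {

    /** Hypothetical stand-in for a per-category database query. */
    static List<String> fetchCategory(String category) {
        return Arrays.asList(category + " item 1", category + " item 2");
    }

    /** Starts one task per category and combines the results in the original order. */
    static Map<String, List<String>> fetchAll(List<String> categories) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, categories.size()));
        try {
            Map<String, Future<List<String>>> pending = new LinkedHashMap<>();
            for (String category : categories) {
                Callable<List<String>> query = () -> fetchCategory(category);
                pending.put(category, pool.submit(query));
            }
            Map<String, List<String>> combined = new LinkedHashMap<>();
            for (Map.Entry<String, Future<List<String>>> entry : pending.entrySet()) {
                combined.put(entry.getKey(), entry.getValue().get());  // waits for that category
            }
            return combined;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(Arrays.asList("news", "sports", "politics")));
    }
}
```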

Q4. Should multi-threading be used only for CPU bound programs?

No.

But that doesn't mean that multi-threading should be used for everything.

It shouldn't even be used for all CPU bound programs!

Simple generalizations do not apply. It is way more complicated.

Stephen C
  • Thank you so much for the answer. I am satisfied except for one thing. How do I measure the execution time in a multithreaded environment (or) see the performance? – Java Enthusiast Apr 04 '15 at 05:35
  • You can only see a speed-up if it actually exists. In this case, my gut feeling is that there *shouldn't be* any speedup. Certainly not in a benchmark that is as tiny as this one. – Stephen C Apr 04 '15 at 05:50
  • But if you are asking how to write a *valid* benchmark that will be able to measure a speedup *if it actually exists* ... you should start by reading https://stackoverflow.com/questions/504103 which explains the common mistakes that people make in Java benchmarks and how to avoid them. – Stephen C Aug 11 '21 at 00:56

Threading is a concept which assists in parallel execution. Very often the CPU sits idle, because it processes instructions far faster than a human, or for that matter a small code snippet, can supply them. When we introduce threads, we try to reduce the CPU's idle time by making sure that it always has enough instructions to execute.

Let's take an example:

In every transaction, there are some basic pre- and post-processing steps to follow around the core business logic (which is carried out by the CPU). During these activities the CPU is idle. With multi-threading we make sure that while one thread is actually being processed, the other thread's pre/post activities are executed at the same time, so the second task can be taken up for processing as soon as the processing of the first is over.
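The overlap described above can be sketched with a simulated wait. In this illustration the "pre/post step" is just a `Thread.sleep`, and the names (`OverlapSketch`, `transaction`) and durations are invented for the example; it is not a measurement of any real system.

```java
class OverlapSketch {

    /** Hypothetical transaction: a simulated wait (the pre/post step) plus a trivial computation. */
    static int transaction(int id, long waitMillis) {
        try {
            Thread.sleep(waitMillis);        // this thread waits; the CPU is free for other threads
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        }
        return id * id;                      // stand-in for the core logic
    }

    /** Runs the transactions one after another, so the waits add up. */
    static int[] runSequential(int count, long waitMillis) {
        int[] results = new int[count];
        for (int i = 0; i < count; i++) {
            results[i] = transaction(i, waitMillis);
        }
        return results;
    }

    /** Runs each transaction on its own thread, so the waits can overlap. */
    static int[] runThreaded(int count, long waitMillis) {
        int[] results = new int[count];
        Thread[] threads = new Thread[count];
        for (int i = 0; i < count; i++) {
            final int id = i;
            threads[i] = new Thread(() -> results[id] = transaction(id, waitMillis));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();                    // join also makes the written results visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted", e);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        runSequential(4, 100);
        System.out.println("sequential: " + (System.nanoTime() - start) / 1_000_000 + " ms");

        start = System.nanoTime();
        runThreaded(4, 100);
        System.out.println("threaded:   " + (System.nanoTime() - start) / 1_000_000 + " ms");
    }
}
```

The gain here comes from overlapping waits, not from extra CPU work being done, which is why it helps only when the tasks really do spend time waiting.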

Coming to your next question, we should implement multi-threading for basic functions and not core business logic, as it could have adverse effects.

Where possible, we try to introduce parallel execution for costly activities (those which take more time).

Saurabh Jhunjhunwala