
Consider these two pieces of code:

FileInputStream is = new FileInputStream(tmp);
byte[] buf = new byte[1024];
while (is.read(buf) > -1) {
}

and

BufferedInputStream is = new BufferedInputStream(new FileInputStream(tmp),1024);
while (is.read() > -1) {
}

From the BufferedInputStream source code, it looks as though both should take about the same time, but the first version actually runs much faster (166 ms vs 5159 ms on a 200 MB file). Why?

IQV
Cherokee
  • Because you're making `fileSize/1024` method invocations in the first case, but `fileSize` method invocations in the second case. You may not be doing anything different in terms of the actual IO operations, but invoking a method isn't free. – Andy Turner Mar 02 '18 at 08:34
  • Also, potentially, you're not doing the measurements of time in a way which gives meaningful numbers. – Andy Turner Mar 02 '18 at 08:36
  • Your test is invalid. Try running the same code in both tests apart from the File/BufferedInputStream. You're comparing apples and oranges. – user207421 Mar 02 '18 at 08:38
  • See also https://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java. – lexicore Mar 02 '18 at 08:41
  • Because `is.read()` reads **one** character while `is.read(buf)` reads **1024** bytes. Also `is.read()` is `synchronized` which might make a difference. – OldCurmudgeon Mar 02 '18 at 08:42
  • Actually, the BufferedInputStream instance helped me reduce the IO operations; it also made `fileSize/1024` method invocations in the second case. – Cherokee Mar 02 '18 at 08:45
  • No. Actually it made `filesize` method invocations. No two ways about it. It performed `filesize/8192` *system calls,* which is another issue altogether. – user207421 Mar 02 '18 at 08:47
  • You didn't ask it but it can be interesting: By using elements of java.nio package instead of java.io you can achieve quite serious performance improvements. – pcjuzer Mar 02 '18 at 08:49
  • @pcjuzer Your evidence or authority for that statement? NIO is more *scalable.* Not necessarily faster at all. The biggest performance gain I have ever seen via NIO file handling is 20%, which is hardly 'quite serious'. – user207421 Mar 02 '18 at 08:54
  • @EJP I have experience with this: I reimplemented a java.io solution with NIO and it became much faster (about 10x). It was about copying file content with GUI feedback. I guess it was faster because of the non-blocking nature of NIO. So yes, NIO is not faster in itself; it just lets other code run while I/O operations are in progress. – pcjuzer Mar 02 '18 at 09:17

1 Answer


FileInputStream#read(byte b[]) reads up to b.length bytes into b on every call, in this case up to 1024.

BufferedInputStream#read() returns one byte per call. Internally, BufferedInputStream uses a buffer (here of size 1024) to copy data from the stream it wraps, so the number of reads on the underlying stream is similar; however, you are still making about 1024 times as many method calls as you need to.
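One way to see the gap is to count calls. The following is only an illustrative sketch: `ReadCallCounter` and `CountingStream` are hypothetical names, and an in-memory `ByteArrayInputStream` stands in for the file. It counts how often the caller invokes `read()` and how often `BufferedInputStream` goes back to the wrapped stream to refill its buffer.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ReadCallCounter {

    /** Hypothetical helper: counts bulk reads that reach the wrapped stream. */
    static class CountingStream extends FilterInputStream {
        long bulkReads = 0;

        CountingStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            bulkReads++;
            return super.read(b, off, len);
        }
    }

    /** Returns {caller-side read() calls, bulk reads seen by the wrapped stream}. */
    static long[] countCalls(int dataSize, int bufferSize) {
        CountingStream source =
                new CountingStream(new ByteArrayInputStream(new byte[dataSize]));
        long callerReads = 0;
        try (BufferedInputStream in = new BufferedInputStream(source, bufferSize)) {
            int b;
            do {
                b = in.read();   // one method call per byte, plus one for end of stream
                callerReads++;
            } while (b > -1);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new long[] {callerReads, source.bulkReads};
    }

    public static void main(String[] args) {
        long[] counts = countCalls(1024 * 1024, 1024);
        System.out.println("caller read() calls:   " + counts[0]);
        System.out.println("refills of the buffer: " + counts[1]);
    }
}
```

For 1 MiB of data and a 1024-byte buffer, this should report roughly a million caller-side calls against roughly a thousand refills, which is the ratio the explanation above relies on.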

Try using the BufferedInputStream#read(byte b[]) method instead, and you should see speeds comparable to those of FileInputStream.
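A minimal sketch of that suggestion follows. The class `BulkReadDemo`, the helper `countBytes`, and the temporary-file fixture are hypothetical and exist only for the demonstration; the loop mirrors the first snippet from the question but goes through `BufferedInputStream`.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;

class BulkReadDemo {

    /** Reads the file in chunks of up to 1024 bytes and returns the total byte count. */
    static long countBytes(File file) {
        byte[] buf = new byte[1024];
        long total = 0;
        try (InputStream is = new BufferedInputStream(new FileInputStream(file), 1024)) {
            int n;
            // read(byte[]) returns the number of bytes copied, or -1 at end of stream
            while ((n = is.read(buf)) > -1) {
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    /** Hypothetical fixture: writes a zero-filled temporary file of the given size. */
    static File createTempFile(int size) {
        try {
            File file = File.createTempFile("bulk-read-demo", ".bin");
            file.deleteOnExit();
            Files.write(file.toPath(), new byte[size]);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File tmp = createTempFile(1024 * 1024);
        System.out.println("bytes read: " + countBytes(tmp));
    }
}
```

In the OpenJDK implementation, when the caller's array is at least as large as the internal buffer and no mark is set, `BufferedInputStream` passes the read straight through to the wrapped stream, so this loop does essentially the same work as the plain `FileInputStream` version.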

Also, as noted by OldCurmudgeon, the BufferedInputStream#read() method is synchronized, so every single-byte call also has to acquire and release a lock:

public synchronized int read() throws IOException {
    if (pos >= count) {
        fill();
        if (pos >= count)
            return -1;
    }
    return getBufIfOpen()[pos++] & 0xff;
}

To show you an example of how much overhead this can be, I made a small demo:

public class Main {
    // Number of loop iterations per test.
    static final double TEST_SIZE = 100000000.0;
    // Nanoseconds per second, used to convert the elapsed time.
    static final double BILLION = 1000000000.0;

    public static void main(String[] args) {
        testStandard();
        testSync();
    }

    // Baseline: an empty loop with no locking.
    static void testStandard() {
        long startTime = System.nanoTime();
        for (int i = 0; i < TEST_SIZE; i++) {
        }
        long endTime = System.nanoTime();
        System.out.println((endTime - startTime) / BILLION + " seconds");
    }

    // Same loop, but every iteration enters and leaves a synchronized block.
    static void testSync() {
        long startTime = System.nanoTime();
        for (int i = 0; i < TEST_SIZE; i++) {
            synchronized (Main.class) {}
        }
        long endTime = System.nanoTime();
        System.out.println((endTime - startTime) / BILLION + " seconds");
    }
}

On my computer, the loop with the synchronized block took around 40 times longer to execute:

0.13086644 seconds
4.90248797 seconds
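Timings like these are sensitive to JIT warm-up and dead-code elimination, as the comments on the question point out. A slightly more defensive sketch, still no substitute for a real benchmark harness such as JMH, repeats each measurement and makes every loop produce a value that is actually used. The class name `SyncOverheadSketch`, the lock object, and the iteration counts are arbitrary choices for illustration.

```java
class SyncOverheadSketch {
    static final int ITERATIONS = 10_000_000;
    static final Object LOCK = new Object();

    /** Sums 0..iterations-1 without any locking. */
    static long plainLoop(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += i;
        }
        return sum;
    }

    /** Same sum, but every iteration enters and leaves a synchronized block. */
    static long synchronizedLoop(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            synchronized (LOCK) {
                sum += i;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Early rounds include JIT compilation; later rounds are more representative.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            long plain = plainLoop(ITERATIONS);
            long t1 = System.nanoTime();
            long synced = synchronizedLoop(ITERATIONS);
            long t2 = System.nanoTime();
            System.out.printf("round %d: plain %.3f ms, synchronized %.3f ms (sums %d, %d)%n",
                    round, (t1 - t0) / 1e6, (t2 - t1) / 1e6, plain, synced);
        }
    }
}
```

The JIT may still simplify or coarsen the locking here, so the numbers are only a rough indication; the absolute ratio will vary by JVM version and hardware.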
flakes