The Java API of InputStream is what it is. Specifically, it has this method:
int read() throws IOException
which reads a single byte (it returns an int, so that it can return -1 to indicate EOF).
So, if you try to read a SINGLE BYTE from a file, it'll try to do that. In the case of a block device like a hard disk, that'll likely read the entire block and then chuck everything except that one byte. So if you call that read() method 8192 times, it reads the same block over and over, 8192 times, each time chucking away 8191 bytes and giving you just the one you want. That's 8192 × 8192 ≈ 67 million bytes read to deliver 8 KB of data. Ouch. Not very efficient.
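To make that concrete, here's a minimal sketch of the slow one-byte-at-a-time pattern (the file name is just a placeholder):

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SlowRead {
    public static void main(String[] args) throws IOException {
        try (InputStream in = new FileInputStream("data.bin")) { // placeholder file name
            int b;
            // Every call to read() requests exactly one byte; with an
            // unbuffered FileInputStream that means one OS-level read per byte.
            while ((b = in.read()) != -1) {
                // ... do something with b ...
            }
        }
    }
}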
Given that the kernel, CPU, disk, etc. all read in a block size of 8192, there is zero performance difference between a BufferedInputStream(new FileInputStream) and just the new FileInputStream, IF you use something like:
byte[] buffer = new byte[8192];
int n = in.read(buffer); // returns the number of bytes actually read, or -1 at EOF
Now even a plain jane unbuffered new FileInputStream ends up reading that block off of disk just once. BufferedInputStream does that 'under the hood' even if you use the single-byte form of read(), and will then feed you data from that byte array for the next 8191 calls to read(). That's all BufferedInputStream does.
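If you do want the convenience of single-byte read() calls, the usual fix is simply to wrap the stream; a minimal sketch (same placeholder file name as above):

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BufferedRead {
    public static void main(String[] args) throws IOException {
        try (InputStream in = new BufferedInputStream(new FileInputStream("data.bin"))) {
            int b;
            // The first read() fills the internal buffer from disk in one go;
            // subsequent calls are served from memory until the buffer runs dry.
            while ((b = in.read()) != -1) {
                // ... do something with b ...
            }
        }
    }
}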
If you are using the read() (one byte at a time) variant, or the byte-array variant of read() but with really small byte arrays, then BufferedInputStream makes sense. Otherwise, it does nothing and there is no need to put it in there.
NB: As far as I know, Java makes no guesses about what the disk's block size is and just uses some reasonable default buffer size (8192 bytes in current OpenJDK). The effect is the same: if you read a single byte at a time, wrapping your file stream in a buffered stream improves performance by a factor of 1000+; if you are using the byte-array variant, it makes no difference whatsoever.
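And if the default ever does matter, BufferedInputStream has a constructor overload that takes an explicit buffer size, so you can pick one yourself; the 64 KiB below is an arbitrary example, not a recommendation:

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BigBuffer {
    public static void main(String[] args) throws IOException {
        // The second constructor argument is the internal buffer size in bytes.
        try (InputStream in = new BufferedInputStream(new FileInputStream("data.bin"), 64 * 1024)) {
            System.out.println(in.read()); // first byte, or -1 if the file is empty
        }
    }
}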