Why are there different methods of dealing with File I/O in Java?

Question

So far I've been using Scanners to read data from text files. Example:

        File file = new File("path\\to\\file") ;

        Scanner scan = new Scanner(file) ;  

        System.out.println(scan.nextLine()) ;

And used FileWriters to write data into text files. Like this:

        try
        {
            FileWriter writer = new FileWriter("Foo.txt") ;
            writer.write("hello there!") ;                      
            writer.close() ;
        }
        catch(IOException ex) 
        {
            ex.printStackTrace() ;
        }

A few days ago I was at a meeting with my instructor. When I was examining his code, I noticed that he had used a BufferedReader and BufferedWriter - a method of reading and writing to files that I have not used before. Then I asked him what's the difference between using a BufferedReader and a Scanner to read data from a file. He could not explain it to me.

So I did a bit of research and found out that the classes that lie under the hood to perform these operations are the InputStream and OutputStream. These classes have their respective subclasses like FileInputStream, FileOutputStream, etc.

Further into my research I came across the Reader and Writer classes which are used to read data from and write data to files. Again, like the InputStream and OutputStream, these classes are abstract super classes and have their own subclasses to perform read and write operations.

I'm not confused about this but...why? I mean, why are there different methods of doing the same thing? What's the significance? And which method is the most efficient way of dealing with file inputs and outputs?

*Then I asked him what's the difference of using a BufferedReader than a Scanner* The short answer: about ten years. `BufferedReader` has been there since virtually year zero. `Scanner` really allows greater control of reading input, including pattern-matching, text-based type inference and so on. It has parsing functionality. `Reader`s read characters (as opposed to raw bytes) and `BufferedReader` can read lines of text (in addition to buffering input) — g00se, Aug 12 '21 at 09:22
@deHaar: that statement alone is not useful on such a question. First, because nio is not always the correct tool and second because it doesn't help someone who's trying to understand the overall layout of `java.io`. — Joachim Sauer, Aug 12 '21 at 09:36
@deHaar Not to mention that even if OP used NIO, they will likely still end up using `java.io.BufferedReader` (e.g. `Files#newBufferedReader(...)`) and/or other `java.io` classes. — Slaw, Aug 12 '21 at 10:17

Andy Turner · Answer 1 · 2021-08-12T10:07:20.007

Readers read chars; InputStreams read bytes. (Correspondingly, Writers write chars; OutputStreams write bytes).

Strings are sequences of chars, not bytes.

If you want to read bytes, you use an InputStream. If you're going to be reading them from a file, you use a FileInputStream; but not all sequences of bytes come from files, for example, ByteArrayInputStream allows you to read a sequence of bytes from a byte[]; but because it's an InputStream, it can be used in exactly the same way as if it came from a file.
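As a rough sketch of that last point: the hypothetical helper `countBytes` below (my name, not part of the JDK) only knows it has an InputStream, so it works the same way for an in-memory source as it would for a file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ByteSourceDemo {
    // Counts the bytes in any InputStream; the method neither knows nor
    // cares whether they come from a file, a byte[] or somewhere else.
    static long countBytes(InputStream in) {
        try {
            long count = 0;
            while (in.read() != -1) {   // read() returns -1 at end of stream
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = {104, 105, 33};   // the ASCII bytes for "hi!"
        System.out.println(countBytes(new ByteArrayInputStream(data))); // prints 3
        // A file would be passed in exactly the same way, e.g.
        // countBytes(new FileInputStream("Foo.txt")), assuming that file exists.
    }
}
```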

If you want to read chars, you use a Reader. If you want to read an InputStream as chars, you use an InputStreamReader, for which you specify a Charset, which allows the bytes to be converted to chars correctly.
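To illustrate why the charset matters, here is a small sketch (the `decode` helper is a made-up name for this example) that decodes the same two bytes with two different charsets and gets different chars out.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    // Reads all bytes through an InputStreamReader and returns the decoded chars.
    static String decode(byte[] bytes, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is two bytes in UTF-8: 0xC3 0xA9.
        byte[] utf8 = "\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(utf8, StandardCharsets.UTF_8).length());      // prints 1
        System.out.println(decode(utf8, StandardCharsets.ISO_8859_1).length()); // prints 2
    }
}
```

With the wrong charset each byte is turned into its own char, which is how you end up with garbled text.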

A BufferedReader is a Reader that buffers its input - it reads many chars from the underlying source at once, rather than one at a time. It is generally more efficient to read lots at once rather than just one, assuming you're going to need more than one. It also provides convenience methods to get a String instead of a char[].

To me, the canonical example of what BufferedReader allows you to do over and above a Reader is to read a whole line at once.
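A minimal sketch of that, using a StringReader as a stand-in for a file so it runs anywhere (`readLines` is just a name I picked for the example):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ReadLinesDemo {
    // Wraps any Reader in a BufferedReader and collects its lines.
    static List<String> readLines(Reader source) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            // readLine() returns null at end of input and strips the line terminator.
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(readLines(new StringReader("first\nsecond\n"))); // prints [first, second]
    }
}
```

A plain Reader only offers `read()` for single chars or char arrays, so without the wrapper you would have to find the line breaks yourself.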

Scanner is a high-level class which allows you to read data from a String (or from a Readable, which is implemented by Reader and BufferedReader, for example), and to get that data as a type other than String - for example, you could read 1 2.0 true as an int, double and boolean respectively, without having to do the parsing yourself.
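Roughly, the `1 2.0 true` case looks like the sketch below. The `describe` method is only there for illustration, and I set the locale explicitly because `nextDouble()` is locale-sensitive (some default locales expect a decimal comma).

```java
import java.util.Locale;
import java.util.Scanner;

class ScannerTokensDemo {
    // Reads an int, a double and a boolean from whitespace-separated tokens.
    static String describe(String input) {
        Scanner scan = new Scanner(input).useLocale(Locale.US);
        int i = scan.nextInt();
        double d = scan.nextDouble();
        boolean b = scan.nextBoolean();
        return "int=" + i + " double=" + d + " boolean=" + b;
    }

    public static void main(String[] args) {
        System.out.println(describe("1 2.0 true")); // prints int=1 double=2.0 boolean=true
    }
}
```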

This is just building on top of the functionality of other things, basically by reading internally from a Reader. Scanner is basically a tokenizer, although there's also the much older StringTokenizer.

Scanner is a pretty poor class, honestly: it's used frequently in basic programs (e.g. "Enter your name, enter your favorite color"); but it has sharp edges that catch many beginners out (e.g. Scanner is skipping nextLine() after using next() or nextFoo()?); and it's not-as-easy-as-you'd-think to do things like validation of user entering a number instead of a general string.
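The `nextLine()` sharp edge can be reproduced with a sketch like this one (the input string simulates a user typing a number, then a name; the method names are mine):

```java
import java.util.Scanner;

class NextLinePitfallDemo {
    // nextInt() stops right after the digits, so the following nextLine()
    // only returns what is left of the number's own line: an empty string.
    static String naive(String input) {
        Scanner scan = new Scanner(input);
        scan.nextInt();
        return scan.nextLine();
    }

    // Common workaround: consume the rest of the number's line first.
    static String fixed(String input) {
        Scanner scan = new Scanner(input);
        scan.nextInt();
        scan.nextLine();        // discard the leftover line terminator
        return scan.nextLine();
    }

    public static void main(String[] args) {
        String input = "42\nAlice\n";
        System.out.println("[" + naive(input) + "]"); // prints []
        System.out.println("[" + fixed(input) + "]"); // prints [Alice]
    }
}
```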

I find that you very quickly move on from using Scanner, and just use (Buffered)Reader to read everything as Strings: it's just more powerful.
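As a sketch of what that tends to look like, assuming you want the first line that is a valid int: read each line as a String and do the parsing and validation yourself. `firstValidInt` is a hypothetical helper, and the StringReader stands in for real user input.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.OptionalInt;

class LineValidationDemo {
    // Reads lines until one parses as an int; returns empty if none does.
    static OptionalInt firstValidInt(Reader source) {
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                try {
                    return OptionalInt.of(Integer.parseInt(line.trim()));
                } catch (NumberFormatException notANumber) {
                    // In an interactive program this is where you would re-prompt.
                }
            }
            return OptionalInt.empty();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Simulated input: two invalid lines, then a valid number.
        Reader input = new StringReader("abc\n12.5\n 17 \n");
        System.out.println(firstValidInt(input).getAsInt()); // prints 17
    }
}
```

Because every line is consumed whole, there is no leftover-token state to worry about between reads.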

Great answer, I always fail to explain why I discourage the use of `Scanner`, but this is a great summary. — Joachim Sauer, Aug 12 '21 at 09:35

Davide Lorenzo MARINO · Answer 2 · 2021-08-12T09:29:16.873

Java is full of partially overlapping methods and classes.

Generally speaking, there is no single solution that is the best in every situation. Having more alternatives gives you the ability to choose the right solution for the particular situation that you need to solve.

As an example java.util.List is an interface with many implementing classes:

ArrayList
LinkedList
Vector
AttributeList
CopyOnWriteArrayList
Stack

Each of them solves the same problem while focusing on a particular aspect of that problem. As an illustration, an ArrayList is a List implemented internally using arrays, while a LinkedList is a list where each item has a link to the previous and next element. If you need to insert / delete elements in the middle of a very big list it is better to use a LinkedList; if you need to read from the middle of the list an ArrayList is better.

In your particular case here is the difference:

Scanner:

A simple text scanner which can parse primitive types and strings using regular expressions. A Scanner breaks its input into tokens using a delimiter pattern, which by default matches whitespace. The resulting tokens may then be converted into values of different types using the various next methods.

It is useful if you need to split the input into tokens while reading.

BufferedReader:

Reads text from a character-input stream, buffering characters so as to provide for the efficient reading of characters, arrays, and lines.

It is useful for speeding up read operations because it uses a sort of cache (buffer) that holds characters instead of reading them one by one.

InputStreamReader:

An InputStreamReader is a bridge from byte streams to character streams: It reads bytes and decodes them into characters using a specified charset. The charset that it uses may be specified by name or may be given explicitly, or the platform's default charset may be accepted. Each invocation of one of an InputStreamReader's read() methods may cause one or more bytes to be read from the underlying byte-input stream. To enable the efficient conversion of bytes to characters, more bytes may be read ahead from the underlying stream than are necessary to satisfy the current read operation. For top efficiency, consider wrapping an InputStreamReader within a BufferedReader.

It works at a lower level than Scanner and, for efficiency, should be used together with a BufferedReader.
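Put together, the chain described in that quote could look something like this sketch. The temp file, its contents and the `firstLine` helper are all made up for the example, and UTF-8 is an assumption about the file's encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ReaderChainDemo {
    // bytes (InputStream) -> chars (InputStreamReader) -> buffered lines (BufferedReader)
    static String firstLine(InputStream bytes) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(bytes, StandardCharsets.UTF_8))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Write a small temporary file so the example is self-contained.
        Path file = Files.createTempFile("reader-chain", ".txt");
        Files.write(file, "hello there!\nsecond line\n".getBytes(StandardCharsets.UTF_8));
        try (InputStream in = Files.newInputStream(file)) {
            System.out.println(firstLine(in)); // prints hello there!
        } finally {
            Files.delete(file);
        }
    }
}
```

`Files.newBufferedReader(path, charset)` builds an equivalent reader in one call, if you do not need the intermediate stream.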

"If you need to insert / delete elements in the middle of a very big list it is better to use a LinkedList" not necessarily. Inserting/deleting in a `java.util.LinkedList` is quite inefficient because it has to walk the list in order to find the insertion point: that walking can be more expensive than shifting a block of memory in an `ArrayList`, because of the poor locality of reference between the nodes of the list. — Andy Turner, Aug 12 '21 at 09:42
@AndyTurner You are right, not necessarily, but if you are looping through the elements to decide whether to remove them or not, a LinkedList is better than an ArrayList — Davide Lorenzo MARINO, Aug 12 '21 at 09:49
You only get the benefits from a `LinkedList` if you're using a (List)Iterator. That allows you to remove (and set, with ListIterator) efficiently; but not add. _Maybe_ from clearing a sublist, I can't tell by skimming the code. — Andy Turner, Aug 12 '21 at 09:56
