While reading about file I/O in Java, I learned that there is more than one alternative for input and output operations.

These are:

  • BufferedReader and BufferedWriter
  • FileReader and FileWriter
  • FileInputStream and FileOutputStream
  • InputStreamReader and OutputStreamWriter
  • Scanner class

Which of these is the best alternative for handling text files? Which is the best alternative for serialization? What does Java NIO say about it?

Jonik
diegoaguilar
  • `nio` means simply "new I/O": https://en.wikipedia.org/wiki/New_I/O – Mark Whitaker Oct 17 '13 at 14:54
  • @MarkWhitaker that URL intrigues me. Is this page 1 level deeper than most wiki pages? – Cruncher Oct 17 '13 at 14:56
  • For the first question, BufferedReader and BufferedWriter are better if you're handling large amounts of data. – Kakalokia Oct 17 '13 at 14:56
  • The Stream ones are for binary files. The Reader and Writer are for text. – user2793390 Oct 17 '13 at 14:56
  • @Cruncher Yep, that's interesting. I think that once DNS figures out the address to send the request to, the software that handles the request on the server can do anything it pleases with the rest of the URL, and perhaps their server just doesn't handle slashes in the usual way, because slashes in encyclopedia entries are too common. There's a page for "Boeing F/A-18E/F Super Hornet", e.g. – ajb Oct 17 '13 at 15:55
  • Well, slashes as "directory separator" in URLs are purely a convention: The server can apply it or not, however it wants. – Joachim Sauer Oct 17 '13 at 16:05
  • For the absolute simplest ways to do text file IO in Java, use a 3rd party library such as [Guava](https://code.google.com/p/guava-libraries/). See for example [this answer in a related question](http://stackoverflow.com/questions/4908989/what-are-the-best-simplest-classes-used-for-reading-files-in-java/4909025#4909025). – Jonik Dec 29 '13 at 12:36

1 Answer

Two kinds of data

Generally speaking there are two "worlds":

  • binary data
  • text data

When it's a file (or a socket, or a BLOB in a DB, or ...), then it's always binary data first.

Some of that binary data can be treated as text data (which involves something called an "encoding" or "character encoding").

Binary Data

Whenever you want to handle binary data, you need to use the InputStream/OutputStream classes (generally, everything that contains Stream in its name).

That's why there's a FileInputStream and a FileOutputStream: those read from and write to files and they handle binary data.
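
As a minimal sketch of the above: the class below copies a file byte for byte with FileInputStream and FileOutputStream, never interpreting the bytes as text. The class name `BinaryCopyDemo`, the helper `copy`, and the temporary files are hypothetical and exist only for illustration.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical demo class; all names here are illustrative only.
final class BinaryCopyDemo {

    // Copies src to dst as raw bytes and returns the number of bytes copied.
    static long copy(Path src, Path dst) {
        try (InputStream in = new FileInputStream(src.toFile());
             OutputStream out = new FileOutputStream(dst.toFile())) {
            byte[] buffer = new byte[8192];
            long total = 0;
            int n;
            // read() returns -1 once the end of the stream is reached.
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes a few arbitrary bytes to a temp file, copies it,
    // and reports whether the copy is identical to the original.
    static boolean roundTrip() {
        try {
            Path src = Files.createTempFile("demo", ".bin");
            Path dst = Files.createTempFile("demo-copy", ".bin");
            byte[] data = {0, 1, 2, (byte) 0xFF, (byte) 0x80};
            Files.write(src, data);
            long copied = copy(src, dst);
            boolean same = copied == data.length
                    && Arrays.equals(data, Files.readAllBytes(dst));
            Files.delete(src);
            Files.delete(dst);
            return same;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("identical copy: " + roundTrip());
    }
}
```

Note that none of the bytes are valid text in any particular sense, which is exactly why only Stream classes appear here.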

Text Data

Whenever you want to handle text data, then you need to use the Reader/Writer classes.

Whenever you need to convert binary data to text (or vice versa), you need some kind of encoding. Common ones are UTF-8, UTF-16, ISO-8859-1 (and related ones), and the good old US-ASCII. "Luckily" the Java platform also has something called the "default platform encoding", which it will use whenever it needs one but the code doesn't specify one.

The platform default encoding is a double-edged sword, however:

  • it makes writing code easier, because you don't have to specify an encoding for each operation, but
  • it might not match the data you have: if the platform default encoding is ISO-8859-1 and the file you read is actually UTF-8, then you will get scrambled output!
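
To make the mismatch concrete, here is a small sketch that encodes a string as UTF-8 and then decodes the same bytes once with the right charset and once with ISO-8859-1. The class and method names are hypothetical, and the charsets are passed explicitly so the result does not depend on the machine's default.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names are illustrative only.
final class EncodingMismatchDemo {

    // Encodes text as UTF-8, then decodes the bytes with the wrong charset.
    static String decodeWrong(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Encodes text as UTF-8, then decodes the bytes with the matching charset.
    static String decodeRight(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "caf" followed by U+00E9 (e with acute accent), written as an
        // escape so the source file's own encoding does not matter.
        String original = "caf\u00e9";
        System.out.println("right charset: " + decodeRight(original));
        System.out.println("wrong charset: " + decodeWrong(original));
    }
}
```

The accented character occupies two bytes in UTF-8; read as ISO-8859-1, each of those bytes becomes its own character, so the four-character input comes back as five characters.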

For reading, we should also mention the BufferedReader which can be wrapped around any other Reader and adds the ability to handle whole lines at once.
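
A short sketch of line-by-line reading, assuming a hypothetical helper `readLines`. It wraps whatever Reader it is given in a BufferedReader; a StringReader stands in for a file here so the example runs on its own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; names are illustrative only.
final class LineReadingDemo {

    // Wraps any Reader in a BufferedReader and collects its lines.
    static List<String> readLines(Reader source) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            // readLine() strips the line terminator and returns null at end of input.
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Mixed line terminators are handled transparently.
        System.out.println(readLines(new StringReader("first\nsecond\r\nthird")));
    }
}
```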

Scanner is a special class that's meant to parse text input into tokens. It's most useful for structured text but often used on System.in to provide a very simple way to read data from stdin (i.e. from what the user inputs on the keyboard).
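
As a rough illustration of token-based parsing, the sketch below sums every integer token in a piece of text and skips everything else. The class and method names are hypothetical; the Scanner reads from a String here so the example is self-contained, but `new Scanner(System.in)` would be used the same way.

```java
import java.util.Scanner;

// Hypothetical demo class; names are illustrative only.
final class ScannerDemo {

    // Sums all whitespace-separated integer tokens, ignoring other tokens.
    static int sumInts(String text) {
        int sum = 0;
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNext()) {
                if (scanner.hasNextInt()) {
                    sum += scanner.nextInt();
                } else {
                    scanner.next(); // discard a non-integer token
                }
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumInts("apples 3 pears 4\nplums 5"));
    }
}
```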

Bridging the gap

Confusingly enough, there are classes that bridge those two worlds, and they generally have both parts in their names:

  • an InputStreamReader consumes an InputStream and is itself a Reader.
  • an OutputStreamWriter is a Writer and writes to an OutputStream.
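
A minimal sketch of both bridges, using in-memory byte streams so it runs without touching the file system. The class name and the `encode`/`decode` helpers are hypothetical, and UTF-8 is an arbitrary choice of encoding for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names are illustrative only.
final class BridgeDemo {

    // Text -> bytes: an OutputStreamWriter is a Writer that feeds an OutputStream.
    static byte[] encode(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Closing the writer above flushed all pending characters.
        return bytes.toByteArray();
    }

    // Bytes -> text: an InputStreamReader is a Reader that consumes an InputStream.
    static String decode(byte[] data) {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        byte[] data = encode("h\u00e9llo");
        System.out.println(data.length + " bytes for " + decode(data).length() + " chars");
    }
}
```

The encoding is named at the point where the two worlds meet, which is the only place it matters.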

And then there are "shortcut classes" that basically combine two other classes that are often combined.

  • a FileReader is basically a combination of a FileInputStream with an InputStreamReader
  • a FileWriter is basically a combination of a FileOutputStream with an OutputStreamWriter

Note that FileReader and FileWriter have a major drawback compared to their more complicated "hand-built" alternative: before Java 11 they always use the platform default encoding, which might not be what you want! (Java 11 added constructors that accept a Charset.)
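
The "hand-built" alternative could look roughly like the sketch below, which names the encoding explicitly instead of relying on the platform default. The class name, the helper methods, and the temporary file are hypothetical; UTF-8 is assumed only for the sake of the example.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names are illustrative only.
final class ExplicitEncodingFileDemo {

    // FileOutputStream (bytes) -> OutputStreamWriter (UTF-8 text) -> BufferedWriter.
    static void writeText(File file, String text) {
        try (BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(
                new FileOutputStream(file), StandardCharsets.UTF_8))) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // FileInputStream (bytes) -> InputStreamReader (UTF-8 text) -> BufferedReader.
    static String readFirstLine(File file) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new FileInputStream(file), StandardCharsets.UTF_8))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes text to a temp file and reads the first line back.
    static String roundTrip(String text) {
        try {
            File file = File.createTempFile("demo", ".txt");
            file.deleteOnExit();
            writeText(file, text);
            return readFirstLine(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("gr\u00fc\u00dfe").equals("gr\u00fc\u00dfe"));
    }
}
```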

What about serialization?

ObjectOutputStream and ObjectInputStream are special streams used for serialization.

As the names of the classes imply, serialization involves only binary data (even when serializing String objects), so you'll want to use *Stream classes exclusively. As long as you avoid any Reader/Writer classes, you should be fine.
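
A small sketch of a serialization round trip, using only Stream classes. The `Point` class and the `serialize`/`deserialize` helpers are hypothetical, and the bytes are kept in memory here; a FileOutputStream and FileInputStream could be substituted to persist them to a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical demo class; names are illustrative only.
final class SerializationDemo {

    // A simple serializable value type, invented for this example.
    static final class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Object -> bytes via ObjectOutputStream.
    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Bytes -> object via ObjectInputStream.
    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point copy = (Point) deserialize(serialize(new Point(3, 4)));
        System.out.println(copy.x + "," + copy.y);
    }
}
```

Even a String passed through `serialize` ends up as opaque bytes in the serialization format, not as readable text, so no Reader or Writer is involved at any point.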

Joachim Sauer
  • Thanks, could you see my edits and tell me about the changes, please? – diegoaguilar Oct 17 '13 at 15:04
  • @Diego: I'm sorry but no, that changes the scope of the question quite dramatically and should *probably* be moved into a separate question. – Joachim Sauer Oct 17 '13 at 15:05
  • Well, I accept your answer so far. You are right, though. I just wanted to know whether my serialization approach is a good one. Thanks! – diegoaguilar Oct 17 '13 at 15:06
  • @Diego: I've taken the liberty of rolling back your edit, because this way the question is more likely to be useful for future visitors. But your implementation looks good (at least from the perspective of choosing the right I/O classes, I didn't do an in-depth review). – Joachim Sauer Oct 17 '13 at 15:15
  • Thanks @Joachim, I accept and agree with the point of rolling back the edit. – diegoaguilar Oct 17 '13 at 16:55