
I am working with the I/O classes in Java. I understand that there are two important types of streams: byte streams and character streams. But I tried reading and writing a text file with a byte stream, and it worked. Here is the code:

    File klasor = new File("C:\\Java");
    if(!klasor.exists()) klasor.mkdirs();

    File kaynakDosya = new File("C:\\Java\\kaynak.txt");
    if(!kaynakDosya.exists()) kaynakDosya.createNewFile();

    File hedefDosya = new File("C:\\Java\\hedef.txt");
    if(!hedefDosya.exists()) hedefDosya.createNewFile();

    // kaynak = source, hedef = target.
    // try-with-resources closes both streams even if the copy throws.
    try (FileInputStream kaynak = new FileInputStream(kaynakDosya);
         FileOutputStream hedef = new FileOutputStream(hedefDosya)) {
        int c;
        // read() returns the next byte as an int in 0..255, or -1 at end of stream
        while((c = kaynak.read()) != -1) {
            hedef.write(c);
        }
    }

And then I did the same with a character stream:

    File klasor = new File("C:\\Java");
    if(!klasor.exists()) klasor.mkdirs();

    File kaynakDosya = new File("C:\\Java\\kaynak.txt");
    if(!kaynakDosya.exists()) kaynakDosya.createNewFile();

    File hedefDosya = new File("C:\\Java\\hedef.txt");
    if(!hedefDosya.exists()) hedefDosya.createNewFile();

    // kaynak = source, hedef = target.
    // try-with-resources closes both streams even if the copy throws.
    try (FileReader kaynak = new FileReader(kaynakDosya);
         FileWriter hedef = new FileWriter(hedefDosya)) {
        int c;
        // read() returns the next char as an int in 0..65535, or -1 at end of stream
        while((c = kaynak.read()) != -1) {
            hedef.write(c);
        }
    }

These two produced the same result. So I want to know: why should I use a character stream here rather than a byte stream? (I have read some articles as well as related questions here on Stack Overflow, and they say so.) I know that a character stream reads the file character by character, but what advantage does this give me? And what problems could occur if I read characters using a byte stream? I hope my question is clear. I would appreciate real-world examples.

2 Answers


Writing characters to a byte-oriented output stream (or reading characters from a byte-oriented input stream) produces the same results as using character-oriented streams only if all the characters in the stream can be represented by single bytes in your platform's default encoding (usually UTF-8, but it could be something else). To test this, try a file containing text that requires more than one byte per character, such as Greek, Cyrillic, or Arabic. With a byte-oriented stream, each read() returns only one byte of such a character, so treating the bytes as characters won't work. With a character-oriented stream, the characters are preserved as long as both streams use encodings that support those characters (such as UTF-8) and the input file was stored in the encoding used for the input stream.
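A quick way to see which characters fit in a single byte is to count the bytes each one occupies in UTF-8. The sketch below is only an illustration: the names `Utf8ByteCounts` and `utf8Length` are made up for this example, and the Cyrillic text is written with Unicode escapes so that the encoding of the source file itself doesn't matter.

```java
import java.nio.charset.StandardCharsets;

class Utf8ByteCounts {
    // Number of bytes the given text occupies when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String privet = "\u041F\u0440\u0438\u0432\u0435\u0442!"; // "Привет!"

        System.out.println(utf8Length("A"));      // 1: ASCII fits in one byte
        System.out.println(utf8Length("\u041F")); // 2: the Cyrillic letter П needs two
        System.out.println(utf8Length(privet));   // 13: six 2-byte letters plus '!'
    }
}
```

So a 7-character string arrives as 13 separate values when read one byte at a time.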

Note that your byte-oriented code isn't actually testing this, since it just copies the file byte-for-byte. Everything will look like it's working, but if you tried to read the actual characters (say, to compare them to text in your code), it would fail. To test this, create a file (in, say, UTF-8 encoding) containing the Cyrillic text "Привет!". Then, in code, try reading that text into a String using a byte-oriented input stream and check whether it actually contains what you expect using

    System.out.println("Success: " + "Привет!".equals(input));
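One possible way to run that experiment end to end is sketched below. It assumes the "byte-oriented" read means turning each byte directly into a char, which is the naive approach; the class name `ReadCyrillicDemo` and its helper methods are hypothetical, and a temporary file stands in for the test file.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ReadCyrillicDemo {
    // Naive approach: treat every byte from the stream as one character.
    static String readBytesAsChars(Path file) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (InputStream in = new FileInputStream(file.toFile())) {
            int b;
            while ((b = in.read()) != -1) {
                sb.append((char) b); // wrong for any multi-byte character
            }
        }
        return sb.toString();
    }

    // Character stream with an explicit charset: bytes are decoded into chars.
    static String readWithReader(Path file) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader in = new InputStreamReader(
                new FileInputStream(file.toFile()), StandardCharsets.UTF_8)) {
            int c;
            while ((c = in.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    // Writes text to a temp file as UTF-8, then reads it back both ways.
    // Returns { result of the byte-based read, result of the Reader-based read }.
    static String[] roundTrip(String text) {
        try {
            Path file = Files.createTempFile("privet", ".txt");
            try {
                Files.write(file, text.getBytes(StandardCharsets.UTF_8));
                return new String[] { readBytesAsChars(file), readWithReader(file) };
            } finally {
                Files.delete(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String expected = "\u041F\u0440\u0438\u0432\u0435\u0442!"; // "Привет!"
        String[] results = roundTrip(expected);
        System.out.println("Byte stream success: " + expected.equals(results[0])); // false
        System.out.println("Reader success: " + expected.equals(results[1]));      // true
    }
}
```

The byte-based read yields 13 chars instead of 7, so the comparison fails even though a byte-for-byte copy of the same file would look fine in a text editor.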
Ted Hopp
I am confused: aren't all the characters represented by two bytes? What do you mean by "if the all the characters in the file can be represented by single bytes"? I have tried it with Cyrillic "Привет!" and it has worked, too! Plus, I have kept encoding of the source file "UTF-8" and changed encoding of the target file to "ANSI", it worked anyway. – Oct 22 '17 at 14:08

@AdemTepe - In UTF-8, code points up to 0x7F are represented by a single byte. (See [this thread](https://stackoverflow.com/questions/7136421/why-does-utf-8-use-more-than-one-byte-to-represent-some-characters), for instance.) Your byte-oriented code works fine for just copying a file byte-for-byte, but this doesn't address what would happen if you tried to interpret those bytes as characters (on input) or if you tried to write characters to a byte-oriented stream. I'll update my answer to clarify this point. – Ted Hopp Oct 22 '17 at 15:24


The java.io.FileInputStream Javadoc states:

FileInputStream is meant for reading streams of raw bytes such as image data. For reading streams of characters, consider using FileReader.

The java.io.FileOutputStream Javadoc says much the same:

FileOutputStream is meant for writing streams of raw bytes such as image data. For writing streams of characters, consider using FileWriter.

One of the main differences between FileInputStream/FileOutputStream and FileReader/FileWriter is that the former provide methods to manipulate bytes while the latter provide methods to manipulate characters.

In your example, since you are only copying one file's contents into another, manipulating chars or bytes doesn't make a big difference.
In fact, for a plain copy like yours, a FileInputStream or a BufferedInputStream seems even more appropriate.
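As a rough sketch of that byte-oriented copy, the question's read/write loop can be kept as is and simply wrapped in buffered streams. The class name `BufferedByteCopy` and the `copyPreservesBytes` helper are hypothetical and exist only to make the example self-contained; temporary files stand in for kaynak.txt and hedef.txt.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class BufferedByteCopy {
    // Copies source to target one byte at a time. The buffered wrappers keep
    // the per-byte read()/write() calls from going to the file each time.
    // No decoding happens, so the text survives whatever its encoding is.
    static void copy(Path source, Path target) throws IOException {
        try (InputStream in = new BufferedInputStream(new FileInputStream(source.toFile()));
             OutputStream out = new BufferedOutputStream(new FileOutputStream(target.toFile()))) {
            int b;
            while ((b = in.read()) != -1) {
                out.write(b);
            }
        }
    }

    // Hypothetical helper: writes data to a temp file, copies it, and reports
    // whether the copy is byte-identical to the original.
    static boolean copyPreservesBytes(byte[] data) {
        try {
            Path source = Files.createTempFile("kaynak", ".txt");
            Path target = Files.createTempFile("hedef", ".txt");
            try {
                Files.write(source, data);
                copy(source, target);
                return Arrays.equals(data, Files.readAllBytes(target));
            } finally {
                Files.delete(source);
                Files.delete(target);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "\u041F\u0440\u0438\u0432\u0435\u0442!".getBytes(StandardCharsets.UTF_8);
        System.out.println("Identical copy: " + copyPreservesBytes(data)); // true
    }
}
```

If nothing needs to be done with the bytes along the way, `java.nio.file.Files.copy` does the same job in a single call.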

But if you use a stream to read characters into String instances or write them out from Strings, FileReader/FileWriter really make things easier and clearer.
Besides, you can wrap FileReader/FileWriter in a BufferedReader/BufferedWriter and benefit from efficient reading and writing of characters, arrays, and lines.

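    // Write text through a buffered character stream.
    // try-with-resources flushes and closes the writer before the file is read back.
    try (BufferedWriter writer = new BufferedWriter(new FileWriter("myfile"))) {
        writer.append(oneString);        // a String
        writer.append(oneStringBuffer);  // any CharSequence, e.g. a StringBuffer
        writer.newLine();
    }

    // Read the text back one line at a time.
    try (BufferedReader reader = new BufferedReader(new FileReader("myfile"))) {
        String currentLine = reader.readLine();
    }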
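One caveat with the snippet above: the FileReader and FileWriter constructors that take no charset use the default charset, so the result can depend on the platform. A hedged variant that names the encoding explicitly could look like the following; `ExplicitCharsetDemo` and `writeThenReadLine` are hypothetical names, and a temporary file replaces "myfile".

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ExplicitCharsetDemo {
    // Writes one line as UTF-8 and reads it back as UTF-8.
    static String writeThenReadLine(String line) {
        try {
            Path file = Files.createTempFile("charset-demo", ".txt");
            try {
                // The writer is closed (and therefore flushed) before reading starts.
                try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
                    writer.append(line);
                    writer.newLine();
                }
                try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
                    return reader.readLine(); // line terminator is stripped
                }
            } finally {
                Files.delete(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String privet = "\u041F\u0440\u0438\u0432\u0435\u0442!"; // "Привет!"
        System.out.println("Round trip ok: " + privet.equals(writeThenReadLine(privet))); // true
    }
}
```

Because both sides name UTF-8, the round trip doesn't rely on how the machine happens to be configured.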
davidxxx