
Java offers several classes for writing data to streams.
In the sample code below I compare them by running time.
Can somebody explain the results? Thanks.
Here is the code:

import java.io.FileOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintStream;
import java.io.PrintWriter;

public class IOtests
{

public static void main(String[] args) throws Exception
{
    char[] chars = new char[100];
    byte[] bytes = new byte[100];
    for (int i = 0; i < 100; i++)
    {
        chars[i] = (char) i;
        bytes[i] = (byte) i;
    }
    OutputStreamWriter out = new OutputStreamWriter(new FileOutputStream(
            "output.txt"));
    long a = System.currentTimeMillis();
    for (int i = 0; i < 100000; i++)
        for (char j : chars)
            out.write(j);
    System.out.println("OutputStreamWriter writing characters: "
            + (System.currentTimeMillis() - a));
    out = new OutputStreamWriter(new FileOutputStream("output.txt"));
    a = System.currentTimeMillis();
    for (int i = 0; i < 100000; i++)
        for (byte j : bytes)
            out.write(j);
    System.out.println("OutputStreamWriter writing bytes: "
            + (System.currentTimeMillis() - a));
    PrintStream out1 = new PrintStream("output.txt");
    a = System.currentTimeMillis();
    for (int i = 0; i < 100000; i++)
        for (char j : chars)
            out1.write(j);
    System.out.println("PrintStream writing characters: "
            + (System.currentTimeMillis() - a));
    out1 = new PrintStream("output.txt");
    a = System.currentTimeMillis();
    for (int i = 0; i < 100000; i++)
        for (byte j : bytes)
            out1.write(j);
    System.out.println("PrintStream writing bytes: "
            + (System.currentTimeMillis() - a));
    PrintWriter out2 = new PrintWriter("output.txt");
    a = System.currentTimeMillis();
    for (int i = 0; i < 100000; i++)
        for (char j : chars)
            out2.write(j);
    System.out.println("PrintWriter writing characters: "
            + (System.currentTimeMillis() - a));
    out1 = new PrintStream("output.txt");
    a = System.currentTimeMillis();
    for (int i = 0; i < 100000; i++)
        for (byte j : bytes)
            out2.write(j);
    System.out.println("PrintWriter writing bytes: "
            + (System.currentTimeMillis() - a));
}

}

Results:

OutputStreamWriter writing characters: 4141
OutputStreamWriter writing bytes: 3546
PrintStream writing characters: 86516
PrintStream writing bytes: 70484
PrintWriter writing characters: 938
PrintWriter writing bytes: 2484

Note that all times are in milliseconds.

g00glen00b
Alireza Mohamadi
  • Given that you never *close* the output, it could all be buffered. Additionally, you're not giving any JIT warmup, not performing any garbage collection etc. Oh, and "PrintWriter writing bytes" is a misnomer given that it only writes *characters*. You've just got an implicit byte to int conversion. Besides, writing a single byte or character at a time is unrealistic in most sensible code - you'd use the overloads taking `byte[]`, `char[]` or `String`. – Jon Skeet Aug 25 '13 at 16:37
  • Your timing numbers are highly suspect. See [this thread](http://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java) for info on how to write a correct benchmark in Java. – Ted Hopp Aug 25 '13 at 16:38
  • @TedHopp With all that advice taken, writing to a file on disk is itself an unpredictable process. If on a *nix, one could use `/dev/null` as a relatively predictable sink; on Windows, god knows what. – Marko Topolnik Aug 25 '13 at 16:41
  • In answer to the comments: the system I am running this code on is very old, which is the reason for the high numbers. I just want to compare the classes and don't want to get too technical about precise benchmarking. Run this code on your own system and you will see that the proportions stay almost the same over several runs. – Alireza Mohamadi Aug 25 '13 at 16:51
  • @MarkoTopolnik - True. OP could use an in-memory output sink, or (better) a [`NullOutputStream`](http://commons.apache.org/proper/commons-io/apidocs/org/apache/commons/io/output/NullOutputStream.html) from Apache Commons. – Ted Hopp Aug 25 '13 at 16:55

1 Answer


I've reduced your question to its essence:

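import java.io.FileOutputStream;
import java.io.FileWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.io.Writer;

public class Test {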
  static byte[] bytes = new byte[10_000_000];
  static {
    for (int i = 0; i < bytes.length; i++) bytes[i] = (byte) (i%100+32);
  }
  public static void main(String[] args) throws Exception {
    writer(true);
    writer(false);
    stream(true);
    stream(false);
  }

  static void writer(boolean flush) throws IOException {
    Writer out = new FileWriter("output.txt");
    long a = System.currentTimeMillis();
    for (byte j : bytes) {
      out.write(j);
      if (flush) out.flush();
    }
    out.close();
    System.out.println("FileWriter with" + (flush? "":"out") + " flushing: " +
        (System.currentTimeMillis() - a));
  }
  static void stream(boolean flush) throws IOException {
    OutputStream out = new FileOutputStream("output.txt");
    long a = System.currentTimeMillis();
    for (byte j : bytes) {
      out.write(j);
      if (flush) out.flush();
    }
    out.close();
    System.out.println("FileOutputStream with" + (flush? "":"out") + " flushing: " +
        (System.currentTimeMillis() - a));
  }
}

Notes:

  • properly closing the resources when done;
  • double loop replaced by single loop, but a larger array;
  • avoid writing control characters to evade autoflush behavior;
  • using only a byte array, since every case tests the same method, write(int), so it makes no difference whether you use bytes or chars;
  • removed everything except a FileWriter and a FileOutputStream because all other cases boil down to these two;
  • testing both writer and output stream in two modes: flush after each write, and don't flush at all until close.

Now, when you run this, you'll get output like the following:

FileWriter with flushing: 28235
FileWriter without flushing: 828
FileOutputStream with flushing: 23984
FileOutputStream without flushing: 23641

So, what's the lesson?

  • all the writers tested here are buffered, because internally they delegate to StreamEncoder, which maintains its own buffer;
  • FileOutputStream is not buffered;
  • non-buffered writing byte-by-byte is very slow.
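The buffering difference can also be observed without touching the disk. The following is a minimal sketch, not part of the original benchmark: `CountingSink` and `WriterBufferingSketch` are hypothetical names, and the sink simply counts how often the writer hands data down to it. The exact number of calls in the unflushed case depends on the JDK's internal buffer size, so it is printed rather than assumed.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class WriterBufferingSketch {

    // Hypothetical in-memory sink: discards the data, counts the calls.
    static final class CountingSink extends OutputStream {
        long calls;
        long bytes;

        @Override
        public void write(int b) {
            calls++;
            bytes++;
        }

        @Override
        public void write(byte[] buf, int off, int len) {
            calls++;
            bytes += len;
        }
    }

    // Writes `chars` ASCII characters one at a time through an
    // OutputStreamWriter, optionally flushing after every character.
    static CountingSink run(int chars, boolean flush) throws IOException {
        CountingSink sink = new CountingSink();
        try (Writer out = new OutputStreamWriter(sink, StandardCharsets.US_ASCII)) {
            for (int i = 0; i < chars; i++) {
                out.write('a' + i % 26);
                if (flush) out.flush();
            }
        }
        return sink;
    }

    public static void main(String[] args) throws IOException {
        CountingSink flushed = run(100_000, true);
        CountingSink unflushed = run(100_000, false);
        System.out.println("with flushing:    " + flushed.calls + " calls, " + flushed.bytes + " bytes");
        System.out.println("without flushing: " + unflushed.calls + " calls, " + unflushed.bytes + " bytes");
    }
}
```

Both runs deliver the same number of bytes; the unflushed run should reach the sink in far fewer calls, which is where the time is saved when each call is a system call.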

Good practice demands that you always do buffered writing: either use buffered sinks or maintain an explicit buffer on your side.
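As a rough illustration of both options, here is a sketch that assumes a temporary file as the target. The class and method names (`BufferedWriteSketch`, `writeBuffered`, `writeBulk`) are made up for this example. The first method keeps the byte-by-byte loop but puts a `BufferedOutputStream` in front of the `FileOutputStream`; the second treats the array itself as the explicit buffer and hands it over in one call.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class BufferedWriteSketch {

    // Buffered sink: single-byte writes land in an in-memory buffer and
    // reach the file only when the buffer fills up or the stream is closed.
    static void writeBuffered(File target, byte[] data) throws IOException {
        try (OutputStream out = new BufferedOutputStream(new FileOutputStream(target))) {
            for (byte b : data) {
                out.write(b);
            }
        } // try-with-resources flushes and closes the stream
    }

    // Explicit buffer: the caller already holds the data in an array,
    // so it is passed to the unbuffered stream in a single call.
    static void writeBulk(File target, byte[] data) throws IOException {
        try (OutputStream out = new FileOutputStream(target)) {
            out.write(data);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[1_000_000];
        for (int i = 0; i < data.length; i++) data[i] = (byte) (i % 100 + 32);

        File file = File.createTempFile("buffered-sketch", ".bin");
        try {
            long start = System.currentTimeMillis();
            writeBuffered(file, data);
            System.out.println("buffered, byte by byte: " + (System.currentTimeMillis() - start) + " ms");

            start = System.currentTimeMillis();
            writeBulk(file, data);
            System.out.println("single bulk write: " + (System.currentTimeMillis() - start) + " ms");
        } finally {
            file.delete();
        }
    }
}
```

The timings printed here are subject to the same benchmarking caveats raised in the comments, so treat them as an indication only.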

Marko Topolnik
  • I really can't understand the differences between a PrintWriter, a PrintStream and an OutputStreamWriter. Using a file was just an example: I needed to write a large number of characters/bytes to compare throughput, and a file seemed an appropriate sink for a stream. So, setting aside the fact that writing to a file without a buffer is slow, assume that I'm writing to a terminal. Which tool should I use? – Alireza Mohamadi Aug 26 '13 at 07:09
  • It doesn't matter what the sink is: unbuffered I/O is slow because it involves one system call per byte. The only case where these issues don't apply is when you are writing to some "virtual" stream that doesn't involve any actions outside the JVM itself, for example `StringWriter` or `ByteArrayOutputStream`. – Marko Topolnik Aug 26 '13 at 08:32
  • As for `PrintWriter` vs. `PrintStream`, the latter is a legacy class. You should always prefer a `Writer` when writing character data, and an `OutputStream` when writing binary data. – Marko Topolnik Aug 26 '13 at 08:32
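The character/binary distinction shows up even with the single `write(int)` method used throughout this thread. Below is a small sketch using an in-memory `ByteArrayOutputStream` as the sink. The class name `WriterVsStreamSketch` and the choice of UTF-8 are assumptions made for illustration. A `Writer` interprets the argument as a character and encodes it, while an `OutputStream` emits its low eight bits as one raw byte.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class WriterVsStreamSketch {

    // Writer.write(int): the value is treated as a char and encoded
    // with the writer's charset (UTF-8 here, chosen for the example).
    static byte[] viaWriter(int value) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(sink, StandardCharsets.UTF_8)) {
            out.write(value);
        }
        return sink.toByteArray();
    }

    // OutputStream.write(int): the low 8 bits are written as one raw byte.
    static byte[] viaStream(int value) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        sink.write(value);
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        int value = 0xE9; // the character 'é' (U+00E9)
        System.out.println("Writer:       " + viaWriter(value).length + " byte(s)"); // 2 in UTF-8
        System.out.println("OutputStream: " + viaStream(value).length + " byte(s)"); // 1
    }
}
```

For plain ASCII values the two produce identical output, which is why the original benchmark could not tell them apart by content.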