I would like to compare Guava and Apache Commons IO for copying an InputStream to an OutputStream.

I compared the speed of the following two methods (two threads started at the same time using a CyclicBarrier):

  • org.apache.commons.io.IOUtils.copy
  • com.google.common.io.ByteStreams.copy

On a 4 GB file, the results are virtually identical, hence my question: I don't understand how that is possible unless the code behind both methods is exactly the same, or there is some other reason. To be honest, I suspect something is wrong with my Test class. Is there?

Result

all threads started
commonsIo perf:PT6M21.281S
Guava perf:PT6M21.282S

Code:

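import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

import com.google.common.io.ByteStreams;
import org.apache.commons.io.IOUtils;

public class Test {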
    public static void main(String... a) throws InterruptedException, BrokenBarrierException {
        final CyclicBarrier gate = new CyclicBarrier(3);

        Thread t1 = new Thread() {
            public void run() {
                try {
                    gate.await();
                    Instant start = Instant.now();
                    IOUtils.copy(Files.newInputStream(Paths.get("C:\\Users\\emil.brigand\\Downloads\\CentOS-6.4-x86_64-bin-DVD1.iso")), Files.newOutputStream(Paths.get("C:\\Users\\emil.brigand\\Downloads\\CentOS-6.4-x86_64-bin-DVD12.iso"), StandardOpenOption.CREATE_NEW, StandardOpenOption.DELETE_ON_CLOSE));
                    Instant end = Instant.now();
                    System.out.println("commonsIo perf:" + Duration.between(start, end));
                } catch (InterruptedException e) {
                    e.printStackTrace();
                } catch (BrokenBarrierException e) {
                    e.printStackTrace();
                } catch (IOException e) {
                    e.printStackTrace();
                }

            }
        };
        Thread t2 = new Thread() {
            public void run() {
                try {
                    gate.await();
                    Instant start = Instant.now();
                    ByteStreams.copy(Files.newInputStream(Paths.get("C:\\Users\\emil.brigand\\Downloads\\CentOS-6.4-x86_64-bin-DVD1.iso")), Files.newOutputStream(Paths.get("C:\\Users\\emil.brigand\\Downloads\\CentOS-6.4-x86_64-bin-DVD11.iso"), StandardOpenOption.CREATE_NEW, StandardOpenOption.DELETE_ON_CLOSE));
                    Instant end = Instant.now();
                    System.out.println("Guava perf:" + Duration.between(start, end));
                } catch (InterruptedException e) {
                    e.printStackTrace();
                } catch (BrokenBarrierException e) {
                    e.printStackTrace();
                } catch (IOException e) {
                    e.printStackTrace();
                }

            }
        };
        t1.start();
        t2.start();
        gate.await();
        System.out.println("all threads started");
    }
}
Emilien Brigand

1 Answer

Your benchmark is not very elaborate; you may want to consult “How do I write a correct micro-benchmark in Java?” for details, but in this case, it doesn’t matter.

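When copying files that large, there is indeed no relevant performance difference, as the I/O speed outweighs everything else. Since both threads run concurrently against the same disk, they also share the same bottleneck, which would explain why they finish at practically the same time. Further, resorting to InputStreams for a file copy operation is equally inefficient in both cases.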

Using streams implies that the data has to be copied from the I/O buffers into Java byte arrays and back. Since Java 1.4, there is an alternative, FileChannel.transferTo, which tells the underlying system to transfer bytes directly without copying them to Java byte arrays.
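As a rough sketch of what a channel-based copy could look like (the class name TransferToCopy and the method copy are made up for illustration; the loop is there because transferTo is allowed to transfer fewer bytes than requested):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of any library.
class TransferToCopy {

    /** Copies source to target through FileChannel.transferTo and returns the number of bytes copied. */
    static long copy(Path source, Path target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long position = 0;
            while (position < size) {
                // A single call may move fewer bytes than requested, so keep going.
                long transferred = in.transferTo(position, size - position, out);
                if (transferred <= 0) {
                    break; // nothing more to transfer, e.g. the source shrank meanwhile
                }
                position += transferred;
            }
            return position;
        }
    }
}
```

Calling TransferToCopy.copy(Paths.get("in.iso"), Paths.get("out.iso")) with your own paths would copy the file without a byte array in your code. Whether the operating system really performs a direct transfer depends on the platform.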

But since you are using Path and Files, you are using at least Java 7 and thus can simply use Files.copy(Path,Path,…) without any third-party library and without the detour through the InputStream/OutputStream API. Of course, it will utilize the direct copying capability of the NIO API.
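A minimal sketch of that approach, assuming you want to overwrite an existing target (the class name FilesCopyExample and the method names are hypothetical; REPLACE_EXISTING is my choice here, not something your test requires):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.time.Duration;

// Hypothetical helper, not part of any library.
class FilesCopyExample {

    /** Copies source to target, replacing the target if it already exists. */
    static Path copy(Path source, Path target) throws IOException {
        return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }

    /** Same copy, but returns the elapsed wall-clock time of the single call. */
    static Duration timedCopy(Path source, Path target) throws IOException {
        long start = System.nanoTime();
        copy(source, target);
        return Duration.ofNanos(System.nanoTime() - start);
    }
}
```

The timedCopy method only measures one run, so it says nothing reliable about small differences; it is merely meant to be comparable to the timing in your test.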

But it’s possible that even with this API, the elapsed time doesn’t change since, as noted, the I/O speed outweighs everything else. You can’t make hard drives, etc. faster by rearranging the program code. And compared to the CPU speed, this hardware is awfully slow, which allows the CPU to compensate for a lot of inefficiencies in the software, if there are any.

Still, Files.copy is simpler to use and eliminates the dependency on third-party libraries, so there is no point in comparing the performance of these libraries for file copying…

Holger
  • I forgot about Files.copy, so files were a bad example for comparing those third-party libraries. It's just that we use IOUtils.copy a lot, not for file InputStreams or OutputStreams but for other InputStreams and OutputStreams, and I recently came across the Guava library and was interested in how it performs... I will look at your link about benchmarking... – Emilien Brigand Sep 10 '15 at 10:26
  • Still, if you have streams, you had better [create channels for the stream](http://docs.oracle.com/javase/8/docs/api/java/nio/channels/Channels.html#newChannel-java.io.InputStream-) rather than the other way round. So you may still benefit if at least one side is a file. Also note [`Files.copy(InputStream, Path)`](http://docs.oracle.com/javase/8/docs/api/java/nio/file/Files.html#copy-java.io.InputStream-java.nio.file.Path-java.nio.file.CopyOption...-) and [`Files.copy(Path,OutputStream)`](http://docs.oracle.com/javase/8/docs/api/java/nio/file/Files.html#copy-java.nio.file.Path-java.io.OutputStream-) – Holger Sep 10 '15 at 10:43
  • OK, I did not know about channels, I will look at that! – Emilien Brigand Sep 10 '15 at 11:05
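
For reference, the suggestion from the comments (wrapping an arbitrary InputStream in a channel when the other side is a file) might be sketched as follows. The class name StreamToFile, the method names, and the 1 MiB chunk size are assumptions made for illustration only:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of any library.
class StreamToFile {

    /** Writes the stream to target via Channels.newChannel and FileChannel.transferFrom. */
    static long viaChannel(InputStream in, Path target) throws IOException {
        // Closing the channel also closes the wrapped stream.
        try (ReadableByteChannel source = Channels.newChannel(in);
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = 0;
            long transferred;
            // transferFrom returns 0 once the source has no more bytes to offer.
            while ((transferred = out.transferFrom(source, position, 1 << 20)) > 0) {
                position += transferred;
            }
            return position;
        }
    }

    /** The simpler alternative: let Files.copy(InputStream, Path, ...) do the work. */
    static long viaFilesCopy(InputStream in, Path target) throws IOException {
        return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Both methods return the number of bytes written. Note that viaFilesCopy leaves the stream open, so the caller still has to close it.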