
I've got the following code, which reads a directory and compresses it into a tar.gz archive. When I deploy it to the server and test it with batches of files, it works on the first few test batches, but after the 4th or 5th batch it starts consistently throwing java.lang.OutOfMemoryError: Direct buffer memory, even though the batch size stays the same and heap usage looks fine. Here's the code:

public static void compressDirectory(String archiveDirectoryToCompress) throws IOException {
  Path archiveToCompress = Files.createFile(Paths.get(archiveDirectoryToCompress + ".tar.gz"));

  try (GzipCompressorOutputStream gzipCompressorOutputStream = new GzipCompressorOutputStream(
           Files.newOutputStream(archiveToCompress));
       TarArchiveOutputStream tarArchiveOutputStream = new TarArchiveOutputStream(gzipCompressorOutputStream)) {
    Path directory = Paths.get(archiveDirectoryToCompress);
    Files.walk(directory)
        .filter(path -> !Files.isDirectory(path))
        .forEach(path -> {
          String stringPath = path.toAbsolutePath().toString()
              .replace(directory.toAbsolutePath().toString(), "")
              .replace(path.getFileName().toString(), "");
          TarArchiveEntry tarEntry = new TarArchiveEntry(stringPath + "/" + path.getFileName().toString());
          try {
            byte[] bytes = Files.readAllBytes(path); // It throws the error at this point.
            tarEntry.setSize(bytes.length);
            tarArchiveOutputStream.putArchiveEntry(tarEntry);
            tarArchiveOutputStream.write(bytes);
            tarArchiveOutputStream.closeArchiveEntry();
          } catch (Exception e) {
            LOGGER.error("There was an error while compressing the files", e);
          }
        });
  }
}

And here's the exception:

Caused by: java.lang.OutOfMemoryError: Direct buffer memory
    at java.nio.Bits.reserveMemory(Bits.java:658)
    at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
    at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
    at sun.nio.ch.Util.getTemporaryDirectBuffer(Util.java:174)
    at sun.nio.ch.IOUtil.read(IOUtil.java:195)
    at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:158)
    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:65)
    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109)
    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
    at java.nio.file.Files.read(Files.java:3105)
    at java.nio.file.Files.readAllBytes(Files.java:3158)
    at com.ubs.gfs.etd.reporting.otc.trsloader.service.file.GmiEodFileArchiverService.lambda$compressDirectory$4(GmiEodFileArchiverService.java:124)
    at com.ubs.gfs.etd.reporting.otc.trsloader.service.file.GmiEodFileArchiverService$$Lambda$19/183444013.accept(Unknown Source)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
    at java.util.Iterator.forEachRemaining(Iterator.java:116)
    at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:512)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:502)
    at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
    at com.ubs.gfs.etd.reporting.otc.trsloader.service.file.GmiEodFileArchiverService.compressDirectory(GmiEodFileArchiverService.java:117)
    at com.ubs.gfs.etd.reporting.otc.trsloader.service.file.GmiEodFileArchiverService.archiveFiles(GmiEodFileArchiverService.java:66)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.springframework.expression.spel.support.ReflectiveMethodExecutor.execute(ReflectiveMethodExecutor.java:113)
    at org.springframework.expression.spel.ast.MethodReference.getValueInternal(MethodReference.java:102)
    at org.springframework.expression.spel.ast.MethodReference.access$000(MethodReference.java:49)
    at org.springframework.expression.spel.ast.MethodReference$MethodValueRef.getValue(MethodReference.java:347)
    at org.springframework.expression.spel.ast.CompoundExpression.getValueInternal(CompoundExpression.java:88)
    at org.springframework.expression.spel.ast.SpelNodeImpl.getTypedValue(SpelNodeImpl.java:131)
    at org.springframework.expression.spel.standard.SpelExpression.getValue(SpelExpression.java:330)
    at org.springframework.integration.util.AbstractExpressionEvaluator.evaluateExpression(AbstractExpressionEvaluator.java:166)
    at org.springframework.integration.util.MessagingMethodInvokerHelper.processInternal(MessagingMethodInvokerHelper.java:317)
    ... 93 more

I think there's a direct buffer memory leak: the code works flawlessly on the first 4 test batches but then consistently fails with java.lang.OutOfMemoryError: Direct buffer memory, and I have no clue how to fix it. I saw a potential solution using a Cleaner method here: http://www.java67.com/2014/01/how-to-fix-javalangoufofmemoryerror-direct-byte-buffer-java.html

But I don't know if that could apply in this case.

------------------------EDIT------------------------

I found another approach to tarring the files using IOUtils and buffered input streams, and that fixed the problem. Updated code:

public static void compressDirectory(String archiveDirectoryToCompress) throws IOException {
  Path archiveToCompress = Files.createFile(Paths.get(archiveDirectoryToCompress + ".tar.gz"));

  try (GzipCompressorOutputStream gzipCompressorOutputStream = new GzipCompressorOutputStream(
           Files.newOutputStream(archiveToCompress));
       TarArchiveOutputStream tarArchiveOutputStream = new TarArchiveOutputStream(gzipCompressorOutputStream)) {
    Path directory = Paths.get(archiveDirectoryToCompress);
    Files.walk(directory)
        .filter(path -> !Files.isDirectory(path))
        .forEach(path -> {
          TarArchiveEntry tarEntry = new TarArchiveEntry(path.toFile(), path.getFileName().toString());
          try (BufferedInputStream bufferedInputStream = new BufferedInputStream(new FileInputStream(path.toString()))) {
            tarArchiveOutputStream.putArchiveEntry(tarEntry);
            IOUtils.copy(bufferedInputStream, tarArchiveOutputStream);
            tarArchiveOutputStream.closeArchiveEntry();
          } catch (Exception e) {
            LOGGER.error("There was an error while compressing the files", e);
          }
        });
  }
}

ISHIRO
  • How much heap did you allocate for your application? Also, it's a bad idea to read all of a file's bytes into memory; better to use an InputStream instead. – eg04lt3r Oct 11 '16 at 18:03
  • I allocate 512 MB of heap space. Watching jconsole and the log files at the same time, I can see that heap usage doesn't change much while the compression code runs. – ISHIRO Oct 12 '16 at 10:06
  • Please use my approach, because IOUtils.copy does not close the streams after the copy is done; you can hit another exception when compressing a lot of files (~1000) unless you close them, e.g. in a finally block. Also, in your example the BufferedInputStream wrapper is unnecessary: according to the IOUtils.copy documentation, it buffers internally. A sketch of that variant follows below. – eg04lt3r Oct 13 '16 at 17:34
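
For reference, here is a minimal sketch of the variant described in the last comment, keeping the same walk/filter/forEach structure as the updated code above: the stream comes straight from Files.newInputStream with no BufferedInputStream wrapper and is closed by try-with-resources, since IOUtils.copy never closes it. This is only an illustration of the comment, not code from the original post.

.forEach(path -> {
  TarArchiveEntry tarEntry = new TarArchiveEntry(path.toFile(), path.getFileName().toString());
  // try-with-resources closes the stream even if the copy fails,
  // which IOUtils.copy itself never does
  try (InputStream in = Files.newInputStream(path)) {
    tarArchiveOutputStream.putArchiveEntry(tarEntry);
    IOUtils.copy(in, tarArchiveOutputStream); // buffers internally, no extra wrapper needed
    tarArchiveOutputStream.closeArchiveEntry();
  } catch (IOException e) {
    LOGGER.error("There was an error while compressing the files", e);
  }
});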

3 Answers


When loading files into memory this way, Java allocates a series of DirectByteBuffers from a separate, non-heap pool called direct memory. Each of these buffers has a Deallocator attached to it that is responsible for freeing that memory once the buffer is no longer needed, and by default those Deallocators only run during garbage collection.

What I suspect is happening (and something I've actually seen before) is that your program is not making much use of the heap, so garbage collection does not run often enough to free those DirectByteBuffers. You can try one of two things: either increase the size of the direct memory pool with -XX:MaxDirectMemorySize, or periodically force garbage collection by calling System.gc().
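
A minimal sketch of those two options; the flag value is just an example, and archiveBatch is a hypothetical wrapper around the compressDirectory method from the question, not code from this answer:

// Option 1: raise the cap on direct (off-heap) memory at launch, e.g.
//   java -XX:MaxDirectMemorySize=512m -jar your-app.jar
//
// Option 2: nudge the collector after each batch so the DirectByteBuffer
// Deallocators get a chance to run (a workaround rather than a real fix).
public static void archiveBatch(List<String> directoriesToCompress) throws IOException {
  for (String directory : directoriesToCompress) {
    compressDirectory(directory);
  }
  System.gc(); // only has an effect if -XX:+DisableExplicitGC is not set
}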

afreitas

Actually, you can get the file size just by calling file.length(). Try changing the way you read the bytes from the file:

tarArchiveOutputStream.write(IOUtils.toByteArray(new FileInputStream(path.toFile())));

The IOUtils class is from the Apache Commons IO package (http://commons.apache.org/proper/commons-io/). I think it should help resolve your trouble. In some cases the suggestion from @afreitas is useful as well.
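
Putting both points together, the body of the forEach from the question might look roughly like this; stringPath is the relative-path variable from the original code, and the try-with-resources around the FileInputStream is an addition here so the stream actually gets closed, which the one-liner above does not do:

try {
  File file = path.toFile();
  TarArchiveEntry tarEntry = new TarArchiveEntry(stringPath + "/" + file.getName());
  tarEntry.setSize(file.length()); // size from the file system, no need to read the bytes up front
  tarArchiveOutputStream.putArchiveEntry(tarEntry);
  try (InputStream in = new FileInputStream(file)) {
    // IOUtils.toByteArray reads onto the heap instead of going through the
    // NIO channel path (and its file-sized temporary direct buffer) seen in the stack trace
    tarArchiveOutputStream.write(IOUtils.toByteArray(in));
  }
  tarArchiveOutputStream.closeArchiveEntry();
} catch (Exception e) {
  LOGGER.error("There was an error while compressing the files", e);
}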

eg04lt3r

Also make sure that -XX:+DisableExplicitGC is not set, otherwise you may run into https://bugs.openjdk.java.net/browse/JDK-8142537.

A more detailed description of the cause of the issue when using -XX:+DisableExplicitGC is available at https://issues.apache.org/jira/browse/KAFKA-5470:

This is important because Bits.reserveMemory calls System.gc() hoping to free native memory in order to avoid throwing an OutOfMemoryException. This call is currently a no-op due to -XX:+DisableExplicitGC.
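
If you are not sure how the JVM was launched, one way to check at runtime is RuntimeMXBean#getInputArguments, which returns the options the process was started with. A small sketch (the class name here is made up for illustration):

import java.lang.management.ManagementFactory;

public class GcFlagCheck {
  public static void main(String[] args) {
    // getInputArguments() lists the JVM options the process was launched with,
    // so we can warn if explicit GC has been disabled (which would make the
    // System.gc() fallback in Bits.reserveMemory a no-op).
    boolean explicitGcDisabled = ManagementFactory.getRuntimeMXBean()
        .getInputArguments()
        .contains("-XX:+DisableExplicitGC");
    System.out.println("Explicit GC disabled: " + explicitGcDisabled);
  }
}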

anre