1

I am reading a batch of binary files into memory, one at a time, performing some operations on each, and then saving it back to disk. This works fine with small files, but with larger files I am concerned about memory consumption.

Assuming the file I am reading is 25 MB, this is what my code looks like:

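    import java.io.File;
    import java.io.IOException;
    import org.apache.commons.io.FileUtils;

    public static byte[] returnEncryptedFileData(File fileObj) throws IOException {
        // Read the whole file into memory (Apache Commons IO).
        byte[] fileData = FileUtils.readFileToByteArray(fileObj);
        // ... perform some operations on fileData ...

        return fileData;
    }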

Right after this code executes, I see about 50 MB (plus miscellaneous overhead) of extra memory consumption. That is fine, because there would be two byte arrays of 25 MB each: fileData, as I've defined it, and another one used internally by readFileToByteArray.

However, even after this method returns and is called again for the next file, the memory held previously isn't released! If the next file is 30 MB, I see a memory consumption of about 50 MB + 60 MB plus overhead.

How do I clean up after reading a file into a byte array, performing some operations on it, and returning it from a method? System.gc() doesn't help, since it does not run the GC right away, and I don't believe there is any way to explicitly "deallocate" memory.

What am I doing wrong here?

Amol Gupta

5 Answers

3

The short answer: Java will get to it when it gets to it. Do not use System.gc().

Most machines have enough memory these days that 50 MB is not really a big deal. If you have to perform this operation many times, the best thing to do is to reuse your big byte arrays so that you only ever have one. Another option is to read each file a small chunk at a time, process that chunk, and then read more. This may not be practical, though, depending on what the processing is.
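As a rough illustration of the chunked approach, here is a minimal sketch that streams a file through one fixed-size buffer which is allocated once and reused for every file. The class name `ChunkedFileProcessor`, the 64 KB chunk size, and the XOR step in `transformInPlace` are all hypothetical placeholders, not the asker's code; the sketch assumes the real processing can work on one chunk at a time, which may not hold for every algorithm.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class ChunkedFileProcessor {
    // Hypothetical chunk size; tune it for your workload.
    private static final int CHUNK_SIZE = 64 * 1024;

    // Placeholder transformation (XOR with a fixed byte), standing in
    // for whatever the real per-chunk processing is.
    static void transformInPlace(byte[] buffer, int length) {
        for (int i = 0; i < length; i++) {
            buffer[i] ^= 0x5A;
        }
    }

    // Streams source to target through the caller-supplied buffer and
    // returns the number of bytes processed.
    static long process(Path source, Path target, byte[] buffer) throws IOException {
        long total = 0;
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = Files.newOutputStream(target)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                transformInPlace(buffer, read);
                out.write(buffer, 0, read);
                total += read;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // One buffer, allocated once and reused for every file.
        byte[] buffer = new byte[CHUNK_SIZE];
        // Arguments are expected as source/target pairs.
        for (int i = 0; i + 1 < args.length; i += 2) {
            long n = process(Paths.get(args[i]), Paths.get(args[i + 1]), buffer);
            System.out.println(args[i] + ": " + n + " bytes processed");
        }
    }
}
```

With this shape, peak heap use for the file data stays at roughly the buffer size regardless of how large the files are.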

durron597
  • Can you explain why not to call `System.gc()`? According to its Javadoc, it's not forbidden nor discouraged to do so. – mthmulders Apr 08 '13 at 12:00
  • @mthmulders I added a link that answers your question – durron597 Apr 08 '13 at 12:01
  • Thanks! Interesting read. I think there *may* be situations where you need it, but the fact you're considering to use it means you'll have to double-check whether you're really doing it right. Use with care, so to say. – mthmulders Apr 08 '13 at 12:06
2

As stated before, you cannot force the JVM to garbage collect your memory, or to free a certain part of the memory.

You can, however, make it more likely that your memory will be freed. To understand how, you must understand how the garbage collector (GC) works. In short, it frees an object's memory once that object is no longer referenced from anywhere. In other words, when nothing reachable holds a reference to an object A, object A becomes eligible for garbage collection. See the Java tutorial for a short introduction on the topic.

So, you can increase the chances that your memory is released by explicitly dropping all references to your byte[]. A subsequent call to System.gc() "suggests that the Java Virtual Machine expend effort toward recycling unused objects in order to make the memory they currently occupy available for quick reuse". Note that this is no guarantee that your memory will actually have been freed!
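To make the "eligible, but not guaranteed" point concrete, here is a small sketch that drops the only strong reference to an array and then uses a `WeakReference` as a probe to see whether the collector has reclaimed it. The class and method names (`GcHintDemo`, `allocateAndRelease`) are made up for illustration, and the result printed by `main` can differ between JVMs, collectors, and runs, precisely because `System.gc()` is only a request.

```java
import java.lang.ref.WeakReference;

final class GcHintDemo {
    // Allocates an array, keeps only a weak reference to it, and returns
    // that weak reference so the caller can observe reclamation.
    static WeakReference<byte[]> allocateAndRelease(int sizeBytes) {
        byte[] fileData = new byte[sizeBytes];
        WeakReference<byte[]> probe = new WeakReference<>(fileData);
        fileData = null; // drop the only strong reference
        return probe;
    }

    public static void main(String[] args) throws InterruptedException {
        WeakReference<byte[]> probe = allocateAndRelease(25 * 1024 * 1024);

        System.gc();       // a hint to the JVM, not a command
        Thread.sleep(100); // give a concurrent collector a moment

        // get() returns null only once the array has actually been reclaimed.
        System.out.println("array reclaimed: " + (probe.get() == null));
    }
}
```

A weak reference does not keep its target alive, so it is a convenient way to check eligibility without interfering with collection.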

mthmulders
0

Garbage collection in Java runs whenever the JVM decides it needs to (this is a very simplified explanation :) ). If you don't get an Error or Exception, you are fine. If you are concerned about the memory footprint of your application, check the memory arguments for the JVM, e.g.: How can I increase the JVM memory?
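For reference, a minimal sketch of how the heap limits can be inspected from inside the program using the standard `Runtime` API. The class name `HeapReport` and its helper methods are hypothetical; the maximum heap itself is set with the standard `-Xmx` JVM option.

```java
final class HeapReport {
    // Upper bound the JVM will try to use for the heap (controlled by -Xmx).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap currently occupied, including garbage that has not been collected yet.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("max heap:  " + maxHeapBytes() / mb + " MB");
        System.out.println("used heap: " + usedHeapBytes() / mb + " MB");
    }
}
```

After compiling, running it as `java -Xmx512m HeapReport` should report a maximum close to 512 MB. Note that "used" includes unreachable objects that are merely waiting for the next collection, which is why the figure can keep growing between collections without indicating a leak.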

rich
0

I guess you still hold a reference to the byte array returned from this method. As long as any reference to it remains, the GC will not collect it. Can you also post how you call this method and what happens after the call?

0

The only things that aren't automatically reclaimed by the GC are resources external to the VM.
In your case, since the readFileToByteArray method always closes the file, the memory that is still allocated is either still referenced or not yet garbage-collected.

The fix depends on how you declare the variables whose memory you want reclaimed. I'd advise using a fresh reference to your byte array each time you read a file and declaring it in the smallest possible scope (inside the for loop, if you have one), so that the array becomes unreachable, and therefore eligible for collection, as soon as possible; short-lived objects that are still in the young generation are the cheapest to reclaim. Otherwise, explicitly set your references to null before reassigning them.
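A minimal sketch of the narrow-scope advice, assuming each file is processed independently and written back in place. `ScopedBatchProcessor` and `transform` are hypothetical names, and `Files.readAllBytes` from the standard library is used here in place of Commons IO's `readFileToByteArray` only to keep the example self-contained.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

final class ScopedBatchProcessor {
    // Placeholder for the real processing; it simply flips every bit in place.
    static void transform(byte[] data) {
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) ~data[i];
        }
    }

    // Processes each file in turn and returns how many files were handled.
    static int processAll(List<Path> files) throws IOException {
        int processed = 0;
        for (Path file : files) {
            // Declared inside the loop: each iteration gets a fresh array,
            // and no field or collection keeps a reference to it.
            byte[] fileData = Files.readAllBytes(file);
            transform(fileData);
            Files.write(file, fileData);
            processed++;
            // After this point nothing uses the array any more, so it is
            // eligible for collection whenever the JVM chooses to run one.
        }
        return processed;
    }
}
```

The important part is not the loop itself but that the array never escapes into a longer-lived structure such as a static field, a cache, or a list of results.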

Gab
  • Thanks for the tip, but setting it to null will only set the pointer to the data held to null and won't clean up the data itself. Though it would be a flag for the GC to recollect that space, but that's going to happen only when the GC "is scheduled to run the next time". – Amol Gupta Apr 08 '13 at 12:10
  • again it depends on the generation area in which your variables are allocated. memory is not only un-allocated at full-gc invocation. See http://stackoverflow.com/questions/2070791/young-tenured-and-perm-generation – Gab Apr 08 '13 at 12:21
  • @AmolGupta The GC is not on a fixed schedule as your comment implies. GC will run based on various triggers. For instance, when used heap space reaches X percent of allocated heap, the JVM will run a GC (as opposed to a malloc). If there is not enough contiguous heap space to allocate an object, the JVM will run a GC, and so on. You need only be concerned if you get an `OutOfMemoryError`, because that means the JVM tried everything but couldn't free or allocate enough memory. – Tim Bender Apr 09 '13 at 07:29