
I have a serialized PatriciaTrie (https://commons.apache.org/proper/commons-collections/apidocs/org/apache/commons/collections4/trie/PatriciaTrie.html) on disk, where it occupies roughly 7.4 GB. I am using a 64 GB RAM server. When the trie is deserialized, the memory consumption of the corresponding process goes up to 40 GB. Is this sensible? The highest-voted answer at Serialized object size vs in memory object size in Java says that "the size in memory will be usually between half and double the serializable size!" I was expecting the in-memory size not to go beyond 15 GB, but 40 GB is far too much, since other processes need to be loaded as well.

I thought of using http://docs.oracle.com/javase/7/docs/api/java/lang/instrument/Instrumentation.html to measure the size in memory, but Calculate size of Object in Java points out that it "can be used to get the implementation specific approximation of object size", so it would again be only an approximate measure.
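For reference, this is roughly what such an agent would look like (a minimal sketch, assuming the jar's manifest declares a Premain-Class entry and the JVM is started with -javaagent):

```java
import java.lang.instrument.Instrumentation;

// Minimal instrumentation agent. getObjectSize() returns a shallow,
// implementation-specific approximation, so measuring a whole trie
// would still require walking the object graph and summing the parts.
public class SizeAgent {
    private static volatile Instrumentation instrumentation;

    // Invoked by the JVM before main() when attached via -javaagent.
    public static void premain(String args, Instrumentation inst) {
        instrumentation = inst;
    }

    public static long shallowSizeOf(Object o) {
        return instrumentation.getObjectSize(o);
    }
}
```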

Is there something I am missing here? I am closing the file and the buffered reader as well. What could be hogging all the memory? I can't share the code for corporate policy reasons - any help or pointers would be highly appreciated. Thanks

  • What's your XMX set to? – Joeri Hendrickx Aug 02 '16 at 11:06
  • The size usually depends on the type of object. The in-memory object includes properties/fields inherited from the base class, which are not serialized. – user5226582 Aug 02 '16 at 11:12
  • @JoeriHendrickx Xmx is set to 60G. The running command starts with java -Xmx60g -d64 ... – pandeconscious Aug 02 '16 at 13:46
  • That makes no sense. You say 40 GB is too much since other processes will be running, but you set Xmx to 60 GB. Lower it to a reasonable amount and then see what happens. My guess is that your deserialisation process (and other things) has generated a lot of garbage, and there's no reason for the GC to run, since you're not even close to your Xmx limit. – Joeri Hendrickx Aug 03 '16 at 08:21
  • @JoeriHendrickx 40 GB is too much because we need memory space for other processes to load, which were not loaded for this particular memory profiling scenario. 40 GB of in-memory space for a 7.4 GB on-disk serialized object is too much. I may have introduced some ambiguity by saying other processes would be loaded. What I mean is that since other processes will be loaded later on, we can't afford to lose so much memory (40 GB) to just the trie object. Just before this object was serialized, the memory consumption was of the same order, so it may not be deserialization garbage, thanks – pandeconscious Aug 03 '16 at 09:50
  • @JoeriHendrickx by lowering Xmx, an out of memory exception appears – pandeconscious Aug 03 '16 at 09:50
  • Very sloppy question. The memory consumption goes up *by* how much? *To* how much is of no interest, unless you also specify the starting point. In other words how much memory growth can be attributed to deserialising the object? – user207421 Nov 02 '17 at 23:46

1 Answer


Serialized size on disk has little to do with the size of the data in memory. Every object in Java carries some memory overhead (which can vary depending on the JVM mode and version). A single array of bytes would serialize and deserialize to about the same size on disk and in memory. However, an array of a billion 8-byte arrays would not, because each small array pays the per-object overhead on top of its payload.
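As a rough illustration (not a benchmark; the usedMemory() helper below is just a sketch based on Runtime statistics and a GC hint):

```java
public class OverheadDemo {
    // Approximate used heap; System.gc() is only a hint, so treat the
    // numbers as ballpark figures.
    static long usedMemory() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedMemory();

        byte[] flat = new byte[80_000_000];        // ~80 MB of payload, one object
        long afterFlat = usedMemory();

        byte[][] chunked = new byte[10_000_000][]; // same 80 MB of payload...
        for (int i = 0; i < chunked.length; i++) {
            chunked[i] = new byte[8];              // ...but each piece pays
        }                                          // header + padding overhead
        long afterChunked = usedMemory();

        System.out.printf("flat:    ~%d MB%n", (afterFlat - before) / 1_000_000);
        System.out.printf("chunked: ~%d MB%n", (afterChunked - afterFlat) / 1_000_000);

        // Keep references reachable so the arrays aren't collected early.
        System.out.println(flat.length + " " + chunked.length);
    }
}
```

On a typical 64-bit HotSpot JVM the chunked version ends up several times larger than the flat one, even though both hold the same 80 MB of bytes.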

If you create a heap dump after deserializing the data, you should be able to determine exactly where the memory is going.
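On a HotSpot JVM the dump can also be triggered from inside the process (a sketch using the HotSpot-specific diagnostic MXBean; taking the dump externally with jmap works just as well):

```java
import java.lang.management.ManagementFactory;
import com.sun.management.HotSpotDiagnosticMXBean;

public class HeapDumper {
    // Writes an .hprof file that can be opened in Eclipse MAT or VisualVM.
    public static void dumpHeap(String filePath) throws Exception {
        HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(filePath, true); // true = dump only live objects
    }

    public static void main(String[] args) throws Exception {
        // ... deserialize the trie here, then:
        dumpHeap("trie-heap.hprof");
    }
}
```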

How to collect heap dumps of any java process

What is the memory consumption of an object in Java?

Trick behind JVM's compressed Oops

John