I have an application that creates about 2000 objects.
For each object, it downloads a web page (a String of approx. 75 KB), builds a DOM (document object model) of the entire HTML tree, and discards the String (it goes out of scope).
It then extracts some text and links from the DOM, and discards the DOM (by setting it to null).
After about 1000 objects (depending on how many applications I have open, it might be after 50 objects) I get an OutOfMemoryError, and with Process Explorer I can see the memory footprint has been increasing throughout, in logarithmic steps.
I tried inserting a System.gc(); after setting it to null, but memory usage still keeps increasing, now not in logarithmic steps but in steps of approx. 0.5 MB after every processed object. Furthermore, while debugging, whenever I step over System.gc() the footprint increases by this amount, and it stays the same until the instruction pointer reaches the same System.gc() again.
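For comparison with the Process Explorer numbers, here is a minimal sketch of measuring heap usage from inside the JVM. The class name HeapUsage and the helper usedHeapBytes are made up for illustration. Process Explorer shows the footprint of the whole process, which can stay high even after objects are collected, because the JVM does not necessarily hand freed heap back to the operating system.

```java
class HeapUsage {
    // Hypothetical helper: bytes of heap currently in use, as seen by the JVM itself.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();

        // Allocate roughly 10 MB and drop the only reference to it.
        byte[] junk = new byte[10 * 1024 * 1024];
        junk = null;

        // System.gc() is only a request; the JVM may or may not collect right away.
        System.gc();
        long after = usedHeapBytes();

        System.out.println("used before: " + before + " bytes");
        System.out.println("used after:  " + after + " bytes");
    }
}
```

If the in-JVM number drops after each object is processed while the Process Explorer number does not, the growth is probably reserved heap rather than live objects; if both keep growing, something is still holding references.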
[edit]
I ran a profiler on a heap dump as suggested in an answer, and found that each of those objects still stores a 150 KB string (75k chars). This totals 242 MB. So the question becomes: how do I keep the substrings without keeping the original string? Apparently the String constructor does this.
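A minimal sketch of what that would look like, assuming a JVM older than Java 7 update 6, where substring() shares the character array of the original string. The names SubstringCopy, detach and extractTitle are made up for illustration, and the title lookup stands in for whatever text is extracted from the page. On newer JVMs substring() already copies, so the extra constructor call is harmless but unnecessary.

```java
class SubstringCopy {
    // Hypothetical helper: returns a copy that does not share storage with the
    // (possibly much larger) string it was cut from.
    static String detach(String s) {
        return new String(s);
    }

    // Hypothetical extraction step: pulls the text between <title> and </title>.
    static String extractTitle(String page) {
        int start = page.indexOf("<title>") + "<title>".length();
        int end = page.indexOf("</title>");
        // Without detach(), an old JVM would keep the whole page alive
        // for as long as the returned title is referenced.
        return detach(page.substring(start, end));
    }

    public static void main(String[] args) {
        // Stand-in for a downloaded page of roughly 75k characters.
        StringBuilder sb = new StringBuilder("<title>Example</title>");
        for (int i = 0; i < 75000; i++) {
            sb.append('x');
        }
        String page = sb.toString();

        String title = extractTitle(page);
        System.out.println(title); // prints "Example"
    }
}
```

The same wrapping would have to be applied to every substring that is stored in a field, including any that come out of the DOM library, if it builds its text nodes with substring().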