I have an application that uses a lot of memory while diffing the contents of two potentially huge directories (100k+ files each). It makes sense to me that such an operation would use a lot of memory, but once the diff is done, the heap stays the same size.
I basically have code that instantiates a class to store the filename, file size, path, and modification date for each file on the source and target. I save the additions, deletions, and updates in other arrays. I then clear() my source and target arrays (which could hold 100k+ entries each by now), leaving only the relatively small additions, deletions, and updates arrays.
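To make the setup concrete, here is a minimal sketch of the flow described above. It is not my actual code: the names DirDiffSketch, FileEntry, DiffResult, and diff are hypothetical, the "arrays" are assumed to be ArrayLists, and files are assumed to be matched by relative path.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DirDiffSketch {
    // Hypothetical per-file record: name, relative path, size, modification time.
    static final class FileEntry {
        final String name;
        final String path;
        final long size;
        final long modified;

        FileEntry(String name, String path, long size, long modified) {
            this.name = name;
            this.path = path;
            this.size = size;
            this.modified = modified;
        }
    }

    // The three (normally small) result lists that stay alive after the diff.
    static final class DiffResult {
        final List<FileEntry> additions = new ArrayList<>();
        final List<FileEntry> deletions = new ArrayList<>();
        final List<FileEntry> updates = new ArrayList<>();
    }

    // Assumption: entries are matched by relative path; a size or timestamp
    // mismatch counts as an update.
    static DiffResult diff(List<FileEntry> source, List<FileEntry> target) {
        Map<String, FileEntry> targetByPath = new HashMap<>();
        for (FileEntry t : target) {
            targetByPath.put(t.path, t);
        }
        DiffResult result = new DiffResult();
        for (FileEntry s : source) {
            FileEntry t = targetByPath.remove(s.path);
            if (t == null) {
                result.additions.add(s);   // only on the source
            } else if (t.size != s.size || t.modified != s.modified) {
                result.updates.add(s);     // on both sides, but changed
            }
        }
        result.deletions.addAll(targetByPath.values()); // only on the target
        return result;
    }

    public static void main(String[] args) {
        List<FileEntry> source = new ArrayList<>();
        List<FileEntry> target = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            source.add(new FileEntry("f" + i, "dir/f" + i, i, 1000L));
            // The target is shifted by one, so one file is added and one deleted.
            target.add(new FileEntry("f" + (i + 1), "dir/f" + (i + 1), i + 1, 1000L));
        }
        DiffResult result = diff(source, target);

        // clear() drops the references to the entries, but an ArrayList keeps
        // its backing array at the old capacity.
        source.clear();
        target.clear();

        System.out.println(result.additions.size() + " added, "
                + result.deletions.size() + " deleted, "
                + result.updates.size() + " updated");
    }
}
```

After the two clear() calls, the only FileEntry objects still reachable should be the ones referenced from the three result lists.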
After I clear() my target and source arrays though, the memory usage (as visible via VisualVM and Windows Task Manager) doesn't drop. I'm not experienced enough with VisualVM (or any profiler for that matter) to figure out what is taking up all this memory. VisualVM's heap dump lists the top few objects with a retained size of only a few megabytes.
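One check that might help separate "objects are still reachable" from "the JVM is holding on to heap it has already emptied" is to compare used heap against committed heap from inside the process. The following is a rough sketch under stated assumptions: HeapCheckSketch and its helper methods are hypothetical names, the allocation loop merely stands in for the real diff data, and System.gc() is only a hint that the JVM is free to ignore.

```java
import java.util.ArrayList;
import java.util.List;

public class HeapCheckSketch {
    // Heap currently occupied by objects (live or not yet collected).
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Heap the JVM has currently reserved from the operating system.
    static long committedBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    static void report(String label) {
        System.out.printf("%s: used=%d MB, committed=%d MB%n",
                label, usedBytes() / (1024 * 1024), committedBytes() / (1024 * 1024));
    }

    public static void main(String[] args) {
        // Stand-in for the large source/target lists.
        List<long[]> big = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            big.add(new long[128]);
        }
        report("after allocation");

        big.clear();
        System.gc(); // request a collection; the JVM may ignore this
        report("after clear + gc");
    }
}
```

If used drops after the clear while committed stays high, the entries were collected and the JVM is simply keeping the heap it already grew to. Task Manager reports process memory, which follows the committed figure rather than the used one.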
Anything to help point me in the right direction?