Curiosity and efficiency are the reasons for this question. I am in a situation where I am creating many new HashSets after certain loops run:
The HashSet is currently declared as such at the top of the class:
private Set<String> failedTests;
Then later in the code, I just create a new failedTests HashSet whenever I am re-running the tests:
failedTests = new HashSet<String>(16384);
I do this over and over, depending on the size of the test, and I expect the garbage collector to handle the old data efficiently. But I know another option would be to create the HashSet just once, at the beginning:
private Set<String> failedTests = new HashSet<String>(16384);
and then clear the HashSet each time through the loop:
failedTests.clear();
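Put together, the two patterns look roughly like this (the test names and helper methods are placeholders, not my real code):

```java
import java.util.HashSet;
import java.util.Set;

class TestRunner {
    private Set<String> failedTests = new HashSet<String>(16384);

    // Option 1: throw the old set away and let the GC reclaim it.
    void resetByReassigning() {
        failedTests = new HashSet<String>(16384);
    }

    // Option 2: reuse the same set; clear() empties it without reallocating.
    void resetByClearing() {
        failedTests.clear();
    }

    void recordFailure(String testName) { failedTests.add(testName); }
    int failureCount() { return failedTests.size(); }
}

public class ReuseDemo {
    public static void main(String[] args) {
        TestRunner r = new TestRunner();
        r.recordFailure("testFoo"); // hypothetical failed test
        r.resetByClearing();
        System.out.println(r.failureCount()); // 0
        r.recordFailure("testBar");
        r.resetByReassigning();
        System.out.println(r.failureCount()); // 0
    }
}
```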
My question is: which of these is more efficient in terms of overhead? I don't know what the clear() function does internally -- does it effectively do the same thing, handing the old entries to the garbage collector, or something more efficient? Also, I am giving the HashSet a large cushion of initial capacity, but if a test requires more than 2^14 elements, will calling .clear() shrink the HashSet back to a capacity of 16384?
To add, I found the source code for clear() here. So it is at least an O(n) operation in the worst case.
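As far as I can tell from that source, clear() delegates to the backing HashMap, which walks the internal bucket array and nulls every slot -- so the cost is proportional to the table's capacity, not the number of elements, and the backing array keeps its size rather than being shrunk. A simplified model of that behavior (just a sketch, not the actual JDK code):

```java
public class ClearSketch {
    // Simplified model of HashMap.clear(): null out every bucket slot.
    // Cost is O(table.length), i.e. O(capacity), and the array itself
    // keeps its current size rather than being shrunk or replaced.
    static void clearTable(Object[] table) {
        for (int i = 0; i < table.length; i++) {
            table[i] = null;
        }
    }

    public static void main(String[] args) {
        Object[] table = new Object[16384];
        table[3] = "testFoo"; // pretend one bucket is occupied
        clearTable(table);
        System.out.println(table.length); // capacity unchanged: 16384
        System.out.println(table[3]);     // entry gone: null
    }
}
```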
Using the clear() function, my test process finished in 565 seconds; letting the GC handle the old sets, it finished in 506 seconds. It's not a perfect benchmark, since there are external factors such as interfacing with the computer's and the network's file systems, but a difference of a full minute does feel significant. Can anyone recommend a profiling tool that works at the line/method level? (I am using Eclipse Indigo.)
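For a rougher but more isolated comparison than my full test run, something like the harness below could work (the run and element counts are arbitrary; a proper measurement should use a microbenchmark tool that accounts for JIT warm-up and GC pauses):

```java
import java.util.HashSet;
import java.util.Set;

public class ClearVsNewBenchmark {
    static final int RUNS = 200;
    static final int ELEMENTS = 10000;

    // Allocate a fresh set each run; old sets are left for the GC.
    static long benchReassign() {
        long start = System.nanoTime();
        Set<String> s = new HashSet<String>(16384);
        for (int run = 0; run < RUNS; run++) {
            s = new HashSet<String>(16384);
            for (int i = 0; i < ELEMENTS; i++) s.add("test" + i);
        }
        if (s.size() != ELEMENTS) throw new AssertionError();
        return System.nanoTime() - start;
    }

    // Reuse one set, emptying it with clear() between runs.
    static long benchClear() {
        long start = System.nanoTime();
        Set<String> s = new HashSet<String>(16384);
        for (int run = 0; run < RUNS; run++) {
            s.clear();
            for (int i = 0; i < ELEMENTS; i++) s.add("test" + i);
        }
        if (s.size() != ELEMENTS) throw new AssertionError();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        System.out.println("reassign: " + benchReassign() / 1000000L + " ms");
        System.out.println("clear:    " + benchClear() / 1000000L + " ms");
    }
}
```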