11

I ran into some trouble with WeakHashMap.

Consider this sample code:

List<byte[]> list = new ArrayList<byte[]>();

Map<String, Calendar> map = new WeakHashMap<String, Calendar>();
String anObject = new String("string 1");
String anOtherObject = new String("string 2");

map.put(anObject, Calendar.getInstance());
map.put(anOtherObject, Calendar.getInstance());
// To test whether the WeakHashMap works, I remove the strong reference to this object
anObject = null;
int i = 0;
while (map.size() == 2) {
   byte[] tab = new byte[10000];
   System.out.println("iteration " + i++ + ", map size: " + map.size());
   list.add(tab);
}
System.out.println("Map size " + map.size());

This code works. Inside the loop I keep allocating objects. When a minor GC occurs, at around the 1360th iteration, the map size drops to 1. All is OK.

Now, when I comment out this line:

//anObject = null; 

I expect an OutOfMemoryError, because the map size should always be 2. However, at around the 26XXXth iteration a full GC occurs and the map size drops to 0. I don't understand why.

I thought the map should not have been cleared, because there are still strong references to both objects.

Michael Berry
Slade
  • I think your test is not correct. If you change `while (map.size() == 2) {` to `while (map.size() > 0) {`, both tests run until the map is empty, whether or not you comment out `anObject = null`. BTW, I've already tried it. – donnior Jan 11 '12 at 11:47
  • Print `anObject` and `anOtherObject` at the end. Compiler sees that you are no longer using them and can remove them earlier. – Piotr Praszmo Jan 11 '12 at 11:48

3 Answers

10

The just-in-time compiler analyzes the code, sees that anObject and anOtherObject are not used after the loop, and removes them from the local variable table or sets them to null while the loop is still running. This can happen mid-loop because the running method is swapped for compiled code through on-stack replacement (OSR) compilation.

Later the GC collects the strings because no strong references to them remain.

If you used anObject after the loop you'd still get an OutOfMemoryError.
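
A minimal sketch of that last point, assuming a bounded loop so that it terminates instead of exhausting the heap. The class name `KeepAliveDemo`, the helper `sizeAfterAllocations` and the iteration cap are made up for illustration; the rest follows the question's code. Because both strings are read after the loop, they stay reachable and neither entry should be cleared:

```java
import java.util.ArrayList;
import java.util.Calendar;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

public class KeepAliveDemo {

    // Hypothetical helper: runs a bounded version of the question's loop and
    // returns the map size observed just before the keys are used one last time.
    public static int sizeAfterAllocations(int iterations) {
        List<byte[]> list = new ArrayList<byte[]>();
        Map<String, Calendar> map = new WeakHashMap<String, Calendar>();
        String anObject = new String("string 1");
        String anOtherObject = new String("string 2");
        map.put(anObject, Calendar.getInstance());
        map.put(anOtherObject, Calendar.getInstance());

        // Bounded so the sketch terminates instead of running out of memory.
        for (int i = 0; i < iterations && map.size() == 2; i++) {
            list.add(new byte[10000]);
        }
        System.gc(); // only a hint; the JVM may ignore it

        int size = map.size();
        // Reading both variables here keeps them live for the whole loop,
        // so the JIT cannot treat them as dead and the keys stay reachable.
        System.out.println(anObject + ", " + anOtherObject);
        return size;
    }

    public static void main(String[] args) {
        System.out.println("Map size " + sizeAfterAllocations(2000));
    }
}
```

With the cap removed, this variant would keep allocating until the heap is full, which is the OutOfMemoryError described above. On Java 9 and later, `java.lang.ref.Reference.reachabilityFence` is another way to state the same intent without printing anything.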

Update: You'll find a more detailed discussion about OSR compilation in my blog.

Joni
  • I think you're exactly right - but is this not a potentially breaking JIT optimisation? If `anObject` has a finalizer and it's GC'd before that reference disappears the finalizer will execute potentially before it's meant to. – Michael Berry Jan 11 '12 at 12:02
  • What could it break? When the finalizer is run the strong reference no longer exists. – Joni Jan 11 '12 at 12:09
  • It could break in the sense it could run a finalizer sooner than you expect it to; before the hard reference has actually gone out of scope. – Michael Berry Jan 11 '12 at 12:11
  • Good point. I guess the lesson is that you can't rely in any way on when (or _if_) finalizers are run. It can be sooner than you expect, not just later. – Joni Jan 11 '12 at 12:43
7

Bit of digging reveals that this is explicitly covered in the JLS, section 12.6.1:

Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable. For example, a compiler or code generator may choose to set a variable or parameter that will no longer be used to null to cause the storage for such an object to be potentially reclaimable sooner.

(Bolding is my addition.)

http://java.sun.com/docs/books/jls/third_edition/html/execution.html#12.6.1

So in essence, the JIT is allowed to remove strong references whenever it wants if it can work out that they'll never be used again - which is exactly what's happening here.

This is a great question though, and it makes for a great puzzler: just because an object appears to have a strong reference in scope doesn't necessarily mean it hasn't been garbage collected. It follows that you can't guarantee anything about when a finalizer will run. It may even run while the object still seems to be in scope!

E.g.:

List<byte[]> list = new ArrayList<byte[]>();

Object thing = new Object() {
    protected void finalize() {
        System.out.println("here");
    }
};
WeakReference<Object> ref = new WeakReference<Object>(thing);

while(ref.get()!=null) {
    list.add(new byte[10000]);
}
System.out.println("bam");

The above is a simpler example showing that the object gets finalized and GC'd even though the reference to thing still exists (`here` is printed, then `bam`).

Michael Berry
7

Just to add a little to the excellent answers from Joni Salonen and berry120: you can show that the JIT really is responsible for the "variable removal" simply by turning it off with -Djava.compiler=NONE. Once it is turned off, you get the OOME.

If we want to know what is happening under the hood, the option -XX:+PrintCompilation shows the JIT activity. Running the code from the question with it gives the following output:

1       java.lang.String::hashCode (64 bytes)
2       java.lang.String::charAt (33 bytes)
3       java.lang.String::indexOf (151 bytes)
4       java.util.ArrayList::add (29 bytes)
5       java.util.ArrayList::ensureCapacity (58 bytes)
6  !    java.lang.ref.ReferenceQueue::poll (28 bytes)
7       java.util.WeakHashMap::expungeStaleEntries (125 bytes)
8       java.util.WeakHashMap::size (18 bytes)
1%      WeakHM::main @ 63 (126 bytes)
Map size 0

The last compilation (marked with %, where @ 63 is the bytecode index at which the compiled code is entered) is an OSR (On Stack Replacement) compilation (see https://gist.github.com/rednaxelafx/1165804#osr for further details). In simple terms, it lets the VM replace a method while it is running, and it is used to improve the performance of Java methods stuck in loops. I would guess that once this compilation is triggered, the JIT removes the variables that are no longer used.
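
To try this yourself, the fragment from the question can be wrapped into a complete class. This is only a sketch: the name `WeakHM` is taken from the PrintCompilation output above, while the `run` helper, its `maxIterations` parameter and the `-Xmx64m` heap size in the comments are my own additions. `-Xint` is HotSpot's interpreter-only switch, used here as an assumed stand-in for -Djava.compiler=NONE.

```java
import java.util.ArrayList;
import java.util.Calendar;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

// Possible invocations on HotSpot (-Xmx64m is an arbitrary small heap):
//   java -Xmx64m -XX:+PrintCompilation WeakHM   // logs JIT activity, including OSR
//   java -Xmx64m -Xint WeakHM                   // interpreter only, no JIT
public class WeakHM {

    // Hypothetical helper: the question's loop, optionally capped so that
    // it can also be run for a fixed number of iterations.
    public static int run(int maxIterations) {
        List<byte[]> list = new ArrayList<byte[]>();
        Map<String, Calendar> map = new WeakHashMap<String, Calendar>();
        String anObject = new String("string 1");
        String anOtherObject = new String("string 2");
        map.put(anObject, Calendar.getInstance());
        map.put(anOtherObject, Calendar.getInstance());

        // Neither variable is read after this point, so the JIT is free
        // to treat both as dead while the loop is still running.
        int i = 0;
        while (map.size() == 2 && i < maxIterations) {
            list.add(new byte[10000]);
            i++;
        }
        return map.size();
    }

    public static void main(String[] args) {
        // No argument means "effectively uncapped", as in the question.
        int cap = args.length > 0 ? Integer.parseInt(args[0]) : Integer.MAX_VALUE;
        System.out.println("Map size " + run(cap));
    }
}
```

Since the loop now lives in `run`, any OSR entry in the log would name `WeakHM::run` rather than `WeakHM::main`. Whether the map ends up empty depends on the JVM version, heap size and JIT decisions, so no particular output is guaranteed.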

Clashsoft
jalopaba