Suppose all the objects in my problem setting have an ID field, and I have a global map <ID, num> that records the number of references each object has. I also have some other data, indexed by ID, that may occupy a lot of memory. Therefore, I need to regularly clear that data once the number of references to the object it belongs to drops to 0. The following example illustrates how the map is updated:

public void foo() {
  MyClass obj1 = new MyClass();
  // suppose obj1.ID = 1;
  // Now the global map contains <1, 1> because the MyClass instance, whose ID is 1, has a reference obj1
  MyClass obj2 = obj1;
  // Now the global map contains <1, 2> because the MyClass instance has another reference obj2
}
// When foo() terminates, the global map contains <1, 0> because we can no longer access the MyClass instance through obj1 or obj2

The only problem I have run into is this: when an object is passed as an argument to a system method (e.g., list.add(obj)), how can I know whether something inside the system method keeps a reference to it? Take list.add(obj) as an example. Obviously obj is referenced by something inside the add method; otherwise, you could not use list.get(i) to get that object back. Since the source code of the add method is outside my application scope, I don't know how to update the counter in this case.

One solution is to delay decrementing the counter of obj until the reference count of the list reaches 0. But that is not the best solution in my problem setting. I would like to know whether there are better solutions. Thanks in advance!

Eugene
Richard Hu

2 Answers

Regarding your direct question on getting a reference count, see the Question Is it possible to get the object reference count?, and its duplicate, How do I find out how many references an object has?. See also Java Manual-Reference-Counting. For your wider question on caching data, read on.

Weak reference

You are recreating garbage collection. Java already performs garbage collection, automatically, in the background.

You said:

I have some other data (they are indexed by the ID) that may occupy a lot of memory.

That other data should be connected to the ID objects via weak reference.

When all the regular (“strong”) references to an object are gone, that object becomes a candidate for garbage collection. Eventually, the garbage collector clears that object from memory – regardless of any weak references that may point to the object being cleared.

If your big data has no references pointing to it, or if your big data has only weak references pointing to it, your big data becomes a candidate for garbage collection. Eventually your big data will be automatically cleared from memory. No need for you to be concerned or involved.
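To make the strong versus weak distinction concrete, here is a minimal sketch using java.lang.ref.WeakReference. The class names WeakRefSketch and BigData are hypothetical stand-ins, not anything from the question. Note that System.gc() is only a hint, so the second printout may legitimately go either way.

```java
import java.lang.ref.WeakReference;

// Hypothetical demo class; BigData stands in for the memory-heavy data.
class WeakRefSketch {
    static final class BigData {
        final byte[] payload = new byte[1024 * 1024];
    }

    // While a strong reference is held, the weak reference must still
    // return the same object, even if a collection runs in between.
    static boolean visibleWhileStronglyHeld() {
        BigData strong = new BigData();
        WeakReference<BigData> weak = new WeakReference<>(strong);
        System.gc(); // a hint only; the object is strongly reachable anyway
        return weak.get() == strong;
    }

    public static void main(String[] args) {
        System.out.println(visibleWhileStronglyHeld());

        BigData strong = new BigData();
        WeakReference<BigData> weak = new WeakReference<>(strong);
        strong = null; // drop the only strong reference
        System.gc();   // the JVM may or may not collect right now

        // Either outcome is legal: null once collected, the object until then.
        System.out.println(weak.get() == null ? "collected" : "not collected yet");
    }
}
```

The point of the sketch is that you never decrement a counter yourself: dropping the last strong reference is enough, and the collector decides when the memory actually goes away.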

Consider using the WeakHashMap class to associate your big data with your Id objects. When one of your Id key objects becomes a candidate for garbage collection, the entry in the weak map is effectively removed. If the big data object held as the value in that map entry has no other strong references, it too automatically becomes a candidate for garbage collection.

Be sure both of your classes properly override the Object methods equals and hashCode.

Map< Id, BigData > mapOfIdToBigData = new WeakHashMap<>() ;
mapOfIdToBigData.put( someId , someBigData ) ;
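For completeness, here is a self-contained sketch of the snippet above. The bodies of Id and BigData are assumptions made up for illustration (the question does not show them), and WeakCacheSketch is a hypothetical wrapper class. It shows the equals/hashCode overrides and a lookup through an equal but distinct key.

```java
import java.lang.ref.Reference;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical wrapper class for the sketch.
class WeakCacheSketch {
    // Assumed shape of the key type: a plain numeric ID.
    static final class Id {
        final long value;
        Id(long value) { this.value = value; }

        @Override public boolean equals(Object o) {
            return o instanceof Id && ((Id) o).value == value;
        }
        @Override public int hashCode() { return Long.hashCode(value); }
    }

    // Assumed shape of the value type. It must not hold a strong
    // reference back to its Id, or the entry could never be cleared.
    static final class BigData {
        final byte[] payload;
        BigData(int size) { payload = new byte[size]; }
    }

    // Returns true if an equal-but-distinct key finds the stored value
    // while the original key object is still strongly reachable.
    static boolean lookupWorksWhileKeyHeld() {
        Map<Id, BigData> mapOfIdToBigData = new WeakHashMap<>();
        Id someId = new Id(1);
        BigData someBigData = new BigData(1024);
        mapOfIdToBigData.put(someId, someBigData);
        try {
            return mapOfIdToBigData.get(new Id(1)) == someBigData;
        } finally {
            // Keep the original key strongly reachable up to this point.
            Reference.reachabilityFence(someId);
        }
    }

    public static void main(String[] args) {
        System.out.println(lookupWorksWhileKeyHeld());
    }
}
```

The entry lives only as long as the original key object (someId here) is strongly reachable somewhere. Looking it up with a fresh new Id(1) works while that holds, but the fresh key does not keep the entry alive.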

See this Answer for links to learn more about weak and phantom references.

You might want to explore other implementations of weak caching. For example, Google Guava offers an implementation; see this brief introduction. Perhaps Eclipse Collections offers something as well; I don’t know.

Basil Bourque

As the other answer noted, you could use a WeakHashMap and not track references on your own. But you need to carefully understand how it works and how a garbage collection cycle affects (or does not affect) your Map.

How a WeakHashMap works can be found here. The point to take from it is that the internal clean-up of a WeakHashMap happens on the next method call on that map; see this Q&A. And once you have read that and figured out how it works, you still need to understand that not every garbage collection cycle will affect your map.

Since G1 (or Shenandoah, for example) splits the heap into regions, and not all regions are collected in a given cycle (your instance might not be in the regions the GC is currently working on), you have no real timing guarantees for when that event (from the previous link) will be posted to the ReferenceQueue. And even then, recall that the removal happens only when you actually call some method on your Map after the event is posted. You might want to review some answers like this one.
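A small sketch of that behaviour, under the assumption that polling with System.gc() is acceptable for a demo (it is only a hint, so the outcome in main is deliberately not asserted). LazyExpungeSketch and its helper names are hypothetical. The stale entry disappears only when a map method such as size() is called after the collector has cleared the key.

```java
import java.lang.ref.Reference;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical demo class.
class LazyExpungeSketch {
    // Polls the map until it is empty or the attempts run out.
    // Calling size() is what gives the map a chance to drop stale entries.
    static boolean awaitEmpty(Map<?, ?> map, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            if (map.size() == 0) {
                return true;
            }
            System.gc(); // a hint only; no guarantee the key is collected
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return map.isEmpty();
    }

    // With the key strongly held, the entry must survive any collection.
    static int sizeWhileKeyHeld() {
        Map<Object, String> map = new WeakHashMap<>();
        Object key = new Object();
        map.put(key, "big data placeholder");
        System.gc();
        int size = map.size();
        Reference.reachabilityFence(key); // key stays reachable until here
        return size;
    }

    public static void main(String[] args) {
        Map<Object, String> map = new WeakHashMap<>();
        Object key = new Object();
        map.put(key, "big data placeholder");
        key = null; // no strong reference to the key remains

        boolean cleared = awaitEmpty(map, 20);
        // Depends on the collector and its timing; both results are possible.
        System.out.println(cleared ? "entry expunged" : "entry still present");
    }
}
```

If your clean-up must happen at a predictable moment, this is the part to be careful about: the map gives you eventual removal, not a timed one.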

Eugene