A few weeks back I wrote a Java class with the following behavior:

  • Each object contains a single final integer field
  • The class contains a static Map (Key: Integer, Content: MyClass)
  • Whenever an instance of the class is requested, a look-up is done: if an object with the wanted integer field already exists in the static map, it is returned; otherwise a new one is created and put in the map.

As code:

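import java.util.HashMap;
import java.util.Map;

public class MyClass
{
    // final, so the object used as the lock in get(...) can never be swapped out
    private static final Map<Integer, MyClass> map;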
    private final int field;

    static
    {
        map = new HashMap<>();
    }

    private MyClass(int field)
    {
        this.field = field;
    }

    public static MyClass get(int field)
    {
        synchronized (map)
        {
            return map.computeIfAbsent(field, MyClass::new);
        }
    }
}

This way I can be sure that only one object exists for each integer (as field). I'm currently concerned that this will prevent the GC from collecting objects which I no longer need, since the objects are always stored in the map (a reference exists)...

If I wrote a looping function like this:

public void myFunction() {
    for (int i = 0; i < Integer.MAX_VALUE; i++) {
       MyClass c = MyClass.get(i);
       // DO STUFF
    } 
}

I would end up with Integer.MAX_VALUE objects in memory after calling the method. Is there a way I can check whether references to objects in the map exist and otherwise remove them?

John
  • Usually this technique is used to have some sort of cache to avoid creating objects when they were already created. Isn't this your goal? Because in this case it makes sense that they remain allocated till the end of the program. – Loris Securo Jul 10 '16 at 14:37
  • You might want to use a soft or weak reference. E.g. see [What is the difference between a soft reference and a weak reference in Java?](http://stackoverflow.com/q/299659/5221149) Alternatively, use a 3rd-party cache library, e.g. see [Looking for simple Java in-memory cache](http://stackoverflow.com/q/575685/5221149). – Andreas Jul 10 '16 at 14:39
  • @LorisSecuro Yes, that's my goal. But I would prefer to cache maybe up to say *1000* objects and after that only cache what is currently required/referenced. – John Jul 10 '16 at 14:44
  • @Andreas Thank you, I'll take a look :) – John Jul 10 '16 at 14:44
  • @John perhaps you should try guava cache – Vladislav Kysliy Jul 10 '16 at 15:01
  • You ask: "Is there a way I can check whether references to objects in the map exist and otherwise remove them?" Either I don't get it or it makes no sense. Perhaps both ;-) Please clarify your question. – martinhh Jul 10 '16 at 15:22
  • @martinhh I want to check if references to the objects in the map exist. If no references exist, I want to remove the objects from the map (the GC will do the rest). – John Jul 10 '16 at 15:27
  • @John Ok, I try to understand... Your theory is that if some other class holds a reference to one of the cached objects, the object has to stay in the cache and if no other class holds a reference, the cached object can be removed from the cache. Is it not exactly the other way round? A cache normally serves clients that do not yet have a reference but want to have one and not those that already have one. – martinhh Jul 10 '16 at 15:35
  • @martinhh You are right, this seems unintuitive (when thinking about a cache), but I would like to keep up to maybe 1000 objects in the cache and afterwards start to remove elements (the cache holds a maximum of 1000 *unreferenced* objects). – John Jul 10 '16 at 15:49

2 Answers

This looks like a typical case of the multiton pattern: You want to have at most one instance of MyClass for a given key. However, you also seem to want to limit the number of instances created. This is very easy to do by lazily instantiating your MyClass instances as you need them. Additionally, you want to clean up unused instances:

Is there a way I can check whether references to objects in the map exist and otherwise remove them?

This is exactly what the JVM's garbage collector is for. There is no reason to try to implement your own form of "garbage collection" when the Java core library already provides tools for marking certain references as "not strong", i.e. references that keep referring to a given object only as long as there is a strong reference (in Java, a "normal" reference) to it somewhere else.

Implementation using Reference objects

Instead of a Map<Integer, MyClass>, you should use a Map<Integer, WeakReference<MyClass>> or a Map<Integer, SoftReference<MyClass>>. Both WeakReference and SoftReference allow the MyClass instances they refer to to be garbage-collected if there are no strong (read: "normal") references to the object. The difference between the two is that the former becomes eligible for clearing at the next garbage collection after all strong references are gone, while the latter is only cleared when it "has to" be, i.e. at some point which is convenient for the JVM, typically under memory pressure and in any case before an OutOfMemoryError is thrown (see related SO question).
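To make the behaviour of these reference objects concrete, here is a minimal sketch. The class name `ReferenceDemo` and the helper `isAlive` are hypothetical, and `clear()` is called by hand as a stand-in for what the garbage collector would do, because real collection timing is not deterministic.

```java
import java.lang.ref.Reference;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

final class ReferenceDemo {

    /** Returns true if the referent can still be obtained through the reference. */
    static boolean isAlive(Reference<?> ref) {
        return ref.get() != null;
    }

    public static void main(String[] args) {
        Object strong = new Object(); // a "normal" (strong) reference
        Reference<Object> weak = new WeakReference<>(strong);
        Reference<Object> soft = new SoftReference<>(strong);

        // 'strong' is still used below, so the GC may not clear either reference here.
        System.out.println(isAlive(weak) && weak.get() == strong);
        System.out.println(isAlive(soft) && soft.get() == strong);

        // Simulate what the GC does once no strong references remain.
        weak.clear();
        System.out.println(isAlive(weak));
    }
}
```

The practical consequence is that every caller has to treat `get()` on a reference as something that may return null and must be prepared to recreate the object.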

Plus, there is no need to synchronize your entire Map: You can simply use a ConcurrentHashMap (which implements ConcurrentMap), which handles multi-threading much better than locking all access to the entire map. Therefore, your MyClass.get(int) could look like this:

private static final ConcurrentMap<Integer, Reference<MyClass>> INSTANCES = new ConcurrentHashMap<>();

public static MyClass get(final int field) {
    // Holds a strong reference to the result, so that the instance cannot be
    // collected between leaving compute(...) and returning from this method
    final MyClass[] result = new MyClass[1];
    // ConcurrentHashMap.compute(...) is atomic <https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentHashMap.html#compute-K-java.util.function.BiFunction->
    INSTANCES.compute(field, (key, oldValue) -> {
        final MyClass existing = (oldValue == null) ? null : oldValue.get();
        if (existing != null) {
            // The existing instance has not yet been deleted. Re-use it
            result[0] = existing;
            return oldValue;
        }
        // Either no instance has been created yet or the old instance has
        // already been deleted. Create a new reference to a new instance
        result[0] = new MyClass(key);
        return new SoftReference<>(result[0]);
    });
    return result[0];
}
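One thing a reference-valued map does not do on its own is drop the entries whose referents have been collected: the key and the empty reference object stay in the map. A possible way to handle that is a `ReferenceQueue`, sketched below under some assumptions: the class `PurgingMultiton`, the nested `KeyedRef` and the `purge()` method are hypothetical names, and `WeakReference` is used here only for illustration (a `SoftReference` subclass would work the same way).

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class PurgingMultiton {

    /** A weak reference that remembers which map key it was stored under. */
    static final class KeyedRef extends WeakReference<PurgingMultiton> {
        final int key;

        KeyedRef(int key, PurgingMultiton referent, ReferenceQueue<PurgingMultiton> queue) {
            super(referent, queue);
            this.key = key;
        }
    }

    static final ConcurrentMap<Integer, KeyedRef> INSTANCES = new ConcurrentHashMap<>();
    static final ReferenceQueue<PurgingMultiton> QUEUE = new ReferenceQueue<>();

    final int field;

    private PurgingMultiton(int field) {
        this.field = field;
    }

    /** Removes map entries whose references have been enqueued after clearing. */
    static void purge() {
        KeyedRef stale;
        while ((stale = (KeyedRef) QUEUE.poll()) != null) {
            // Two-argument remove: only drop the entry if it still holds this
            // exact reference, not a newer one created for the same key.
            INSTANCES.remove(stale.key, stale);
        }
    }

    static PurgingMultiton get(int field) {
        purge();
        // One-element array so the lambda can hand back a strong reference.
        PurgingMultiton[] result = new PurgingMultiton[1];
        INSTANCES.compute(field, (key, old) -> {
            PurgingMultiton existing = (old == null) ? null : old.get();
            if (existing != null) {
                result[0] = existing;
                return old;
            }
            result[0] = new PurgingMultiton(key);
            return new KeyedRef(key, result[0], QUEUE);
        });
        return result[0];
    }
}
```

Purging on every `get` keeps the sketch short; a dedicated cleanup thread blocking on `ReferenceQueue.remove()` would be an alternative if look-ups are rare.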

Finally, in a comment above, you mentioned that you would "prefer to cache maybe up to say 1000 objects and after that only cache, what is currently required/referenced". Although I personally see little (good) reason for it, it is possible to eagerly instantiate the "first" 1000 objects by adding them to the INSTANCES map when the map is created. Note that, being only softly referenced, these instances can still be collected under memory pressure:

private static final ConcurrentMap<Integer, Reference<MyClass>> INSTANCES = createInstanceMap();

private static ConcurrentMap<Integer, Reference<MyClass>> createInstanceMap() {
    // The set of keys to eagerly initialize instances for
    final Stream<Integer> keys = IntStream.range(0, 1000).boxed();
    final Collector<Integer, ?, ConcurrentMap<Integer, Reference<MyClass>>> mapFactory = Collectors
            .toConcurrentMap(Function.identity(), key -> new SoftReference<>(new MyClass(key)));
    return keys.collect(mapFactory);
}

How you define which objects are the "first" ones is up to you. Here, I'm just using the natural order of the integer keys because it's suitable for a simple example.
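If the goal is instead to keep up to 1000 instances alive even when nobody references them, and to hold everything beyond that only weakly, one possible approach is to combine the weak-reference map with a small LRU map of strong references. The following is a rough sketch under assumptions, not a tuned implementation: `BoundedMultiton`, `MAX_PINNED` and `pinnedCount()` are hypothetical names, a plain `synchronized` method is used for simplicity, and removal of cleared entries from the weak map is omitted.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class BoundedMultiton {

    // Assumed limit, taken from the "up to say 1000 objects" comment.
    static final int MAX_PINNED = 1000;

    // Weak references keep "one instance per key" for as long as anyone uses it.
    private static final Map<Integer, WeakReference<BoundedMultiton>> INSTANCES = new HashMap<>();

    // Access-ordered LinkedHashMap acting as an LRU list of strong references.
    private static final Map<Integer, BoundedMultiton> PINNED =
            new LinkedHashMap<Integer, BoundedMultiton>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<Integer, BoundedMultiton> eldest) {
                    return size() > MAX_PINNED;
                }
            };

    final int field;

    private BoundedMultiton(int field) {
        this.field = field;
    }

    static synchronized BoundedMultiton get(int field) {
        WeakReference<BoundedMultiton> ref = INSTANCES.get(field);
        BoundedMultiton instance = (ref == null) ? null : ref.get();
        if (instance == null) {
            instance = new BoundedMultiton(field);
            INSTANCES.put(field, new WeakReference<>(instance));
        }
        // Marks the key as most recently used; the eldest entry is unpinned
        // once more than MAX_PINNED instances are held strongly.
        PINNED.put(field, instance);
        return instance;
    }

    static synchronized int pinnedCount() {
        return PINNED.size();
    }
}
```

An instance that falls out of the LRU map is not destroyed; it simply becomes collectable once no other code holds a strong reference to it.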

errantlinguist
  • Thank you, this is really cool actually! But since the keys are `Integer`-**objects**, don't they need to be a `SoftReference` as well? – John Jul 14 '16 at 14:58
  • `SoftReference` doesn't work well as a `(Hash)Map` key because it doesn't override `hashCode()` or `equals()`. Besides, [`Integer` objects may be cached by the JVM](https://docs.oracle.com/javase/8/docs/api/java/lang/Integer.html#valueOf(int)) and so the same object may be used all over the place in your application anyway, meaning that the likelihood of the key being automatically cleared due to it not being referenced anywhere else other than in the map is quite slim... – errantlinguist Jul 14 '16 at 16:41
  • ...Still, if you want a cache which clears its keys when they are not referenced elsewhere, there is a [`WeakHashMap`](https://docs.oracle.com/javase/8/docs/api/java/util/WeakHashMap.html) for that purpose. – errantlinguist Jul 14 '16 at 16:41
  • I saw your comments above and added an example of how you might actively initialize your "first" 1000 objects. – errantlinguist Jul 15 '16 at 12:54

Your function for examining your cache is cringeworthy. First, as you said, it creates all the cache objects. Second, it iterates Integer.MAX_VALUE times.

Better would be:

public void myFunction() {
    for(MyClass c : map.values()) {
       // DO STUFF
    } 
}

To the issue at hand: Is it possible to find out whether an Object has references to it?

Yes. It is possible. But you won't like it.

http://docs.oracle.com/javase/1.5.0/docs/guide/jvmti/jvmti.html

jvmtiError
IterateOverReachableObjects(jvmtiEnv* env,
        jvmtiHeapRootCallback heap_root_callback,
        jvmtiStackReferenceCallback stack_ref_callback,
        jvmtiObjectReferenceCallback object_ref_callback,
        void* user_data)

Loop over all reachable objects in the heap. If a MyClass object is reachable, then, well, it is reachable.

Of course, by storing the object in your cache, you are making it reachable, so you'd have to change your cache to WeakReferences, and see if you can exclude those from the iteration.

And you're no longer using pure Java, and JVMTI may not be supported by all VMs.

As I said, you won't like it.

AJNeufeld
  • The function (which iterates up to `Integer.MAX_VALUE`) is only an example of when the cache might fail, and I'm well aware of how to iterate over a `java.util.Map`. If you are interested, you could take a look at the answer above yours. – John Jul 14 '16 at 14:57