
In my application I have a couple thousand lightweight objects (which I would like to keep in memory). Each lightweight object references a heavy data object which I would like to be loaded on demand, garbage collected if the heap is full, and reloaded on demand again. So using JPA I've done something like this:

public class LightWeight{
    @OneToOne(fetch = FetchType.LAZY)
    private HeavyWeight data;
    ....
}

Using FetchType.LAZY works fine for loading the HeavyWeight for the first time, but unfortunately, as the HeavyWeight is a normal (strong) reference, it never gets garbage collected, and thus I am running out of memory after a while.

Is there a JPA mechanism which does lazy fetching, "un"loading if the heap becomes full (like a WeakReference) and refetches the reference again if needed?

Btw: I am using Spring Data JPA with Hibernate as the implementation.

Update 1: Considering the comment about using a second-level cache, what about relying on the cache and detaching the heavyweight objects immediately after fetching them? I.e. something like this:

public class LightWeight {

    int heavyWeightId = ...;

    @Autowired
    HeavyWeightRepository heavyWeightRepository;

    public HeavyWeight getData() {
        // look up the heavy object by its stored id
        HeavyWeight hv = heavyWeightRepository.findById(heavyWeightId);
        // detach() comes from a custom repository implementation
        heavyWeightRepository.detach(hv);
        return hv;
    }
}

In this case the HeavyWeight objects should be garbage collected once they are detached from the EntityManager (and are not referenced from anywhere else). Is this correct?

Update 2: I abandoned the idea of using SoftReferences. In theory, releasing all references the EntityManager holds to the managed objects (by either explicitly clearing the EM or committing the transaction) should allow entities referenced only by SoftReferences to be garbage collected when memory becomes scarce. In practice it does not work out, as one often runs into the "GC overhead limit exceeded" problem. So the way to go is to use a second-level cache, as proposed in the comments.

Ueli Hofstetter

2 Answers


I have not tried this myself, but what about implementing a JPA attribute converter that converts the heavy data object to a WeakReference upon loading?

Also, a WeakReference may be too weak for this use case; maybe a SoftReference is better?
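To make the SoftReference idea concrete, here is a minimal plain-Java sketch of a holder that keeps its value behind a SoftReference and reloads it when the reference has been cleared. It is not a JPA attribute converter: the class name SoftLazy and the Supplier-based loader are assumptions made for illustration, with the loader standing in for whatever actually fetches the heavy object (for example a repository lookup).

```java
import java.lang.ref.SoftReference;
import java.util.function.Supplier;

/**
 * Holds a value behind a SoftReference and reloads it on demand.
 * SoftLazy and its loader are hypothetical names for this sketch.
 */
final class SoftLazy<T> {
    private final Supplier<T> loader;
    private SoftReference<T> ref = new SoftReference<>(null);

    SoftLazy(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, reloading it if the reference was cleared. */
    synchronized T get() {
        T value = ref.get(); // strong reference for the duration of this call
        if (value == null) {
            value = loader.get();
            ref = new SoftReference<>(value);
        }
        return value;
    }

    /** Drops the cached value explicitly, mimicking what the GC may do under memory pressure. */
    synchronized void clear() {
        ref.clear();
    }
}
```

Note that the value only becomes collectable if nothing else holds a strong reference to it, which includes a persistence context that still manages the entity.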

ghdalum
  • You are absolutely right, the WeakReference is useless and a SoftReference would be the way to go ... unfortunately, using them I am running into the "GC overhead limit exceeded" problem ... thinking about using the proposed ehcache ... – Ueli Hofstetter Jul 31 '15 at 23:58

As already pointed out by ghdalum, Weak or SoftReferences are a proper way to handle this GC problem in plain Java.

Regarding JPA I see two more possible problems:

  1. the caching of your persistence provider (although providers probably use SoftReferences as well)
  2. keeping the entity managed by your persistence context

1

Did you try to set javax.persistence.sharedCache.mode to DISABLE_SELECTIVE and set @Cacheable(false) on your HeavyWeight class?

You could also try to manually clear the Cache with e.g.

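javax.persistence.Cache c = myEntityManagerFactory.getCache();
// evict() takes the entity class and primary key; getId() is an assumed accessor
c.evict(HeavyWeight.class, heavyWeight.getId());  // does not clear heavyWeight.references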

2

  • Detach the entity or
  • close the EntityManager.

If possible, I would disable caching selectively and let LightWeight return a copy of HeavyWeight so you have full control of its lifecycle (e.g. use a cache). Another idea would be to use JPQL constructor expressions to fetch a copy of HeavyWeight's data when needed.
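As a rough illustration of keeping the copies under your own control, the following plain-Java sketch shows a size-bounded, least-recently-used cache that loads missing entries on demand. The class name BoundedLoadingCache and the Function-based loader are assumptions for this example; in the scenario above the loader would fetch a detached copy of the heavy data. A real second-level cache such as ehcache offers far more than this.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Size-bounded LRU cache that loads missing entries on demand.
 * BoundedLoadingCache and its loader are hypothetical names for this sketch.
 */
final class BoundedLoadingCache<K, V> {
    private final Function<K, V> loader;
    private final Map<K, V> entries;

    BoundedLoadingCache(int maxEntries, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder = true keeps the least recently used entry first
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries; // evict once the bound is exceeded
            }
        };
    }

    /** Returns the cached value for the key, loading and caching it on a miss. */
    synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = loader.apply(key);
            entries.put(key, value);
        }
        return value;
    }

    synchronized int size() {
        return entries.size();
    }
}
```

Evicted values become eligible for garbage collection as soon as no caller holds them, so the bound gives a predictable upper limit on how many heavy objects stay in memory at once.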

Stefan K.
  • Thanks for the comment (+1). With respect to 1: I now use Spring's cache abstraction with ehcache as second-level cache, and doing so I managed to achieve the behaviour I tried to get using soft references. With respect to 2: from my observation, detaching an entity only makes it available for GC once the transaction has finished. As I couldn't find any documentation for this behaviour, I'd rather not rely on it.... – Ueli Hofstetter Aug 03 '15 at 11:32