I added paging to a Criteria API query and noticed that memory usage keeps increasing while the query runs. I also used a Spring Data JPA query method to return the same result, and there the memory is released after every batch is processed. I tried detaching, flushing, and clearing objects from the EntityManager, but memory usage keeps going up; it occasionally drops, but not as much as with the query method. What could cause this memory usage if the objects are detached, and how can I deal with it?

Memory usage with Criteria API pageable: [image]

Memory usage with method query: [image]

Code

Since I'm also updating the entities retrieved from the DB, I save the ID of the last processed entity and query from there, so that updating an entity doesn't cause the query to skip part of the next page. The code below is not from the real app I'm working on; it is just a recreation of the issue I'm having.

Repository code:

@Override
public Slice<Player> getPlayers(int lastId, Pageable pageable) {
    List<Predicate> predicates = new ArrayList<>();

    CriteriaBuilder criteriaBuilder = entityManager.getCriteriaBuilder();
    CriteriaQuery<Player> criteriaQuery = criteriaBuilder.createQuery(Player.class);
    Root<Player> root = criteriaQuery.from(Player.class);

    predicates.add(criteriaBuilder.greaterThan(root.get("id"), lastId));

    criteriaQuery.where(criteriaBuilder.and(predicates.toArray(Predicate[]::new)));
    criteriaQuery.orderBy(criteriaBuilder.asc(root.get("id")));

    var query = entityManager.createQuery(criteriaQuery);

    if (pageable.isPaged()) {
        int pageSize = pageable.getPageSize();
        // Page numbers start at 0, so the first page has offset 0.
        int offset = pageable.getPageNumber() * pageSize;

        // Fetch one extra element to determine hasNext; it is cut off again below.
        query.setMaxResults(pageSize + 1);
        query.setFirstResult(offset);

        var resultList = query.getResultList();

        boolean hasNext = resultList.size() > pageSize;
        return new SliceImpl<>(hasNext ? resultList.subList(0, pageSize) : resultList, pageable, hasNext);
    } else {
        return new SliceImpl<>(query.getResultList(), pageable, false);
    }
}

Iterating through pageables:

@Override
public Slice<Player> getAllPlayersPageable() {
    int lastId = 0;
    boolean hasNext = false;
    Pageable pageable = PageRequest.of(0, 200);
    do {
        var players = playerCriteriaRepository.getPlayers(lastId, pageable);

        if(!players.isEmpty()){
            lastId = players.getContent().get(players.getContent().size() - 1).getId();

            for(var player : players){
                System.out.println(player.getFirstName());
                entityManager.detach(player);
            }
        }
        hasNext = players.hasNext();
    } while (hasNext);
    return null;
}
Gilgalad
  • Can you attach a profiler and let it run long enough that whatever causes the memory leak takes up a large percentage of the memory? After that the profiler should be able to point out the objects filling up that memory and also the reference chain preventing them from getting GCed. That should help identify the root cause. – Jens Schauder Aug 30 '21 at 10:18
  • Ran on 52m entities and it says primary suspect is: One instance of "org.hibernate.internal.SessionFactoryImpl" loaded by "jdk.internal.loader.ClassLoaders$AppClassLoader @ 0x68139d970" occupies 29,75 MB (60,38%). Though I couldn't trace it to any known class I had in that code. The last thing I noticed inside the tree that is using all that memory is: org.hibernate.internal.util.collections.BoundedConcurrentHashMap$Segment[32] @ 0x682175aa0 – Gilgalad Aug 31 '21 at 10:48
  • Can you find out, what is stored in that BoundedConcurrentHashMap? – Jens Schauder Aug 31 '21 at 11:43

1 Answer

I think you are running into a query plan cache issue here, related to how numeric values are handled when you use the JPA Criteria API. Hibernate renders all numeric values as literals into an intermediate HQL query string, which is then compiled. Because the lastId value changes on every call, each "scroll" to the next page produces a new query string, so you gradually fill up the query plan cache.
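To make the mechanism concrete, here is a toy model of it in plain Java. It is not Hibernate code: PlanCacheSketch, PlanCache, render and cachedPlansAfterPaging are hypothetical names, the cache is just an LRU map keyed by the query string, and the bound of 2048 entries is an arbitrary value chosen for the sketch. It only shows that a query text with the id inlined creates one cache entry per page, while a query text with a parameter placeholder creates a single entry.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PlanCacheSketch {

    /** Toy stand-in for a bounded query plan cache: an LRU map keyed by query string. */
    static final class PlanCache extends LinkedHashMap<String, Object> {
        private final int maxSize;

        PlanCache(int maxSize) {
            super(16, 0.75f, true); // access order, so the eldest entry is the least recently used
            this.maxSize = maxSize;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<String, Object> eldest) {
            return size() > maxSize;
        }
    }

    /** Renders the query text with the id either inlined as a literal or as a named parameter. */
    static String render(int lastId, boolean bindParameter) {
        String value = bindParameter ? ":lastId" : Integer.toString(lastId);
        return "select p from Player p where p.id > " + value + " order by p.id asc";
    }

    /** Simulates keyset paging and returns how many "plans" are cached afterwards. */
    static int cachedPlansAfterPaging(int pages, int pageSize, boolean bindParameter, int cacheSize) {
        PlanCache cache = new PlanCache(cacheSize);
        int lastId = 0;
        for (int page = 0; page < pages; page++) {
            String queryText = render(lastId, bindParameter);
            if (!cache.containsKey(queryText)) {
                cache.put(queryText, new Object()); // placeholder for a compiled plan
            }
            lastId += pageSize; // the next page starts after the last id seen
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println("inlined literal: " + cachedPlansAfterPaging(1000, 200, false, 2048));
        System.out.println("bound parameter: " + cachedPlansAfterPaging(1000, 200, true, 2048));
    }
}
```

In the real Criteria API, one way to get the second behavior is to create the value with criteriaBuilder.parameter(Integer.class) and bind it on the query with setParameter instead of passing lastId directly to greaterThan. Hibernate 5 also has a hibernate.criteria.literal_handling_mode setting that can be set to bind. Whether either of these removes the growth in your case is something to confirm with the profiler.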

One possible solution is to use a library like Blaze-Persistence which has a custom JPA Criteria API implementation and a Spring Data integration that will avoid these issues and at the same time improve the performance of your queries due to a better pagination implementation.

All your code would stay the same, you just have to include the integration and configure it as documented in the setup section.

Christian Beikov