How can I avoid the Warning "firstResult/maxResults specified with collection fetch; applying in memory!" when using Hibernate?

Question

I'm getting a warning in the Server log "firstResult/maxResults specified with collection fetch; applying in memory!". However everything working fine. But I don't want this warning.

My code is

public employee find(int id) {
    return (employee) getEntityManager().createQuery(QUERY).setParameter("id", id).getSingleResult();
}

My query is

QUERY = "from employee as emp left join fetch emp.salary left join fetch emp.department where emp.id = :id"

This is the code and query in my application. I don't set maxResult manually. — Gnik, Jul 11 '12 at 13:17
This means that Hibernate is fetching everything and then **trying to apply the first/max result restrictions in memory**. This may seem undesirable to you. Here's what the JPA spec has to say about this kind of interaction: "The effect of applying `setMaxResults` or `setFirstResult` to a query involving fetch joins over collections is undefined. ([JPA "Enterprise JavaBeans 3.0, Final Release"](http://download.oracle.com/otndocs/jcp/ejb-3_0-fr-eval-oth-JSpec/), Kapitel 3.6.1 Query Interface)" — naXa stands with Ukraine, Sep 14 '17 at 10:49
@Prince how you finally solve the problem? mind sharing with me, because I have the exact isisue — msmaromi, Jun 18 '20 at 07:52

Vlad Mihalcea · Answer 1 · 2021-01-09T09:28:46.587

Although you are getting valid results, the SQL query fetches all data and it's not as efficient as it should.

So, you have two options.

Fixing the issue with two SQL queries that can fetch entities in read-write mode

The easiest way to fix this issue is to execute two queries:

. The first query will fetch the root entity identifiers matching the provided filtering criteria. . The second query will use the previously extracted root entity identifiers to fetch the parent and the child entities.

This approach is very easy to implement and looks as follows:

List<Long> postIds = entityManager
.createQuery(
    "select p.id " +
    "from Post p " +
    "where p.title like :titlePattern " +
    "order by p.createdOn", Long.class)
.setParameter(
    "titlePattern",
    "High-Performance Java Persistence %"
)
.setMaxResults(5)
.getResultList();
 
List<Post> posts = entityManager
.createQuery(
    "select distinct p " +
    "from Post p " +
    "left join fetch p.comments " +
    "where p.id in (:postIds) "  +
    "order by p.createdOn", Post.class)
.setParameter("postIds", postIds)
.setHint(
    "hibernate.query.passDistinctThrough", 
    false
)
.getResultList();

Fixing the issue with one SQL query that can only fetch entities in read-only mode

The second approach is to use SDENSE_RANK over the result set of parent and child entities that match our filtering criteria and restrict the output for the first N post entries only.

The SQL query can look as follows:

@NamedNativeQuery(
    name = "PostWithCommentByRank",
    query =
        "SELECT * " +
        "FROM (   " +
        "    SELECT *, dense_rank() OVER (ORDER BY \"p.created_on\", \"p.id\") rank " +
        "    FROM (   " +
        "        SELECT p.id AS \"p.id\", " +
        "               p.created_on AS \"p.created_on\", " +
        "               p.title AS \"p.title\", " +
        "               pc.id as \"pc.id\", " +
        "               pc.created_on AS \"pc.created_on\", " +
        "               pc.review AS \"pc.review\", " +
        "               pc.post_id AS \"pc.post_id\" " +
        "        FROM post p  " +
        "        LEFT JOIN post_comment pc ON p.id = pc.post_id " +
        "        WHERE p.title LIKE :titlePattern " +
        "        ORDER BY p.created_on " +
        "    ) p_pc " +
        ") p_pc_r " +
        "WHERE p_pc_r.rank <= :rank ",
    resultSetMapping = "PostWithCommentByRankMapping"
)
@SqlResultSetMapping(
    name = "PostWithCommentByRankMapping",
    entities = {
        @EntityResult(
            entityClass = Post.class,
            fields = {
                @FieldResult(name = "id", column = "p.id"),
                @FieldResult(name = "createdOn", column = "p.created_on"),
                @FieldResult(name = "title", column = "p.title"),
            }
        ),
        @EntityResult(
            entityClass = PostComment.class,
            fields = {
                @FieldResult(name = "id", column = "pc.id"),
                @FieldResult(name = "createdOn", column = "pc.created_on"),
                @FieldResult(name = "review", column = "pc.review"),
                @FieldResult(name = "post", column = "pc.post_id"),
            }
        )
    }
)

The @NamedNativeQuery fetches all Post entities matching the provided title along with their associated PostComment child entities. The DENSE_RANK Window Function is used to assign the rank for each Post and PostComment joined record so that we can later filter just the amount of Post records we are interested in fetching.

The SqlResultSetMapping provides the mapping between the SQL-level column aliases and the JPA entity properties that need to be populated.

Now, we can execute the PostWithCommentByRank @NamedNativeQuery like this:

List<Post> posts = entityManager
.createNamedQuery("PostWithCommentByRank")
.setParameter(
    "titlePattern",
    "High-Performance Java Persistence %"
)
.setParameter(
    "rank",
    5
)
.unwrap(NativeQuery.class)
.setResultTransformer(
    new DistinctPostResultTransformer(entityManager)
)
.getResultList();

Now, by default, a native SQL query like the PostWithCommentByRank one would fetch the Post and the PostComment in the same JDBC row, so we will end up with an Object[] containing both entities.

However, we want to transform the tabular Object[] array into a tree of parent-child entities, and for this reason, we need to use the Hibernate ResultTransformer.

The DistinctPostResultTransformer looks as follows:

public class DistinctPostResultTransformer
        extends BasicTransformerAdapter {
 
    private final EntityManager entityManager;
 
    public DistinctPostResultTransformer(
            EntityManager entityManager) {
        this.entityManager = entityManager;
    }
 
    @Override
    public List transformList(
            List list) {
             
        Map<Serializable, Identifiable> identifiableMap =
            new LinkedHashMap<>(list.size());
             
        for (Object entityArray : list) {
            if (Object[].class.isAssignableFrom(entityArray.getClass())) {
                Post post = null;
                PostComment comment = null;
 
                Object[] tuples = (Object[]) entityArray;
 
                for (Object tuple : tuples) {
                    if(tuple instanceof Identifiable) {
                        entityManager.detach(tuple);
 
                        if (tuple instanceof Post) {
                            post = (Post) tuple;
                        }
                        else if (tuple instanceof PostComment) {
                            comment = (PostComment) tuple;
                        }
                        else {
                            throw new UnsupportedOperationException(
                                "Tuple " + tuple.getClass() + " is not supported!"
                            );
                        }
                    }
                }
 
                if (post != null) {
                    if (!identifiableMap.containsKey(post.getId())) {
                        identifiableMap.put(post.getId(), post);
                        post.setComments(new ArrayList<>());
                    }
                    if (comment != null) {
                        post.addComment(comment);
                    }
                }
            }
        }
        return new ArrayList<>(identifiableMap.values());
    }
}

The DistinctPostResultTransformer must detach the entities being fetched because we are overwriting the child collection and we don’t want that to be propagated as an entity state transition:

post.setComments(new ArrayList<>());

Isn't the sorting missing in second query (the one loading the "Posts")? Can we rely on that the database returns the Posts in the same order the ids are given? I think one can simply add `order by p.createdOn` to the second query as well... — Udo, Jan 27 '20 at 09:19
The posts are already fetched in the right order provided by the first query, and you can use that `List` reference to retain the order. The second query just replaces the uninitialized child association with a non-proxy `List`. — Vlad Mihalcea, Jan 27 '20 at 09:29
Maybe it depends on the database? In Postgres I had to add the ORDER BY clause to the second query as well. Also https://stackoverflow.com/q/866465/716834 seems to confirm this... — Udo, Jan 28 '20 at 10:09
Does anyone know how to make this work with a custom predicate? I am trying to build a search endpoint. But it seems as though I can't combine custom predicate, page request, and native SQL all at once... — wmakley, May 11 '20 at 13:56

score 21 · Accepted Answer · answered Jul 11 '12 at 11:45

21

Reason for this warning is that when fetch join is used, order in result sets is defined only by ID of selected entity (and not by join fetched).

If this sorting in memory is causing problems, do not use firsResult/maxResults with JOIN FETCH.

answered Jul 11 '12 at 11:45

Mikko Maunu

41,366
10
132
135

score 21 · Answer 3 · answered Aug 08 '12 at 12:45

21

To avoid this WARNING you have to change the call getSingleResult to getResultList().get(0)

answered Aug 08 '12 at 12:45

Lucas

291
2
8

2

Though the votes might already indicate this: this is indeed true. Even for queries that do not use any pagination such as `setFirstResult(offset)` or `setMaxResults(limit)`, Hibernate might still complain about that when using `getSingleResult`. (I assume Hibernate is limiting internally.) – Arjan Jan 15 '13 at 09:50
6

A notable difference is that this will not throw an exception when a query that you expected to return 1 result actually returned multiple results. I'd add some kind of assertion. – Arnout Engelen May 19 '14 at 09:21
1

I'm getting this warning with JOIN FETCH despite already using getResultList(). So this can't be the *only* potential solution to this problem. – Alkanshel Feb 27 '17 at 20:15

score 17 · Answer 4 · answered Apr 25 '20 at 08:24

This warning tells you Hibernate is performing in memory java pagination. This can cause high JVM memory consumption. Since a developer can miss this warning, I contributed to Hibernate by adding a flag allowing to throw an exception instead of logging the warning (https://hibernate.atlassian.net/browse/HHH-9965).

The flag is hibernate.query.fail_on_pagination_over_collection_fetch.

I recommend everyone to enable it.

The flag is defined in org.hibernate.cfg.AvailableSettings :

    /**
     * Raises an exception when in-memory pagination over collection fetch is about to be performed.
     * Disabled by default. Set to true to enable.
     *
     * @since 5.2.13
     */
    String FAIL_ON_PAGINATION_OVER_COLLECTION_FETCH = "hibernate.query.fail_on_pagination_over_collection_fetch";

score 1 · Answer 5 · answered Dec 19 '13 at 11:42

1

the problem is you will get cartesian product doing JOIN. The offset will cut your recordset without looking if you are still on same root identity class

answered Dec 19 '13 at 11:42

Benjamin Fuentes

674
9
22

score 0 · Answer 6 · answered Jun 12 '14 at 00:53

I guess the emp has many departments which is a One to Many relationship. Hibernate will fetch many rows for this query with fetched department records. So the order of result set can not be decided until it has really fetch the results to the memory. So the pagination will be done in memory.

If you do not want to fetch the departments with emp, but still want to do some query based on the department, you can achieve the result with out warning (without doing ordering in the memory). For that simply you have to remove the "fetch" clause. So something like as follows:

QUERY = "from employee as emp left join emp.salary sal left join emp.department dep where emp.id = :id and dep.name = 'testing' and sal.salary > 5000 "

score 0 · Answer 7 · answered Jan 18 '22 at 03:30

As others pointed out, you should generally avoid using "JOIN FETCH" and firstResult/maxResults together.
If your query requires it, you can use .stream() to eliminate warning and avoid potential OOM exception.

try (Stream<ENTITY> stream = em.createQuery(QUERY).stream()) {
    ENTITY first = stream.findFirst().orElse(null); // equivalents .getSingleResult()
}

// Stream returned is an IO stream that needs to be closed manually.

How can I avoid the Warning "firstResult/maxResults specified with collection fetch; applying in memory!" when using Hibernate?

My code is

My query is

7 Answers7

Fixing the issue with two SQL queries that can fetch entities in read-write mode

Fixing the issue with one SQL query that can only fetch entities in read-only mode

Linked

Related