The reason is slow is because Hibernate executes the SQL query with no pagination at all and the restriction is done in memory.
However, if the join has to scan and fetch 100k records, while you are interested in just 100 results, then 99.9% of the work being done by the Extractor and all the I/O done over networking is just waste.
You can easily turn a JPQL query that uses both JOIN FETCH
and pagination:
List<Post> posts = entityManager.createQuery("""
select p
from Post p
left join fetch p.comments
where p.title like :title
order by p.id
""", Post.class)
.setParameter("title", titlePattern)
.setMaxResults(maxResults)
.getResultList();
into an SQL query that limits the result using DENSE_RANK
by the parent identifier:
@NamedNativeQuery(
name = "PostWithCommentByRank",
query =
"SELECT * " +
"FROM ( " +
" SELECT *, dense_rank() OVER (ORDER BY \"p.created_on\", \"p.id\") rank " +
" FROM ( " +
" SELECT p.id AS \"p.id\", " +
" p.created_on AS \"p.created_on\", " +
" p.title AS \"p.title\", " +
" pc.id as \"pc.id\", " +
" pc.created_on AS \"pc.created_on\", " +
" pc.review AS \"pc.review\", " +
" pc.post_id AS \"pc.post_id\" " +
" FROM post p " +
" LEFT JOIN post_comment pc ON p.id = pc.post_id " +
" WHERE p.title LIKE :titlePattern " +
" ORDER BY p.created_on " +
" ) p_pc " +
") p_pc_r " +
"WHERE p_pc_r.rank <= :rank ",
resultSetMapping = "PostWithCommentByRankMapping"
)
@SqlResultSetMapping(
name = "PostWithCommentByRankMapping",
entities = {
@EntityResult(
entityClass = Post.class,
fields = {
@FieldResult(name = "id", column = "p.id"),
@FieldResult(name = "createdOn", column = "p.created_on"),
@FieldResult(name = "title", column = "p.title"),
}
),
@EntityResult(
entityClass = PostComment.class,
fields = {
@FieldResult(name = "id", column = "pc.id"),
@FieldResult(name = "createdOn", column = "pc.created_on"),
@FieldResult(name = "review", column = "pc.review"),
@FieldResult(name = "post", column = "pc.post_id"),
}
)
}
)
The query can be executed like this:
List<Post> posts = entityManager
.createNamedQuery("PostWithCommentByRank")
.setParameter(
"titlePattern",
"High-Performance Java Persistence %"
)
.setParameter(
"rank",
5
)
.unwrap(NativeQuery.class)
.setResultTransformer(
new DistinctPostResultTransformer(entityManager)
)
.getResultList();
To transform the tabular result set back into an entity graph, you need a ResultTransformer
which looks as follows:
public class DistinctPostResultTransformer
extends BasicTransformerAdapter {
private final EntityManager entityManager;
public DistinctPostResultTransformer(
EntityManager entityManager) {
this.entityManager = entityManager;
}
@Override
public List transformList(
List list) {
Map<Serializable, Identifiable> identifiableMap =
new LinkedHashMap<>(list.size());
for (Object entityArray : list) {
if (Object[].class.isAssignableFrom(entityArray.getClass())) {
Post post = null;
PostComment comment = null;
Object[] tuples = (Object[]) entityArray;
for (Object tuple : tuples) {
if(tuple instanceof Identifiable) {
entityManager.detach(tuple);
if (tuple instanceof Post) {
post = (Post) tuple;
}
else if (tuple instanceof PostComment) {
comment = (PostComment) tuple;
}
else {
throw new UnsupportedOperationException(
"Tuple " + tuple.getClass() + " is not supported!"
);
}
}
}
if (post != null) {
if (!identifiableMap.containsKey(post.getId())) {
identifiableMap.put(post.getId(), post);
post.setComments(new ArrayList<>());
}
if (comment != null) {
post.addComment(comment);
}
}
}
}
return new ArrayList<>(identifiableMap.values());
}
}
That's it!