
I have a production project that uses a fairly old version of Ebean ORM (it came from Play Framework). Our team decided to look into migrating to newer tools. Our code has a lot of ORM models, and huge entity graphs are quite common (up to 20 OneToMany relations at one "nesting level", each nested up to 3 levels deep, which is A LOT of relations that should be fetched eagerly to avoid N+1 problems). Our current framework lets us write pretty neat code to fetch OneToMany relations. A hypothetical example:

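import java.util.List;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.OneToMany;

@Entity
public class A {
   @Id
   private Long id;

   @OneToMany
   private List<B> bs;

   @OneToMany
   private List<C> cs;
}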

Query code:

Ebean.find(A.class)
     .fetch("bs", new FetchConfig().query())
     .fetch("cs", new FetchConfig().query())
     ... etc

That code produces 3 database queries: one for class A and one for each of the two relations. Ebean then combines the results of those queries automatically.
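
To make the mechanics concrete, here is a minimal plain-Java sketch of the "combine" step, with in-memory lists standing in for the results of the three queries. It is not Ebean's implementation; the class `FetchStitchSketch`, the `Child` type, the method `groupByParent` and the SQL in the comments are hypothetical and used for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not Ebean code: in-memory lists stand in for query results.
class FetchStitchSketch {

    // One row of a child query: the child's own id plus the foreign key to A.
    static final class Child {
        final long id;
        final long parentId;

        Child(long id, long parentId) {
            this.id = id;
            this.parentId = parentId;
        }
    }

    // Groups child rows by parent id; every loaded parent gets a list, possibly empty.
    static Map<Long, List<Long>> groupByParent(List<Long> parentIds, List<Child> rows) {
        Map<Long, List<Long>> byParent = new LinkedHashMap<>();
        for (Long parentId : parentIds) {
            byParent.put(parentId, new ArrayList<>());
        }
        for (Child row : rows) {
            List<Long> children = byParent.get(row.parentId);
            if (children != null) { // ignore rows whose parent was not loaded
                children.add(row.id);
            }
        }
        return byParent;
    }

    public static void main(String[] args) {
        // Query 1 (assumed shape): select id from a
        List<Long> aIds = Arrays.asList(1L, 2L);
        // Query 2 (assumed shape): select id, a_id from b where a_id in (1, 2)
        List<Child> bRows = Arrays.asList(new Child(10, 1), new Child(11, 1), new Child(12, 2));
        // Query 3 (assumed shape): select id, a_id from c where a_id in (1, 2)
        List<Child> cRows = Arrays.asList(new Child(20, 1));

        System.out.println("bs by A: " + groupByParent(aIds, bRows));
        System.out.println("cs by A: " + groupByParent(aIds, cRows));
    }
}
```

The point of the sketch is that the number of rows transferred grows with the sum of the collection sizes, because each relation is loaded by its own query and attached to its parent afterwards.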

I tried to reproduce this kind of code in Hibernate ORM using the JPA Criteria API and NamedEntityGraphs, but could not: Hibernate does not seem to like fetching several OneToMany relations at once (it throws something like MultipleBagFetchException). I understand why this exception is raised (Cartesian product), but I cannot find the part of the framework that could split one entity graph into several database queries.

Is this possible in Hibernate? If not, are there any third-party dependencies that can do it? How do Hibernate users deal with big entity graphs?

  • Does this answer your question? [How to use multiple JOIN FETCH in one JPQL query](https://stackoverflow.com/questions/30088649/how-to-use-multiple-join-fetch-in-one-jpql-query) – SternK Feb 23 '21 at 19:01
  • I did find this SO question while trying to solve this problem. Unfortunately, it does not seem convenient to perform N queries manually. I purposely mentioned that we have HUGE entity graphs; managing them manually would be a hassle. I assumed that there is a common solution to this problem, isn't there? If there is no such solution, I will close this question. Thank you for your answer, in any case. – CrimsonAndRed Feb 23 '21 at 19:10
  • You can simply replace `List` type to `Set` type, and it will work, but it will lead to the **Cartesian Product problem**. See [this question](https://stackoverflow.com/questions/4334970/hibernate-throws-multiplebagfetchexception-cannot-simultaneously-fetch-multipl). – SternK Feb 23 '21 at 22:59
  • I understand it. I also mentioned that in my question. The question was about any alternative to Ebean functionality, which makes more than one query to eagerly fetch all needed data. Making `Set` instead of `List` would not make several database queries; whereas Ebean would. – CrimsonAndRed Feb 24 '21 at 06:52
  • This is a fundamental limitation of JPQL that is one of the key reasons Ebean was created in the first place. Not only does Hibernate tend to generate a Cartesian product, but it also does not honor maxRows in SQL, so pagination then occurs on the client. Both of these are design limitations of JPQL. JPA (and Hibernate) only started to address this limitation with the introduction of FetchGroup, and that is somewhat close to what we have with Ebean, but Ebean gives us more control (fetch, fetch query, fetch cache, fetch lazy). Note: I'm the creator of Ebean. – Rob Bygrave Feb 24 '21 at 22:08
  • Regarding Set vs List. Hibernate desires Set over List to get its preferred behavior (bag semantics). With Ebean we can use Set or List equally, but using List is the recommendation, as then we don't implicitly use equals() and hashCode() implementations. There is a bit more to this, but in short, Hibernate prefers Set, and that isn't related to Ebean's built-in determination of "ToMany" paths and control for building complex object graphs. – Rob Bygrave Feb 24 '21 at 23:55
  • Last time I checked the FetchGroup support of Eclipselink and Hibernate, Eclipselink did a decent job of getting close to Ebean. Hibernate didn't support "partial objects" (and still doesn't), so its support of FetchGroup fell short. I didn't test DataNucleus. I'd be keen to hear how you go if you try the latest Hibernate + FetchGroup on your complex graphs. – Rob Bygrave Feb 25 '21 at 00:03
  • Thank you very much for your comments! I personally thought that since Hibernate is "mature and powerful", that task was definitely solved. Sooner or later I will try out FetchGroup with Hibernate/Eclipselink/DataNucleus, but for now it seems that upgrading to the current version of Ebean is our option, since we use this kind of query everywhere. It may be worth combining the comments into a regular answer. – CrimsonAndRed Feb 25 '21 at 11:35

2 Answers


Firstly, it is a fundamental limitation of JPQL that it doesn't truly support creating queries to build complex graphs [JPQL FETCH JOIN does not cut it, and Hibernate makes a meal out of this by generating a SQL Cartesian product etc]. This is one of the fundamental reasons why Ebean exists.

JPA added FetchGroup later and that takes you much closer to the capabilities of Ebean ORM's query language. You will need to try using FetchGroup with the JPQL query to see how close you get for your use cases.

Specific issues you can hit with Hibernate include:

  • Generating SQL cartesian product when 2 ToMany paths are fetched
  • Not honoring maxRows in SQL but instead performing client side pagination (so we no longer get the DB optimizing the query for max rows)
  • No equivalent support for large queries - Ebean's findEach() that manages the number of beans held in persistence context
  • No filterMany expression support (predicates on a ToMany path rather than root)
  • No partial object support (need to convert over to DTO queries instead)
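
The maxRows point can be illustrated without any ORM. The following is a minimal plain-Java sketch, not Hibernate or Ebean code: the names `JoinPaginationSketch`, `joinRows` and `parentsWithinRowLimit` are hypothetical, and the joined result is simulated in memory. It shows that once a ToMany path is joined, a row limit counts joined rows rather than root entities, which is presumably why such a limit cannot simply be pushed into the SQL of a join fetch.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration: rows of "a JOIN b" are modelled as {aId, bId} pairs.
class JoinPaginationSketch {

    // Builds the joined result: one row per (a, b) pair, ordered by a.
    static List<long[]> joinRows(long[] aIds, int childrenPerParent) {
        List<long[]> rows = new ArrayList<>();
        for (long aId : aIds) {
            for (int i = 0; i < childrenPerParent; i++) {
                rows.add(new long[] {aId, aId * 100 + i});
            }
        }
        return rows;
    }

    // Applies a SQL-style row limit and returns the distinct parents that survive it.
    static Set<Long> parentsWithinRowLimit(List<long[]> rows, int maxRows) {
        Set<Long> parents = new LinkedHashSet<>();
        for (long[] row : rows.subList(0, Math.min(maxRows, rows.size()))) {
            parents.add(row[0]);
        }
        return parents;
    }

    public static void main(String[] args) {
        // 3 parents with 4 children each give 12 joined rows.
        List<long[]> rows = joinRows(new long[] {1, 2, 3}, 4);
        // A row limit of 2 does not yield "the first 2 parents": it only reaches
        // parent 1, and that parent arrives with 2 of its 4 children.
        System.out.println(parentsWithinRowLimit(rows, 2));
    }
}
```

Loading each ToMany path with its own query avoids this, because the limit can then be applied to the root query alone.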

Extra notes:

List vs Set: This is a Hibernate-specific implementation design where Hibernate gives Set "bag semantics" (better SQL implementation). With Ebean we can use Set or List equally, and we recommend List because it avoids issues related to equals/hashCode on mutating beans. De-duplication when converting relations into objects is the job of the persistence context and applies equally to List and Set with Ebean.

Ebean has a different architecture wrt dirty values meaning entity bean queries are pretty close to the cost of DTO queries. Hibernate doesn't yet support partial objects and has much higher costs for storing "old values" which means Hibernate folks promote the use of DTO queries for performance reasons. We don't have the same need with Ebean due to our architectural approach (where Ebean stores old values).

LazyInitializationException

This is another Hibernate-specific behaviour. Ebean users don't need to deal with it at all. Additionally, Ebean doesn't produce N+1, and it has query.setDisableLazyLoading(true) if we want to stop lazy loading being invoked by mapping code. These are 3 things you'll need to deal with if you use Hibernate.

Hibernate is "mature and powerful"

Yes, but it currently has a different view of what ORM means, and maybe always will. This applies specifically to support for partial objects and complex queries, but you could also include SQL:2011 history support and soft delete support.

Ebean has been open source since 2006 (so 15 years and counting). You can also compare Ebean GitHub issues to Hibernate JIRA issues. There are a number of different ways to view "mature" etc. As I see it, for Hibernate to get to where Ebean is at with respect to partial objects and complex queries, they have some work to do.

Rob Bygrave

In my experience (I have mostly worked on web apps where users can't digest large amounts of data), big entity graphs are rather rare, but most of the time you can configure a proper batch size or use @Fetch(SUBSELECT) to improve performance when selecting multiple collections. The problem with List vs Set is that a list (mapped as a bag) allows duplicates and is unordered, i.e. you can't differentiate between the first and the second duplicate. When you join fetch a bag and then join fetch another bag, the JDBC result set contains every combination of rows from the two bags, so you can no longer tell the objects apart, which could lead to wrong cardinalities. To solve that, you can either use a Set to ensure there can be no duplicates or define an index column with @OrderColumn, which makes it possible to tell the duplicates apart.
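
The cardinality problem can be reproduced with a few lines of plain Java. This is only a sketch under simplifying assumptions, not Hibernate's actual hydration logic: the names `BagDuplicatesSketch`, `joinRows`, `hydrateAsBag` and `hydrateAsSet` are hypothetical, and the joined result set for a single parent is simulated as pairs of ids.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration: one parent A joined to both of its collections at once.
class BagDuplicatesSketch {

    // Rows of "a JOIN b JOIN c" for a single A: every b is paired with every c.
    static List<long[]> joinRows(long[] bIds, long[] cIds) {
        List<long[]> rows = new ArrayList<>();
        for (long b : bIds) {
            for (long c : cIds) {
                rows.add(new long[] {b, c});
            }
        }
        return rows;
    }

    // Naive bag hydration: append the b id of every row, so repeats are kept.
    static List<Long> hydrateAsBag(List<long[]> rows) {
        List<Long> bs = new ArrayList<>();
        for (long[] row : rows) {
            bs.add(row[0]);
        }
        return bs;
    }

    // Set hydration: repeats of the same b id collapse into one element.
    static Set<Long> hydrateAsSet(List<long[]> rows) {
        return new LinkedHashSet<>(hydrateAsBag(rows));
    }

    public static void main(String[] args) {
        List<long[]> rows = joinRows(new long[] {10, 11}, new long[] {20, 21, 22});
        System.out.println(rows.size());        // number of joined rows: 2 x 3
        System.out.println(hydrateAsBag(rows)); // each b repeated once per c
        System.out.println(hydrateAsSet(rows)); // only the distinct bs remain
    }
}
```

With a bag there is no way to tell whether a repeated id is a genuine duplicate or a side effect of the join, whereas a Set (or an index column) removes that ambiguity. The number of joined rows still grows with the product of the collection sizes either way.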

Apart from all this, I think this is a perfect use case for Blaze-Persistence Entity Views and its MULTISET fetch strategy which is like a hybrid of join fetching and subselect fetching that is very efficient.

I created the library to allow easy mapping between JPA models and custom interface or abstract class defined models, something like Spring Data Projections on steroids. The idea is that you define your target structure (domain model) the way you like and map attributes (getters) via JPQL expressions to the entity model.

A DTO model for your use case could look like the following with Blaze-Persistence Entity-Views:

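import static com.blazebit.persistence.view.FetchStrategy.MULTISET;

import java.util.List;

import com.blazebit.persistence.view.EntityView;
import com.blazebit.persistence.view.IdMapping;
import com.blazebit.persistence.view.Mapping;

@EntityView(A.class)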
public interface ADto {
    @IdMapping
    Long getId();
    String getName();
    @Mapping(fetch = MULTISET)
    List<BDto> getBs();
    @Mapping(fetch = MULTISET)
    List<CDto> getCs();

    @EntityView(B.class)
    interface BDto {
        @IdMapping
        Long getId();
        String getName();
    }
    @EntityView(C.class)
    interface CDto {
        @IdMapping
        Long getId();
        String getName();
    }
}

Querying is a matter of applying the entity view to a query, the simplest being just a query by id.

ADto a = entityViewManager.find(entityManager, ADto.class, id);

The Spring Data integration allows you to use it almost like Spring Data Projections: https://persistence.blazebit.com/documentation/entity-view/manual/en_US/index.html#spring-data-features

Page<ADto> findAll(Pageable pageable);

The best part is, it will only fetch the state that is actually necessary!

Christian Beikov
  • Thank you for your answer. A few questions: 1) Is it implied that for each entity graph I am required to build a corresponding DTO graph? If I did not want to fetch "bs", but only "cs", could I reuse existing DTO classes? 2) Are multiselect queries humanly readable? If the whole graph (especially a quite deep one) is contained in a single query, that does not sound readable. 3) Are big entity graphs that rare? I could not imagine our product without them. It may be a design problem, but I thought it was quite usual to have at least 2 OneToMany relations. – CrimsonAndRed Feb 25 '21 at 11:00
  • 1) Since Blaze-Persistence Entity-Views allow the use of interfaces, which support multiple inheritance, you can get quite good reuse, but yes, the idea is that for every use case you define the graph of data that you need through a Java type. Believe me, the fact that methods aren't there in those DTOs will save you from LazyInitializationExceptions. 2) They are readable. The SQL for the fetch just moves from the main query to a subquery. 3) "Big" is subjective, but ask yourself how much data you can expect a user to digest at once. Mostly, you don't need all the data at once. – Christian Beikov Feb 25 '21 at 11:08