9

I have problems with full loading of very complex object from DB in a reasonable time and with reasonable count of queries.

My object has a lot of embedded entities, each entity has references to another entities, another entities references yet another and so on (So, the nesting level is 6)

So, I've created example to demonstrate what I want: https://github.com/gladorange/hibernate-lazy-loading

I have User.

User has @OneToMany collections of favorite Oranges,Apples,Grapevines and Peaches. Each Grapevine has @OneToMany collection of Grapes. Each fruit is another entity with just one String field.

I'm creating user with 30 favorite fruits of each type and each grapevine has 10 grapes. So, totally I have 421 entity in DB - 30*4 fruits, 100*30 grapes and one user.

And what I want: I want to load them using no more than 6 SQL queries. And each query shouldn't produce big result set (big is a result set with more that 200 records for that example).

My ideal solution will be the following:

  • 6 requests. First request returns information about user and size of result set is 1.

  • Second request return information about Apples for this user and size of result set is 30.

  • Third, Fourth and Fifth requests returns the same, as second (with result set size = 30) but for Grapevines, Oranges and Peaches.

  • Sixth request returns Grape for ALL grapevines

This is very simple in SQL world, but I can't achieve such with JPA (Hibernate).

I tried following approaches:

  1. Use fetch join, like from User u join fetch u.oranges .... This is awful. The result set is 30*30*30*30 and execution time is 10 seconds. Number of requests = 3. I tried it without grapes, with grapes you will get x10 size of result set.

  2. Just use lazy loading. This is the best result in this example (with @Fetch= SUBSELECT for grapes). But in that case that I need to manually iterate over each collection of elements. Also, subselect fetch is too global setting, so I would like to have something which could work on query level. Result set and time near ideal. 6 queries and 43 ms.

  3. Loading with entity graph. The same as fetch join but it also make request for every grape to get it grapevine. However, result time is better (6 seconds), but still awful. Number of requests > 30.

  4. I tried to cheat JPA with "manual" loading of entities in separate query. Like:

    SELECT u FROM User where id=1;
    SELECT a FROM Apple where a.user_id=1;
    

This is a little bit worse that lazy loading, since it requires two queries for each collection: first query to manual loading of entities (I have full control over this query, including loading associated entities), second query to lazy-load the same entities by Hibernate itself (This is executed automatically by Hibernate)

Execution time is 52, number of queries = 10 (1 for user, 1 for grape, 4*2 for each fruit collection)

Actually, "manual" solution in combination with SUBSELECT fetch allows me to use "simple" fetch joins to load necessary entities in one query (like @OneToOne entities) So I'm going to use it. But I don't like that I have to perform two queries to load collection.

Any suggestions?

Jeffrey Chung
  • 19,319
  • 8
  • 34
  • 54
EvilOrange
  • 876
  • 1
  • 9
  • 17
  • I'd do it manually in 5 queries: first combined with second using an eager `@OneToMany`, then one per the described request 3-6, then assemble an object in Java code. Not too elegant, of course, but no `@OneToMany` multiplication nor lazy loading would be required. – Roman Puchkovskiy Dec 02 '17 at 15:50
  • @RomanPuchkovskiy yeah, it makes sense. But I also need to have ability to save instance after loading and editing. So, it could work, but sometimes I got exceptions like "Not found exception" during saving (since it tries to search entity in old collection) . So, i decided not to use manualy object construction . But I think, that your advice should be good for read-only operations. – EvilOrange Dec 04 '17 at 11:58

3 Answers3

7

I usually cover 99% of such use cases by using batch fetching for both entities and collections. If you process the fetched entities in the same transaction/session in which you read them, then there is nothing additionally that you need to do, just navigate to the associations needed by the processing logic and the generated queries will be very optimal. If you want to return the fetched entities as detached, then you initialize the associations manually:

User user = entityManager.find(User.class, userId);
Hibernate.initialize(user.getOranges());
Hibernate.initialize(user.getApples());
Hibernate.initialize(user.getGrapevines());
Hibernate.initialize(user.getPeaches());
user.getGrapevines().forEach(grapevine -> Hibernate.initialize(grapevine.getGrapes()));

Note that the last command will not actually execute a query for each grapevine, as multiple grapes collections (up to the specified @BatchSize) are initialized when you initialize the first one. You simply iterate all of them to make sure all are initialized.

This technique resembles your manual approach but is more efficient (queries are not repeated for each collection), and is more readable and maintainable in my opinion (you just call Hibernate.initialize instead of manually writing the same query that Hibernate generates automatically).

Dragan Bozanovic
  • 23,102
  • 5
  • 43
  • 110
4

I'm going to suggest yet another option on how to lazily fetch collections of Grapes in Grapevine:

@OneToMany
@BatchSize(size = 30)
private List<Grape> grapes = new ArrayList<>();

Instead of doing a sub-select this one would use in (?, ?, etc) to fetch many collections of Grapes at once. Instead ? Grapevine IDs will be passed. This is opposed to querying 1 List<Grape> collection at a time.

That's just yet another technique to your arsenal.

Stanislav Bashkyrtsev
  • 14,470
  • 7
  • 42
  • 45
  • Use with caution though, if you have loaded 30 `Gravevine`s, but is only interested in the `Grape`s of one of them, loading those `Grape`s will also result in loading the `Grape`s for the other 29 `Grapevine`s.. If the one you want to look at has 3 `Grapes`, and one of the other has 100,000, you will end up loading a lot more than you actually need. – Tobb Dec 02 '17 at 16:18
0

I do not quite understand your demands here. It seems to me you want Hibernate to do something that it's not designed to do, and when it can't, you want a hack-solution that is far from optimal. Why not loosen the restrictions and get something that works? Why do you even have these restrictions in the first place?

Some general pointers:

  1. When using Hibernate/JPA, you do not control the queries. You are not supposed to either (with a few exceptions). How many queries, the order they are executed in, etc, is pretty much beyond your control. If you want complete control of your queries, just skip JPA and use JDBC instead (Spring JDBC for instance.)
  2. Understanding lazy-loading is key to making decisions in these type of situation. Lazy-loaded relations are not fetched when getting the owning entity, instead Hibernate goes back to the database and gets them when they are actually used. Which means that lazy-loading pays off if you don't use the attribute every time, but has a penalty the times you actually use it. (Fetch join is used for eager-fetching a lazy relation. Not really meant for use with regular load from the database.)
  3. Query optimalization using Hibernate should not be your first line of action. Always start with your database. Is it modelled correctly, with primary keys and foreign keys, normal forms, etc? Do you have search indexes on proper places (typically on foreign keys)?
  4. Testing for performance on a very limited dataset probably won't give the best results. There probably will be overhead with connections, etc, that will be larger than the time spent actually running the queries. Also, there might be random hickups that cost a few milliseconds, which will give a result that might be misleading.
  5. Small tip from looking at your code: Never provide setters for collections in entities. If actually invoked within a transaction, Hibernate will throw an exception.
  6. tryManualLoading probably does more than you think. First, it fetches the user (with lazy loading), then it fetches each of the fruits, then it fetches the fruits again through lazy-loading. (Unless Hibernate understands that the queries will be the same as when lazy loading.)
  7. You don't actually have to loop through the entire collection in order to initiate lazy-loading. You can do this user.getOranges().size(), or Hibernate.initialize(user.getOranges()). For the grapevine you would have to iterate to initialize all the grapes though.

With proper database design, and lazy-loading in the correct places, there shouldn't be a need for anything other than:

em.find(User.class, userId);

And then maybe a join fetch query if a lazy load takes a lot of time.

In my experience, the most important factor for speeding up Hibernate is search indexes in the database.

Tobb
  • 11,850
  • 6
  • 52
  • 77