I use JPA (through Spring Data) with a postgress DB. I have a big entity, very central to my system. The entity has multiple associations (OneToMany, ElementCollection etc).
My typical usecase is returing a paginated list of those entities from my API. The API model needs information from the associated fields, so I must somehow fetch them for each entry of the current page.
Depending on the DB query, I use two ways to fetch the original list:
- Typical JPQL query from my spring repository (language is kotlin):
@Query("select c from Company c where c.state = :state and :country member of c.targetCountries")
fun findAllByCountryAndState(country: Country, state: CompanyState, page: Pageable): Page<Company>
- For more complex cases I use Criteria Query by extending
JpaSpecificationExecutor
.
My Question is what is the best way to combine pagination with fetching associated fields? The usecase seems pretty standard to me, but trying to implement this sends me from one problem to the next.
What I have tried so far:
Carefully choose which associated fields are commonly needed in lists and mark them as eagerly fetched. The pagination works fine here, but unfortunately is that I run into N+1 problem. I could see tones of SQL queries being generated, one for each entity and associated field. So I followed the advice here and this leads me to:
I used
@NamedEntityGraph
orjoin fetch
s to fetch the needed associations from the very start. This worked great, but compromised my pagination. I got the following warning:HHH000104: firstResult/maxResults specified with collection fetch
. This seemed not so optimal to me, so I tried to solve this using this solution. This mentions to either generate some massive Native queries for each usecase (a maintenance horror which I cannot afford), or to:use a 2-queries approach. First fetch the ids only (pagination, no join fetch) an then fetch the whole entities (join fetch, no pagination). This was a dead-end for me because
JpaSpecificationExecutor
does not support selecting of ids only). And I often need the flexibility of Criteria API where clauses. So I though of a slight variation of this and got:Fetch the whole entity in the first query without associations, then fetch the associated fields only in a seconadary query. "Glue" them together using the id and deliver the final result. The problem is that fetching the associated field only was not possible. I have to fetch the whole entity a second time, which seems like a big overhead, especially considering that the entity is pretty large.
I also tried the
select new MyFieldsOnlyPlease(c.id, MORE_FIELDS) from Company c join fetch c.SOME_ASSOCIATION
approach, in which case I get the following error:query specified join fetching, but the owner of the fetched association was not present in the select list
.
So my current solution is number 4, which seems clearly sub-optimal. To summarize: what is the best practices / best tradeoffs for pagination + associations? This usecase seems pretty standard to me, so I imagine there should be some good/elegant solution that I cannot see.