1

I see lot of queries being triggered for findAll() call from JpaRepository. I know the join query has to occur in database if we have such a large amount of data. I am trying to retrieve all Post Object using findAll method.

Entities

@Entity
public class Post {
  @Id
  private String postId;
  private String postName;
  @OneToMany(mappedBy = "Post", cascade = CascadeType.ALL)
  private Collection<PostTag> postTags = new HashSet<PostTag>();
}
@Entity
public class Tag {
  @Id
  private String tagId;
  private String tagName;
  @OneToMany(mappedBy = "tag", cascade = CascadeType.ALL)
  @JsonIgnore
  private Collection<PostTag> postTags = new HashSet<PostTag>();
}
@Entity
public class PostTag {
  @EmbeddedId
  private PostTagId postTagId = new PostTagId();
  
  @ManyToOne
  @MapsId("postId")
  @JoinColumn(name = "post_Id")
  @JsonIgnore
  private Post post;
  
  @ManyToOne
  @MapsId("tagId")
  @JoinColumn(name = "tag_Id")
  private Tag tag;
  
  @OneToMany(mappedBy = "posttag", cascade = CascadeType.ALL)
  @JsonIgnore
  private Set<Items> items= new HashSet<Items>();
}
@Embeddable
public class PostTagId implements Serializable {
  private String postId;
  private String tagId;
  //equals & hashcode ommited
}
public class Items{
  @Id
  private String itemId;
  private String itemName;

  @ManyToOne
  @JoinColumns({@JoinColumn(name = "post_id"), @JoinColumn(name = "tag_id")})
  @JsonBackReference
  @JsonIgnore
  private PostTag postTag;

  @OneToMany(mappedBy = "items", cascade = CascadeType.ALL)
  @JsonIgnore
  private Set<SubItems> subItems= new HashSet<SubItems>();

}
public class SubItems{
  @Id
  private String subItemId;
  private String subItemName;

  @ManyToOne
  @JoinColumns({@JoinColumn(name = "itemId")})
  @JsonBackReference
  @JsonIgnore
  private Items items;

}

Hibernate Query log: This is the first query that is triggered from the JPA layer, as I am searching some tag-based filters using criteria.

select Post0_.postId , Post0_.postName  from Post Post0_ inner join 
PostTag posttag1_ on Post0_.postId=posttag1_.post_id inner join 
Tag Tag2_ on posttag1_.tag_id=Tag2_.tagId left outer join 
PostTag posttag3_ on Post0_.postId=posttag3_.post_id 
where (Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ?) 
and (posttag1_.somefield between ? and ?) order by posttag3_.somefield desc, posttag3_.pubDate desc offset 0 rows fetch next ? rows only

For each post record below hibernate query has been triggered,

Hibernate: select count(Post0_.postId) as col_0_0_ from Post Post0_ inner join PostTag posttag1_ on Post0_.postId=posttag1_.post_id inner join Tag Tag2_ on posttag1_.tag_id=Tag2_.tagId where (Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ? or Tag2_.tagName like ?) and (posttag1_.somefield between ? and ?)
Hibernate: select PostTag0_.post_id as post_id1_5_0_, PostTag0_.tag_id as tag_i2_5_0_, PostTag0_.post_id as post_id1_5_1_, PostTag0_.tag_id as tag_i2_5_1_, PostTag0_.content as content3_5_1_, Tag1_.tagId as tagid1_2_2_, Tag1_.tagName as tagna2_2_2_ from PostTag PostTag0_ inner join Tag Tag1_ on PostTag0_.tag_id=Tag1_.tagId where PostTag0_.post_id=?
Hibernate: select items0_.post_id as post_id3_1_0_, items0_.tag_id as tag_i4_1_0_, items0_.itemId as itemi1_1_0_, items0_.itemId as itemi1_1_1_, items0_.itemName as itemn2_1_1_, items0_.post_id as post_id3_1_1_, items0_.tag_id as tag_i4_1_1_ from itemso items0_ where items0_.post_id=? and items0_.tag_id=?
Hibernate: selectsubitem0_.itemId as itemi3_3_0_,subitem0_.subitemId as subitemi1_3_0_,subitem0_.subitemId as subitemi1_3_1_,subitem0_.itemId as itemi3_3_1_,subitem0_.subitemName as subitemn2_3_1_ fromsubitemosubitem0_ wheresubitem0_.itemId=?

For example, if I have 5 posts which are tagged under 2 tag each posttag has Items & SubItems then the above 4 queries triggered continuously depends on record size.

But is there any way I can get this findall() response optimized to get in a single shot from the database? I know the first select query does works on the database side but in the java layer, I had to map to DTO so is this something unavoidable?

M. Deinum
  • 115,695
  • 22
  • 220
  • 224
Satscreate
  • 495
  • 12
  • 38
  • 2
    This looks like the [N+1 select problem](https://stackoverflow.com/questions/32453989/what-is-the-solution-for-the-n1-issue-in-jpa-and-hibernate). – Thomas Jun 30 '21 at 06:53
  • This is a classic N+1 select problem. Instead of using entities, write a custom JPQL which directly returns your DTO. This will save you on data transferred, memory used and performance as you don't need to map anymore. – M. Deinum Jun 30 '21 at 06:59
  • Another thing is badly written equals and hashcode methods might also incur this (as those are used in sets!). You should write a hashcode/equals without using anything else than the id. (See https://vladmihalcea.com/how-to-implement-equals-and-hashcode-using-the-jpa-entity-identifier/). The same for a `toString` (don't use collections in there, as that might lead to this when debugging!). – M. Deinum Jun 30 '21 at 07:10

2 Answers2

0

First, findAll() is like using a bazooka to shoot a mosquito. Try to write your own query. Second, for every ManyToOne, unless you specify the relationshipo as lazy, Hibernate will load all the relationships record by record.

In your case, I would limit the search of the Post to +-500 records. In which business case do you need all the Posts at once? Try to figure it out on how to limit the response. You can use multiple optimized query. Like getting a collection of Ids of one entity and then a query with "WHERE id IN(id1, id2, ...)" by example. Find what is the best for your use case.

Also, please try to set the ManyToOne relationships as Lazy. This should be the rule, not the exception.

Mannekenpix
  • 744
  • 7
  • 15
  • Setting `ManyToOne` to lazy will only make matters worse. The issue is the `OneToMany` which are probably being used in the mapping to a DTO, leading to queries for each collection being used (as you can see in the query log). The initial query already loads all directly referenced objects (due to them **not** being lazy,) making them lazy would even lead to more select statements as data needs to be loaded lazily when needed. – M. Deinum Jun 30 '21 at 07:06
  • OneToMany are Lazy by default. So they will be loaded only if there is a specific access in the code. What I see in the log is that his query is not optimised. That's why I told him to avoid the findAll() method and create a more specific query. It is the findAll() which trigger the select statements since there is a JOIN on all tables. – Mannekenpix Jun 30 '21 at 07:44
  • No it isn't the find all and I'm not debating/questioning the fact that one should write a custom query, it is the fact that you state the `ManytoOne` must be lazy, that is something I have an issue with as that will only make things worse. – M. Deinum Jun 30 '21 at 08:06
  • I like your line `findAll() is like using a bazooka to shoot a mosquito` ! – Panagiotis Bougioukos Jun 30 '21 at 09:02
0

@OneToMany fetch type default is EAGER, which causes lots of queries in your example. To reduce hibernate query, you can change FetchType of @OneToMany to LAZY. But it will cause another problem. This link will show you 2 ways to work with Hibernate and its performance issues.

Reference link: LazyInitializationException

Random Guy
  • 46
  • 4