2

I need to load the Post entities along with the PostVote entity that represents the vote cast by a specific user (The currently logged in user). These are the two entities:

Post

@Entity
public class Post implements Serializable {
    public enum Type {TEXT, IMG}

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    protected Integer id;

    @ManyToOne(fetch = FetchType.LAZY, optional = false)
    @JoinColumn(name = "section_id")
    protected Section section;

    @ManyToOne(fetch = FetchType.LAZY, optional = false)
    @JoinColumn(name = "author_id")
    protected User author;

    @Column(length = 255, nullable = false)
    protected String title;

    @Column(columnDefinition = "TEXT", nullable = false)
    protected String content;

    @Enumerated(EnumType.STRING)
    @Column(nullable = false)
    protected Type type;

    @CreationTimestamp
    @Column(nullable = false, updatable = false, insertable = false)
    protected Instant creationDate;
    
    /*accessor methods*/
}  

PostVote

@Entity
public class PostVote implements Serializable {

    @Embeddable
    public static class Id implements Serializable{

        @Column(name = "user_id", nullable = false)
        protected int userId;

        @Column(name = "post_id", nullable = false)
        protected int postId;

        /* hashcode, equals, getters, 2 args constructor */
    }

    @EmbeddedId
    protected Id id;

    @ManyToOne(optional = false)
    @MapsId("postId")
    protected Post post;

    @ManyToOne(optional = false)
    @MapsId("userId")
    protected User user;

    @Column(nullable = false)
    protected Short vote;

    /* accessor methods */
}

All the associations are unidirectional @*ToOne. The reason I don't use @OneToMany is because the collections are too large and need proper paging before being accessed: not adding the @*ToManyassociation to my entities means preventing anyone from naively doing something like for (PostVote pv : post.getPostVotes()).

For the problem i'm facing right now I've come with various solutions: none of them looks fully convincing to me.


1° solution

I could represent the @OneToMany association as a Map that can only be accessed by key. This way there is no issue caused by iterating over the collection.

@Entity
public class Post implements Serializable {
    [...]

    @OneToMany(mappedBy = "post")
    @MapKeyJoinColumn(name = "user_id", insertable = false, updatable = false, nullable = false)
    protected Map<User, PostVote> votesMap;

    public PostVote getVote(User user){
        return votesMap.get(user);
    }
    
    [...]
}  

This solution looks very cool and close enough to DDD principles (i guess?). However, calling post.getVote(user) on each post would still cause a N+1 selects problem. If there was a way to efficiently prefetch some specific PostVotes for subsequent accesses in the session then it would be great. (Maybe for example calling from Post p left join fetch PostVote pv on p = pv.post and pv.user = :user and then storing the result in the L1 cache. Or maybe something that involves EntityGraph)


2° solution

A simplistic solution could be the following:

public class PostVoteRepository extends AbstractRepository<PostVote, PostVote.Id> {
    public PostVoteRepository() {
        super(PostVote.class);
    }

    public Map<Post, PostVote> findByUser(User user, List<Post> posts){
        return em.createQuery("from PostVote pv where pv.user in :user and pv.post in :posts", PostVote.class)
                .setParameter("user",user)
                .setParameter("posts", posts)
                .getResultList().stream().collect(Collectors.toMap(
                        res -> res.getPost(),
                        res -> res
                ));
    }
}

The service layer takes the responsability of calling both PostRepository#fetchPosts(...) and then PostVoteRepository#findByUser(...), then mixes the results in a DTO to send to the presentation layer above.

This is the solution I'm currently using. However, I don't feel like having a ~50 parameters long in clause might be a good idea. Also, having a separate Repository class for PostVote may be a bit overkill and break the purpose of ORMs.


3° solution

I haven't tested it so it might have an incorrect syntax, but the idea is to wrap the Post and PostVote entity in a VotedPost DTO.

public class VotedPost{
    private Post post;
    private PostVote postVote;

    public VotedPost(Post post, PostVote postVote){
        this.post = post;
        this.postVote = postVote;
    }

    //getters
}  

I obtain the object with a query like this:

select new my.pkg.VotedPost(p, pv) from Post p 
left join fetch PostVote pv on p = pv.post and pv.user = :user  

This gives me more type safeness than the the solutions based on Object[] or Tuple query results. Looks like a better alternative than the solution 2 but adopting the solution 1 in a efficient way would be the best.

What is, generally, the best approach in problems like this? I'm using Hibernate as JPA implementation.

cidra
  • 374
  • 1
  • 4
  • 16

1 Answers1

1

I could imagine the standard bi-directional association using @OneToMany being a maintainable yet performant solution.

To mitigate n+1 selects, one could use e.g.:

  • @EntityGraph, to specify which associated data is to be loaded (e.g. one user with all of it's posts and all associated votes within one single select query)
  • Hibernates @BatchSize, e.g. to load votes for multiple posts at once when iterating over all posts of a user, instead having one query for each collection of votes of each post

When it comes to restricting users to perform accesses in less performant ways, I'd argue that it should be up the API to document possible performance impacts and offer performant alternatives for different use-cases.

(As a user of an API one might always find ways to implement things in the least performant fashion:)

fladdimir
  • 1,230
  • 1
  • 5
  • 12
  • 1
    ``@BatchSize`` is a really nice feature. Knowing that is actually uses ``IN`` under the hood (According to the link you provided) makes it basically solution #2 with the syntax of solution #1, which is awesome. – cidra Feb 02 '22 at 13:02
  • I might consider moving some logic from ``*Repository`` classes to the Entity classes (es. ``getPostsFromSection(Section)`` turns into ``Section#getPosts()``). Would ``BatchSize`` still be a viable option, considering that i need paging? – cidra Feb 02 '22 at 13:06
  • 1
    BatchSize on a collection can load multiple collections (of different parent entities) at once. And in any case, all children in one single collection will be loaded at once, so it BatchSize may not really be comparable to pagination in that sense. – fladdimir Feb 02 '22 at 18:07
  • 1
    (however Im not sure whether i fully understood the specific use case you described in your last comment, maybe its worth to be discussed by a broader audience in a dedicated question) – fladdimir Feb 02 '22 at 18:09