at the moment I develop a Spring Boot application which mainly pulls product review data from a message queue (~5 concurrent consumer) and stores them to a MySQL DB. Each review can be uniquely identified by its reviewIdentifier (String), which is the primary key and can belong to one or more product (e.g. products with different colors). Here is an excerpt of the data-model:
public class ProductPlacement implements Serializable{
private static final long serialVersionUID = 1L;
@Id
@GeneratedValue(strategy = GenerationType.AUTO)
@Column(name = "product_placement_id")
private long id;
@ManyToMany(fetch = FetchType.LAZY, cascade = CascadeType.ALL, mappedBy="productPlacements")
private Set<CustomerReview> customerReviews;
}
public class CustomerReview implements Serializable{
private static final long serialVersionUID = 1L;
@Id
@Column(name = "customer_review_id")
private String reviewIdentifier;
@ManyToMany(fetch = FetchType.LAZY, cascade = CascadeType.ALL)
@JoinTable(
name = "tb_miner_review_to_product",
joinColumns = @JoinColumn(name = "customer_review_id"),
inverseJoinColumns = @JoinColumn(name = "product_placement_id")
)
private Set<ProductPlacement> productPlacements;
}
One message from the queue contains 1 - 15 reviews and a productPlacementId. Now I want an efficient method to persist the reviews for the product. There are basically two cases which need to be considered for each incomming review:
- The review is not in the database -> insert review with reference to the product that is contained in the message
- The review is already in the database -> just add the product reference to the Set productPlacements of the existing review.
Currently my method for persisting the reviews is not optimal. It looks as follows (uses Spring Data JpaRespoitories):
@Override
@Transactional
public void saveAllReviews(List<CustomerReview> customerReviews, long productPlacementId) {
ProductPlacement placement = productPlacementRepository.findOne(productPlacementId);
for(CustomerReview review: customerReviews){
CustomerReview cr = customerReviewRepository.findOne(review.getReviewIdentifier());
if (cr!=null){
cr.getProductPlacements().add(placement);
customerReviewRepository.saveAndFlush(cr);
}
else{
Set<ProductPlacement> productPlacements = new HashSet<>();
productPlacements.add(placement);
review.setProductPlacements(productPlacements);
cr = review;
customerReviewRepository.saveAndFlush(cr);
}
}
}
Questions:
- I sometimes get constraintViolationExceptions because of violating the unique constraint on the "reviewIndentifier". This is obviously because I (concurrently) look if the review is already present and than insert or update it. How can I avoid that?
- Is it better to use save() or saveAndFlush() in my case. I get ~50-80 reviews per secound. Will hibernate flush automatically if I just use save() or will it result in greatly increased memory usage?
Update to question 1: Would a simple @Lock on my Review-Repository prefent the unique-constraint exception?
@Lock(LockModeType.PESSIMISTIC_WRITE)
CustomerReview findByReviewIdentifier(String reviewIdentifier);
What happens when the findByReviewIdentifier returns null? Can hibernate lock the reviewIdentifier for a potential insert even if the method returns null?
Thank you!