I'm using Spring Data, JPA, and Hibernate to run a function on every record whose ID is greater than a given value.
Here's my DAO:
public interface MyEntityDao extends JpaRepository<MyEntity, Long> {
    @QueryHints(value = @QueryHint(name = org.hibernate.jpa.QueryHints.HINT_FETCH_SIZE, value = "1000"))
    Stream<MyEntity> findByIdGreaterThanOrderByIdAsc(Long id);
}
The method gets used like this, and it works:
@Transactional(readOnly = true)
public void printRecordsGreaterThan(Long lastId) {
    myEntityDao.findByIdGreaterThanOrderByIdAsc(lastId).forEach((entity) -> {
        System.out.println("entity: " + entity.getId());
    });
}
The issue arises when this operation has to scan a very large range. Monitoring it with VisualVM shows that it keeps every record in memory (tens of gigabytes of RAM).
Is there any way to have this code release the resources once they're processed rather than keep them in memory?
Thanks in advance!
Solution
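Thanks to @julodnik in the comments: invoking clear() on the entity manager every so often solved the issue. Within a transaction, the persistence context keeps a reference to every entity it loads, so none of them can be garbage collected until the context is cleared or the transaction ends.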
@PersistenceContext
private EntityManager em;

@Transactional(readOnly = true)
public void printRecordsGreaterThan(Long lastId) {
    AtomicLong counter = new AtomicLong();
    myEntityDao.findByIdGreaterThanOrderByIdAsc(lastId).forEach((entity) -> {
        long count = counter.getAndIncrement();
        // Every 1000 results, detach everything the persistence context is holding.
        if (count % 1000 == 0) {
            // logger and type are members of the enclosing class (not shown here).
            logger.info(String.format("Clearing %s session for result %d", type.toString(), counter.get()));
            em.clear();
        }
        System.out.println("entity: " + entity.getId());
    });
}
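For anyone wondering why the periodic clear() makes such a difference, here is a rough toy model in plain Java. It is not Hibernate code: ToyContext, Row, and peakManaged are hypothetical names made up for illustration, and the map only imitates the way a persistence context holds on to each entity it has loaded. The sketch compares the peak number of tracked objects with and without a periodic clear.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: imitates a persistence context that tracks loaded entities.
final class PersistenceContextSketch {

    // Hypothetical stand-in for an entity.
    static final class Row {
        final long id;
        Row(long id) { this.id = id; }
    }

    // Hypothetical stand-in for the first-level cache (an identity map keyed by ID).
    static final class ToyContext {
        private final Map<Long, Row> managed = new HashMap<>();

        Row load(long id) {
            // Every loaded row stays referenced until clear() is called.
            return managed.computeIfAbsent(id, k -> new Row(k));
        }

        void clear() { managed.clear(); }

        int size() { return managed.size(); }
    }

    // Loads IDs 1..total and clears the context every clearEvery rows (0 means never).
    // Returns the largest number of rows the context tracked at any one time.
    static int peakManaged(long total, int clearEvery) {
        ToyContext ctx = new ToyContext();
        int peak = 0;
        for (long id = 1; id <= total; id++) {
            ctx.load(id);
            peak = Math.max(peak, ctx.size());
            if (clearEvery > 0 && id % clearEvery == 0) {
                ctx.clear();
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println("never cleared:    " + peakManaged(10_000, 0));     // 10000
        System.out.println("cleared per 1000: " + peakManaged(10_000, 1_000)); // 1000
    }
}
```

In this model the peak is bounded by the clear interval rather than by the size of the result set, which matches what I saw in VisualVM after adding em.clear(). One caveat that applies to the real thing: entities detached by clear() are no longer managed, so this pattern suits read-only processing like the method above.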