12

I would like to stream results from PostgreSQL 11.2 and not read all results to memory at once. I use the newest stable SpringBoot 2.1.4.RELEASE.

I read the article how to do it in MySQL. http://knes1.github.io/blog/2015/2015-10-19-streaming-mysql-results-using-java8-streams-and-spring-data.html I also read article how to do it in PostgreSQL: Java 8 JPA Repository Stream row-by-row in Postgresql

I have repository like that:

public interface ProductRepository extends JpaRepository<Product, UUID> {
    @Query("SELECT p from Product p")
    @QueryHints(value = @QueryHint(name = HINT_FETCH_SIZE, value = "50"))
    Stream<Product> streamAll();
}

Than I use the stream that way:

  productRepository.streamAll().forEach(product -> export(product));

To make the example easier, 'export' method is completely empty.

When I call the method I see Hibernate query

Hibernate: select product0_.id as id1_0_, product0_.created as created2_0_, product0_.description as descript3_0_, product0_.name as name4_0_, product0_.product_type_id as product_5_0_ from products product0_ order by product0_.id

and after some time I have OutOfMemoryError. The query hint didn't help.

How to read data using Spring Boot repository (or even EntityManager) and load rows from DB in optimal way. I know that I can make pagination, but as in articles was written, it is not the most optimal way.

Hollow.Quincy
  • 547
  • 1
  • 7
  • 14

3 Answers3

2

At the moment using spring all the data are retrieved and the Stream is applied only to data already in memory.

If you look at the source of org.springframework.data.jpa.provider.PersistenceProvider it seems that it uses a ScrollableResults to stream over the data.

Generally a ScrollableResults retrieve all data in memory.

You can find an interesting complete analysis using a MySql database here, but probably the same works for a Postgres database.

So also if you think to use a solution that doesn't need to use a lot memory in reality it does because the underlying implementation is not using an optimal implementation.

Davide Lorenzo MARINO
  • 26,420
  • 4
  • 39
  • 56
  • Probably you didn't read my question carefully, because I added a link to the same article as you. The point of my question is that in this article it is written that I will not read all data to memory, but when I do it for PostgreSQL it reads all. – Hollow.Quincy May 02 '19 at 13:07
2

You must detach the entity after your work finishes.

import javax.persistence.EntityManager;
...
@Autowired
private EntityManager entityManager;
... 
// Your business logic
productRepository.streamAll().forEach(product -> {
   export(product);
   // must detach so that garbage collector can reclaim the memory.
   entityManager.detach(product);
});
gunescelil
  • 313
  • 1
  • 23
0

I faced exactly the same problem, and after long debugging of internals of spring data and hibernate have found solution which worked for me.

So to fetch data using the cursor in PostgreSQL it should be method with Stream result + annotation (kotlin syntax):

@QueryHints(QueryHint(name = org.hibernate.annotations.QueryHints.FETCH_SIZE, value = "50"))

which value it should be 50 or some else - it's not so important. Probably you put the wrong name of the hint.

GreenTea
  • 344
  • 3
  • 9
  • still does not work for me. By the way, Spring's HINT_FETCH_SIZE is an alias to Hibernate's FETCH_SIZE. – spi Jun 22 '22 at 14:19