I am reading a book on Vert.x. At one point it compares two code snippets, one using Vert.x's Future and the other using Vert.x's RxJava API. The code fetches data from a database and processes the result set to map rows to objects. Comparing the two approaches, it states:

However, while Futures make the code a bit more declarative, we are retrieving all the rows in one batch and processing them. This result can be huge and take a lot of time to be retrieved. At the same time, you don’t need the whole result to start processing it. We can process each row one by one as soon as you have them. Fortunately, Vert.x provides an answer to this development model challenge and offers you a way to implement reactive microservices using a reactive programming development model. Vert.x provides RxJava APIs to:

• Combine and coordinate asynchronous tasks

• React to incoming messages as a stream of input

In addition to improving readability, reactive programming allows you to subscribe to a stream of results and process items as soon as they are available. With Vert.x you can choose the development model you prefer. In this report, we will use both callbacks and RxJava.

I am not sure how the RxJava API approach is better here. The database would send all rows at once, not one by one (assuming we are fetching data from a traditional relational database). So in the RxJava-based approach, too, we would have to wait for the entire result set to arrive over the network, and only then could we process it.

Mandroid

1 Answer

I found the book you're referencing, and the RxJava approach uses a different API for fetching rows from the database (SQLConnection.rxQueryStream, instead of SQLConnection.query), which does not fetch all rows at once. It's documented here.

The implementation does rely on the java.sql.ResultSet object returned by the underlying JDBC driver to lazily fetch rows from the DB, however. Each JDBC driver behaves very differently in this regard, so whether or not you actually have a lazily-loading ResultSet depends on which DB you're using, what your fetchSize is set to, etc. This question has some more information on lazily-loaded ResultSets.
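To make the effect of a fetch size concrete, here is a minimal sketch in plain Java. It is not a JDBC driver and not Vert.x code: `ChunkedCursor`, `serverRows`, `roundTrips`, and `peakBuffered` are hypothetical names, and the in-memory list merely stands in for rows held on the database server. It assumes the simplest possible model, in which each batch costs one round trip.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Simulates a cursor that pulls rows from a server in batches of fetchSize. */
public class ChunkedCursor implements Iterator<String> {
    private final List<String> serverRows;   // stands in for the rows on the DB server
    private final int fetchSize;             // rows requested per simulated round trip
    private final Deque<String> buffer = new ArrayDeque<>();
    private int nextOffset = 0;
    private int roundTrips = 0;
    private int peakBuffered = 0;

    public ChunkedCursor(List<String> serverRows, int fetchSize) {
        if (fetchSize <= 0) {
            throw new IllegalArgumentException("fetchSize must be positive");
        }
        this.serverRows = serverRows;
        this.fetchSize = fetchSize;
    }

    /** Copies the next batch into the local buffer and counts one round trip. */
    private void fetchNextBatch() {
        int end = Math.min(nextOffset + fetchSize, serverRows.size());
        buffer.addAll(serverRows.subList(nextOffset, end));
        nextOffset = end;
        roundTrips++;
        peakBuffered = Math.max(peakBuffered, buffer.size());
    }

    @Override
    public boolean hasNext() {
        // Only go back to the "server" when the local buffer has been drained.
        if (buffer.isEmpty() && nextOffset < serverRows.size()) {
            fetchNextBatch();
        }
        return !buffer.isEmpty();
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return buffer.poll();
    }

    public int roundTrips() { return roundTrips; }

    public int peakBuffered() { return peakBuffered; }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            rows.add("row-" + i);
        }
        ChunkedCursor cursor = new ChunkedCursor(rows, 4);
        while (cursor.hasNext()) {
            // Each row can be processed as soon as its batch has arrived.
            System.out.println(cursor.next() + " (round trips so far: " + cursor.roundTrips() + ")");
        }
        System.out.println("round trips=" + cursor.roundTrips()
                + ", peak buffered=" + cursor.peakBuffered());
    }
}
```

In this model the first row is available after a single round trip, and the client never holds more than `fetchSize` rows. A fetch size at least as large as the result set collapses to one round trip with everything buffered. Real drivers differ in when and whether they honor the fetch size, so treat this as an illustration only.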

dano
  • Thanks for the input. I read the content at the first link, and it raises a question: if we fetch results in chunks, won't that degrade performance in terms of latency? Lower latency is one of the prime objectives of reactive systems. We could have the entire result set at once, but we choose to fetch it in chunks. When I think of data streams, I picture a producer that sends data as and when it is produced, so fetching data in parts is a requirement in such cases. But here we are doing it even when it is not needed, and degrading the system's performance in the process. – Mandroid Aug 04 '21 at 16:02
  • Fetching in chunks is better when you have a large dataset. A large dataset can put your application under memory pressure, leading to increased GC time (or an OutOfMemoryError), so chunks are required in that case. Yes, there is increased latency because you're doing multiple DB roundtrips, but that is the trade-off you have to make to avoid the GC throughput/OOM issues. – dano Aug 04 '21 at 16:07