1. Cron job starts.
2. Create Entity1 and save it to the DB.
3. Fetch the transaction entities from the DB.
4. Using the fetched transactions as transactionIds:
    for (Transaction id : transactionIds) {
        a. create Entity2 and save it to the DB
        b. fetch the paymentEntity from the DB
        c. response = POST request (REST API call)
        d. update Entity2 with the response
    }
5. Update Entity1.
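A minimal, runnable sketch of the flow above (the entity fields, the in-memory results, and the REST helper are stand-ins; the real code would use JPA repositories and an HTTP client):

```java
import java.util.ArrayList;
import java.util.List;

public class CronFlowSketch {

    // Stand-ins for the real entities (hypothetical fields).
    static class Entity1 { String status = "STARTED"; }
    static class Entity2 {
        long transactionId;
        String response;
        Entity2(long id) { this.transactionId = id; }
    }

    // Stand-in for step 4c; the real code would POST via RestTemplate/WebClient.
    static String postToApi(long transactionId) {
        return "OK-" + transactionId;
    }

    // Steps 2-5 in order, for one batch of transaction IDs.
    static List<Entity2> runOnce(List<Long> transactionIds) {
        Entity1 entity1 = new Entity1();        // step 2: create Entity1 (save omitted)
        List<Entity2> results = new ArrayList<>();
        for (long id : transactionIds) {        // step 4: loop over fetched IDs
            Entity2 e2 = new Entity2(id);       // 4a: create Entity2 (save omitted)
            // 4b: fetch paymentEntity from the DB (omitted in this sketch)
            e2.response = postToApi(id);        // 4c: REST call
            results.add(e2);                    // 4d: update Entity2 with the response
        }
        entity1.status = "DONE";                // step 5: update Entity1
        return results;
    }

    public static void main(String[] args) {
        List<Entity2> out = runOnce(List.of(1L, 2L, 3L));
        System.out.println(out.size() + " " + out.get(0).response);
    }
}
```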

Problem statement - I am getting 5000+ transactions from the DB into transactionIds, which need to be processed as above. With this approach, while the previous loop is still running, the next 5000+ transactions enter the loop, because the cron job runs every 2 minutes. I have checked multiple solutions (.parallelStream() with ForkJoinPool, ListenableFuture), but I am unable to decide which is the best way to scale the code above. Can I use Spring Batch for this? If yes, how? Which of the steps above go into the reader, processor, and writer?
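(For reference, one possible Spring Batch mapping, sketched with hypothetical bean names, finder method, and repositories, assuming spring-boot-starter-batch: steps 3-4 become the reader, 4a-4c the processor, and the chunked save the writer. This is a wiring sketch, not a drop-in implementation.)

```java
@Configuration
public class FailedTxnJobConfig {

    // Reader: steps 3-4. Pages transactions out of the DB so 5000+ rows are
    // streamed in chunks instead of being loaded and looped over at once.
    @Bean
    public RepositoryItemReader<Transaction> reader(TransactionRepository repo) {
        return new RepositoryItemReaderBuilder<Transaction>()
                .name("txnReader")
                .repository(repo)
                .methodName("findAllByStatus")          // hypothetical finder
                .arguments(List.of("FAILED"))
                .pageSize(100)
                .sorts(Map.of("id", Sort.Direction.ASC))
                .build();
    }

    // Processor: steps 4a-4c for one item — create Entity2, fetch the
    // paymentEntity, make the REST call.
    @Bean
    public ItemProcessor<Transaction, Entity2> processor() {
        return txn -> {
            Entity2 e2 = new Entity2(txn.getId());
            e2.setResponse(callRestApi(txn));           // hypothetical REST helper
            return e2;                                  // step 4d: response carried forward
        };
    }

    // Writer: step 4d persisted in bulk — one saveAll per chunk rather than
    // one save per transaction.
    @Bean
    public ItemWriter<Entity2> writer(Entity2Repository repo) {
        return items -> repo.saveAll(items);
    }
}
```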

Rahul Jaiman
  • Instead of trying to process 5000 transactions in 2 minutes, change the design so that if you start getting 10,000 transactions, your system will still handle them. A better solution could be to keep all the transactionIds in some kind of queue and keep reading from the queue and processing; the cron job keeps adding transactionIds to this queue. – Smile Jan 14 '22 at 04:41
  • @Smile - What are the options for the queue, given that this service runs on multiple pods? These are actually failed wallet transactions and need to be processed as early as possible. – Rahul Jaiman Jan 14 '22 at 04:58
  • You can look at Kafka, or if you are deploying on the cloud, cloud providers have their own implementations, like AWS SQS, etc. – Smile Jan 14 '22 at 08:19
  • I need to solve this using Java only; I don't have other options. – Rahul Jaiman Jan 14 '22 at 17:48
  • Could you please post which approach you followed? – Naveen Kumar Dasari Dec 29 '22 at 11:54

1 Answer


One way to approach this problem would be to use Kafka for consuming the messages. You can increase the number of pods (hopefully you are using microservices), and each pod can be part of a consumer group. This effectively removes the loop from your code, and consumers can be scaled on demand to handle any load.

Another advantage of a message-based approach is that you can choose between multiple delivery modes (at-least-once, at-most-once, etc.), and there are a lot of open-source libraries available to view the stats of a topic (e.g. the lag between production and consumption of messages).
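Within a single JVM (if plain Java is the only option), the same producer-consumer shape can be sketched with a BlockingQueue: the cron thread only enqueues IDs, while a fixed worker pool drains them, so a slow batch never delays the next cron tick. This is a minimal sketch; the pool size, poison-pill shutdown, and the no-op processing step are illustrative.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class QueueSketch {

    static final BlockingQueue<Long> queue = new LinkedBlockingQueue<>();
    static final AtomicInteger processed = new AtomicInteger();

    // Stand-in for steps 4a-4d for a single transaction.
    static void processOne(long txnId) {
        processed.incrementAndGet();
    }

    public static void main(String[] args) throws Exception {
        int workers = 4;
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.submit(() -> {
                try {
                    while (true) {
                        Long id = queue.take();   // blocks until work arrives
                        if (id < 0) break;        // poison pill: stop this worker
                        processOne(id);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // The cron job's only duty: enqueue the IDs and return immediately.
        for (long id = 1; id <= 5000; id++) queue.put(id);

        // Shut down: one poison pill per worker, then wait for the pool.
        for (int i = 0; i < workers; i++) queue.put(-1L);
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println(processed.get());
    }
}
```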

If using Kafka is not possible:

  1. The REST call should not happen for every transaction; you'll need to post the transactions as a batch. API calls are always expensive, so fewer round trips will make a huge difference in the time taken to complete the loop.
  2. Instead of updating the DB before and after each API call, collect the entities inside the loop and call repository.saveAll(yourEntityCollection) once after it. Only one DB call after looping, and it can be batched.
  3. I suggest you move to a producer-consumer strategy in the near future.
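Points 1 and 2 together, as a sketch (the batch endpoint, the chunk size, and the repository here are assumptions about your system, stubbed in memory so the shape is visible):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchedLoopSketch {

    static class Entity2 {
        long txnId;
        String response;
        Entity2(long id) { txnId = id; }
    }

    // Stand-in for a batch POST endpoint (point 1): one round trip per chunk
    // instead of one per transaction. The real API must accept a list of IDs.
    static List<String> postBatch(List<Long> ids) {
        List<String> out = new ArrayList<>();
        for (long id : ids) out.add("OK-" + id);
        return out;
    }

    // Stand-in for repository.saveAll (point 2): one DB call after the loop.
    static int saveAll(List<Entity2> entities) {
        return entities.size();
    }

    static int process(List<Long> transactionIds, int chunkSize) {
        List<Entity2> toSave = new ArrayList<>();
        for (int i = 0; i < transactionIds.size(); i += chunkSize) {
            List<Long> chunk =
                transactionIds.subList(i, Math.min(i + chunkSize, transactionIds.size()));
            List<String> responses = postBatch(chunk);   // one API call per chunk
            for (int j = 0; j < chunk.size(); j++) {
                Entity2 e2 = new Entity2(chunk.get(j));
                e2.response = responses.get(j);
                toSave.add(e2);                          // no per-row DB write here
            }
        }
        return saveAll(toSave);                          // single batched write
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long i = 1; i <= 5000; i++) ids.add(i);
        // 5000 transactions, chunkSize 500: 10 API calls and 1 saveAll.
        System.out.println(process(ids, 500));
    }
}
```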
Mohamed Anees A