I'm trying to process a large dataset (perform some operation on each element and then save the result to the DB).

I'd like to do that in parallel so that it is a little bit faster. Also, in case any error occurs, I'd like to keep the already processed data (hence the propagation = REQUIRES_NEW).

My problem: the data isn't processed in parallel but sequentially (I can see this by logging in each iteration). However, if I remove the @Transactional from my service method, it is executed in parallel.

Is there any way to execute in parallel but also keep the @Transactional annotation?

@Configuration
public class MyConfig {

   @Bean
   public SmartInitializingSingleton doStuffOnStartUp() {
      List<Long> listOfIds = ... 
      // call the transactional service method once per id, on a parallel stream
      listOfIds.stream().parallel().forEach(id -> service.processAndSaveToDB(id));
      return () -> logger.info("This was from the smart initializing bean");
   }
}

@Service
public class MyService {

   // each call runs in its own transaction, so earlier results survive a later failure
   @Transactional(propagation = REQUIRES_NEW)
   public void processAndSaveToDB(Long id) {
       Object result = ... // do some time-consuming operation
       objectMapper.save(result); // a MyBatis mapper
   }
}

So let's say the list contains 1000 elements and 1 iteration takes 10 seconds - in total it executes in 10000 seconds. I'd like to run it on multiple threads to shorten total execution time.
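
To make the arithmetic concrete, here is a minimal sketch in plain Java (no Spring, no database). The class and method names are hypothetical, the work is simulated with Thread.sleep, and the estimate assumes the items are fully independent and nothing else (connection pool size, database locks) limits throughput.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelTimingSketch {

    // Ideal wall-clock estimate: items are split evenly across the threads.
    static long estimateSeconds(long items, long secondsPerItem, int threads) {
        long batches = (items + threads - 1) / threads; // ceiling division
        return batches * secondsPerItem;
    }

    // Runs `items` simulated tasks of `millisPerItem` each on a fixed-size pool
    // and returns the elapsed wall-clock time in milliseconds.
    static long runSimulated(int items, long millisPerItem, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < items; i++) {
                futures.add(pool.submit(() -> {
                    try {
                        Thread.sleep(millisPerItem); // stand-in for the time-consuming operation
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for every task to finish
            }
        } finally {
            pool.shutdown();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(estimateSeconds(1000, 10, 1)); // 10000, the sequential case above
        System.out.println(estimateSeconds(1000, 10, 8)); // 1250 with eight threads
        System.out.println(runSimulated(8, 100, 8) + " ms for 8 simulated items on 8 threads");
    }
}
```

With one thread the estimate reproduces the 10000 seconds from above; the simulated run only shows that independent tasks overlap when nothing forces them onto a single thread.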

Domin0

1 Answer

No, you cannot achieve both parallel execution and "keep the already processed data". The accepted answer to the question "How do ACID and database transactions work?" gives an explanation:

The I in ACID stands for isolation, which means:

If two transactions are executing concurrently, each one will see the world as if they were executing sequentially,

Therefore you can put all your operations into one big transaction and process them in parallel, but if an exception propagates out of the transaction boundary, all results are rolled back to the state before the operation.

You could handle possible errors yourself to keep the rest of the results, but then you have to ensure on your own that only valid operations are persisted in the database. If you want to rely on the database's rollback mechanism, you have to do it one operation after another.
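
As a rough illustration of handling errors yourself, here is a minimal sketch in plain Java. It is not Spring or MyBatis code: the map stands in for the database table, processAndSave is a hypothetical unit of work that fails for some ids, and all names are assumptions made for this example. Each item is caught individually, so one failure does not discard the results that were already stored.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class PerItemErrorHandlingSketch {

    // Hypothetical unit of work: fails for every id divisible by 4,
    // otherwise writes a result into the store (the stand-in for the DB).
    static void processAndSave(long id, Map<Long, String> store) {
        if (id % 4 == 0) {
            throw new IllegalStateException("simulated failure for id " + id);
        }
        store.put(id, "result-" + id);
    }

    // Processes all ids on a fixed-size pool and returns the sorted ids that failed.
    // The store must be thread-safe, e.g. a ConcurrentHashMap.
    static List<Long> processAll(List<Long> ids, Map<Long, String> store, int threads)
            throws InterruptedException {
        Queue<Long> failed = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (Long id : ids) {
            pool.execute(() -> {
                try {
                    processAndSave(id, store);
                } catch (RuntimeException e) {
                    failed.add(id); // remember the failure, keep everything else
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("timed out waiting for the tasks");
        }
        List<Long> sorted = new ArrayList<>(failed);
        Collections.sort(sorted);
        return sorted;
    }
}
```

The returned list of failed ids is what the caller would have to retry or report; deciding which partial results count as valid remains the caller's responsibility.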

Addition: This is not specific to Spring or the @Transactional annotation, but caused by the ACID principle of transactions.

cyberbrain
  • I know each transaction is separate from the others, but I would like to see each iteration of my loop be executed in a separate thread. Sure, concurrent threads will not see each other, but that's not an issue. All I want to achieve is a quicker execution time. In my case the transactions are not being executed concurrently, but sequentially. – Domin0 Aug 06 '23 at 17:36
  • In case I was not clear enough: even if you execute each loop iteration in parallel, the database access will serialize them for the ACID transactional safety. – cyberbrain Aug 07 '23 at 06:08