15

My application loads a list of entities that should be processed. This happens in a class that uses a scheduler

@Component
class TaskScheduler {

    @Autowired
    private TaskRepository taskRepository;

    @Autowired
    private HandlingService handlingService;

    @Scheduled(fixedRate = 15000)
    @Transactional
    public void triggerTransactionStatusChangeHandling() {
        taskRepository.findByStatus(Status.OPEN).stream()
                               .forEach(handlingService::handle);
    }
}

In my HandlingService processes each task in issolation using REQUIRES_NEW for propagation level.

@Component
class HandlingService {

    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public void handle(Task task) {
        try {
            processTask(task); // here the actual processing would take place
            task.setStatus(Status.PROCCESED);
        } catch (RuntimeException e) {
            task.setStatus(Status.ERROR);
        }
    }
}

The code works only because i started the parent transaction on TaskScheduler class. If i remove the @Transactional annotation the entities are not managed anymore and the update to the task entity is not propagated to the db.I don't find it natural to make the scheduled method transactional.

From what i see i have two options:

1. Keep code as it is today.

  • Maybe it`s just me and this is a correct aproach.
  • This varianthas the least trips to the database.

2. Remove the @Transactional annotation from the Scheduler, pass the id of the task and reload the task entity in the HandlingService.

@Component
class HandlingService {

    @Autowired
    private TaskRepository taskRepository;

    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public void handle(Long taskId) {
        Task task = taskRepository.findOne(taskId);
        try {
            processTask(task); // here the actual processing would take place
            task.setStatus(Status.PROCCESED);
        } catch (RuntimeException e) {
            task.setStatus(Status.ERROR);
        }
    }
}
  • Has more trips to the database (one extra query/element)
  • Can be executed using @Async

Can you please offer your opinion on which is the correct way of tackling this kind of problems, maybe with another method that i didn't know about?

mvlupan
  • 3,536
  • 3
  • 22
  • 35

3 Answers3

14

If your intention is to process each task in a separate transaction, then your first approach actually does not work because everything is committed at the end of the scheduler transaction.

The reason for that is that in the nested transactions Task instances are basically detached entities (Sessions started in the nested transactions are not aware of those instances). At the end of the scheduler transaction Hibernate performs dirty check on the managed instances and synchronizes changes with the database.

This approach is also very risky, because there may be troubles if you try to access an uninitialized proxy on a Task instance in the nested transaction. And there may be troubles if you change the Task object graph in the nested transaction by adding to it some other entity instance loaded in the nested transaction (because that instance will now be detached when the control returns to the scheduler transaction).

On the other hand, your second approach is correct and straightforward and helps avoid all of the above pitfalls. Only, I would read the ids and commit the transaction (there is no need to keep it suspended while the tasks are being processed). The easiest way to achieve it is to remove the Transactional annotation from the scheduler and make the repository method transactional (if it isn't transactional already).

If (and only if) the performance of the second approach is an issue, as you already mentioned you could go with asynchronous processing or even parallelize the processing to some degree. Also, you may want to take a look at extended sessions (conversations), maybe you could find it suitable for your use case.

Dragan Bozanovic
  • 23,102
  • 5
  • 43
  • 110
  • In this example, will the nested transaction's session entity cache be synced with the outer transaction's session? For example, if the "Task" entity is changed inside the nested transaction, will that change apply on the out transaction's session as well? – froi Apr 19 '16 at 18:49
  • Follow up to my question, as the outer transaction's session gets flushed, does that mean the Task changes will be counted as "stale" changes? – froi Apr 19 '16 at 19:01
4

The current code processes the task in the nested transaction, but updates the status of the task in the outer transaction (because the Task object is managed by the outer transaction). Because these are different transactions, it is possible that one succeeds while the other fails, leaving the database in an inconsistent state. In particular, with this code, completed tasks remain in status open if processing another task throws an exception, or the server is restarted before all tasks have been processed.

As your example shows, passing managed entities to another transaction makes it ambiguous which transaction should update these entities, and is therefore best avoided. Instead, you should be passing ids (or detached entities), and avoid unnecessary nesting of transactions.

meriton
  • 68,356
  • 14
  • 108
  • 175
1

Assuming that processTask(task); is a method in the HandlingService class (same as handle(task) method), then removing @Transactional in HandlingService won't work because of the natural behavior of Spring's dynamic proxy.

Quoting from spring.io forum:

When Spring loads your TestService it wrap it with a proxy. If you call a method of the TestService outside of the TestService, the proxy will be invoke instead and your transaction will be managed correctly. However if you call your transactional method in a method in the same object, you will not invoke the proxy but directly the target of the proxy and you won't execute the code wrapped around your service to manage transaction.

This is one of SO thread about this topic, and Here are some articles about this:

  1. http://tutorials.jenkov.com/java-reflection/dynamic-proxies.html
  2. http://tutorials.jenkov.com/java-persistence/advanced-connection-and-transaction-demarcation-and-propagation.html
  3. http://blog.jhades.org/how-does-spring-transactional-really-work/

If you really don't like adding @Transaction annotation in your @Scheduled method, you could get transaction from EntityManager and manage transaction programmatically, for example:

UserTransaction tx = entityManager.getTransaction();
try {
  processTask(task);
  task.setStatus(Status.PROCCESED);
  tx.commit(); 
} catch (Exception e) {
  tx.rollback();
}

But I doubt that you will take this way (well, I wont). In the end,

Can you please offer your opinion on which is the correct way of tackling this kind of problems

There's no correct way in this case. My personal opinion is, annotation (for example, @Transactional) is just a marker, and you need a annotation processor (spring, in this case) to make @Transactional work. Annotation will have no impact at all without its processor.

I Will more worry about, for example, why I have processTask(task) and task.setStatus(Status.PROCESSED); live outside of processTask(task) if it looks like does the same thing, etc.

HTH.

Community
  • 1
  • 1
xsalefter
  • 640
  • 5
  • 16
  • Even if the `processTask` method is a private method within the `HandlingService` all updates should be propagated to the db, because a new transaction was opened when the `handle` method was called (though a Spring proxy). In the second variant I proposed removing the `@Transctional` from the scheduler and pass id`s to the handling service (which will open transactions for each processed task) – mvlupan Apr 10 '16 at 07:31