
I'm using Spring MVC 3 + Tiles for a webapp. I have a slow operation, and I'd like a please wait page.

There are two main approaches to please wait pages, that I know of:

  1. Long-lived requests: render and flush the "please wait" part of the page, but don't complete the request until the action has finished. At that point you stream out the rest of the response, with some JavaScript to redirect away or update the page.
  2. Return immediately, and start processing on a background thread. The client polls the server (in JavaScript, or via page refreshes), and redirects away when the background thread finishes.

(1) is nice as it keeps the action all single-threaded, but doesn't seem possible with Tiles, as each JSP must complete rendering in full before the page is assembled and returned to the client.

So I've started implementing (2). In my implementation, the first request starts the operation on a background thread, using Spring's @Async annotation, which returns a Future<Result>. It then returns a "please wait" page to the user, which refreshes every few seconds.
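
To make the polling idea concrete, here is a minimal sketch. It uses a plain ExecutorService as a stand-in for Spring's @Async so that it runs without Spring, and the class and method names (SlowOperationRunner, start, pollResult) are hypothetical, not taken from my real code:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper: starts a slow operation and lets a later request poll it. */
final class SlowOperationRunner {
    // Stand-in for the executor that an @Async method would run on.
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    /** First request: submit the work and return immediately with a handle. */
    Future<String> start(Callable<String> slowOperation) {
        return executor.submit(slowOperation);
    }

    /** Poll request: returns null while the work is still running, otherwise the result. */
    static String pollResult(Future<String> future) {
        if (!future.isDone()) {
            return null; // keep showing the "please wait" page
        }
        try {
            return future.get(); // returns at once because isDone() is true
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while reading result", e);
        } catch (ExecutionException e) {
            // The slow operation itself threw; surface its cause to the caller.
            throw new IllegalStateException("slow operation failed", e.getCause());
        }
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

The open question is where the Future returned by start lives between the first request and the poll requests.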

When the please wait page is refreshed, the controller needs to check on the progress of the background thread. What is the best way of doing this?

  1. If I put the Future object in the Session directly, then the poll request threads can pull it out and check on the thread's progress. However, doesn't this mean my Sessions are not serializable, so my app can't be deployed with more than one web server (without requiring sticky sessions)?
  2. I could put some kind of status flag in the Session, and have the background thread update the Session when it is finished. I'm very concerned that passing an HttpSession object to a non-request thread will result in hard to debug errors. Is this allowed? Can anyone cite any documentation either way? It works fine when the sessions are in-memory, of course, but what if the sessions are stored in a database? What if I have more than one web server?
  3. I could put some kind of status flag in my database, keyed on the session id, or some other aspect of the slow operation. It seems weird to have session data in my domain database, and not in the session, but at least I know the database is thread-safe.
  4. Is there another option I have missed?
Rich

1 Answer


The Spring MVC part of your question is rather easy, since the problem has nothing to do with Spring MVC. See a possible solution in this answer: https://stackoverflow.com/a/4427922/734687

As you can see in the code, the author uses a tokenService to store the future. Its implementation is not included, and this is where the problems begin, as you are already aware, once you want failover.

  • It is not possible to serialize the future and let it jump to a second server instance. The thread runs within one particular instance and therefore has to stay there. So session storage is not an option.
  • As in the linked example, you could use a token service. This is normally just a HashMap in which you store your object and retrieve it later via the token (the String identifier). But again, this works only within a single web application instance, where the tokenService is a singleton.
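
A token service of this kind might look like the following sketch. The linked answer does not show its implementation, so the class name and methods here are assumptions, and a ConcurrentHashMap is used instead of a plain HashMap because request threads access it concurrently:

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Future;

/** Hypothetical in-memory token service; its contents are only visible inside one JVM. */
final class InMemoryTokenService<T> {
    private final Map<String, Future<T>> futures = new ConcurrentHashMap<>();

    /** Stores the future and returns a random token that the client sends back when polling. */
    String register(Future<T> future) {
        String token = UUID.randomUUID().toString();
        futures.put(token, future);
        return token;
    }

    /** Returns the future for the token, or null if this instance has never seen the token. */
    Future<T> lookup(String token) {
        return futures.get(token);
    }

    /** Drops the entry once the result has been delivered, so the map does not grow without bound. */
    void release(String token) {
        futures.remove(token);
    }
}
```

Two instances of this class, one per server, share nothing: a token registered on the first returns null from lookup on the second, which is exactly the failover problem.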

The solution is to save not the future but the state of the work (in progress, finished, or failed, together with the result). Even when the querying session and the executing thread are on different machines, the state should be accessible and serializable. But how would you do that? You could store it in a database, on the file system (in the example above you could check whether the zip file is available), in a key/value store, in a cache, or in a common object store (Terracotta), ...
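
As a rough sketch of that idea: the job records its own state in a store that every server can read. All names below (JobState, JobStatus, JobStatusStore, TrackedJob) are hypothetical, and the in-memory store is only a stand-in so the example runs; a real deployment would back the interface with shared storage such as a database table.

```java
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;

/** The three states of the work. */
enum JobState { IN_PROGRESS, FINISHED, FAILED }

/** Serializable snapshot of a job, unlike a Future. */
final class JobStatus implements Serializable {
    private static final long serialVersionUID = 1L;
    final JobState state;
    final String result; // result or error message; null while in progress

    JobStatus(JobState state, String result) {
        this.state = state;
        this.result = result;
    }
}

/** Hypothetical abstraction over whichever shared storage is chosen. */
interface JobStatusStore {
    void save(String jobId, JobStatus status);
    JobStatus find(String jobId); // null if the job is unknown
}

/** In-memory stand-in so the sketch runs; NOT shared between servers. */
final class InMemoryJobStatusStore implements JobStatusStore {
    private final Map<String, JobStatus> rows = new ConcurrentHashMap<>();
    public void save(String jobId, JobStatus status) { rows.put(jobId, status); }
    public JobStatus find(String jobId) { return rows.get(jobId); }
}

/** Wraps the slow operation so that it writes its outcome to the store. */
final class TrackedJob implements Runnable {
    private final String jobId;
    private final JobStatusStore store;
    private final Callable<String> work;

    TrackedJob(String jobId, JobStatusStore store, Callable<String> work) {
        this.jobId = jobId;
        this.store = store;
        this.work = work;
        // Recorded up front so that an early poll never sees a missing job.
        store.save(jobId, new JobStatus(JobState.IN_PROGRESS, null));
    }

    @Override
    public void run() {
        try {
            store.save(jobId, new JobStatus(JobState.FINISHED, work.call()));
        } catch (Exception e) {
            store.save(jobId, new JobStatus(JobState.FAILED, e.getMessage()));
        }
    }
}
```

The polling controller then only calls find with the job id and never touches the Future or the worker thread, so it does not matter which server handles the poll.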

In fact, every batch framework (Spring Batch, for example) works this way: it stores the current state of its jobs in the database. You are concerned about mixing domain data with operational data, but most applications do exactly that. Large applications can use two database instances, one for operational data and one for domain data.

So I recommend that you save the state and the result of the work in a database.
Hope that helps.

ChrLipp
  • Am I right to summarise your answer as (2) from the first list and (3) from the second list? I have one main issue with having session-scoped data in the database: the database is not automatically session-scoped, so I would need to add a clean-up job to periodically check the database for dead session data and delete it. Do you think (2) from the second list is viable? Do you have evidence as to whether it is safe or not to access Session objects from non-request threads? That would avoid the scoping mismatch. – Rich Mar 13 '12 at 18:37
  • I would stick to 2-3 and not to 2-2. I would not update the session. I do not think this is possible at all. Say session 1 is active on server 1 and a background thread on server 2 would also like to modify the session. This is not possible. – ChrLipp Mar 13 '12 at 18:54
  • Normally the architecture is like this: perform actions online when the duration is under 3 minutes. If you have a long-running action, store the request in a database and let a batch scheduler process it. The scheduler could run on an additional server. There is no need to store anything in the session. – ChrLipp Mar 13 '12 at 18:56
  • In the example above the token is stored in the view, but you could also store it in the database per user if they are authenticated. In that case just list all batches the user executed with their result. – ChrLipp Mar 13 '12 at 19:01
  • 1) What's the best way to deal with scope mismatch between db and session? Do I just need a clean-up job? 2) Do you have evidence as to whether it is safe or not to access Session objects from non-request threads? – Rich Mar 14 '12 at 10:36
  • 1) Yes, I would use a clean-up job. Users could have access to their results for certain days and then you clean it. Unsuccessful runs can be restarted. 2) I wouldn't do that. I tried to explain why in my first comment. Thanks for accepting this answer! – ChrLipp Mar 14 '12 at 13:00
  • See http://my.safaribooksonline.com/book/programming/java/9781935182955/running-batch-jobs/ch04lev1sec4#X2ludGVybmFsX0ZsYXNoUmVhZGVyP3htbGlkPTk3ODE5MzUxODI5NTUvMTA0 – ChrLipp Mar 15 '12 at 07:34
  • I'm currently using 2-2 on a project that's about to go live. I know you say that "this is not possible", but it seems to work in practice, and I don't have much appetite for setting up a lot of new infrastructure to co-ordinate this via the database (extra tables, extra logic, a scheduled clean-up job, job-scheduling infrastructure etc. etc.), as we have nearly finished UAT. I'll try to remember to update this question with my findings in a few months. – Rich Mar 29 '12 at 09:09
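
For reference, the clean-up job discussed in the comments above can be quite small. The sketch below is an assumption about how it could look, not code from the thread: ExpiringJobResults is a hypothetical class, and an in-memory map stands in for the database table that would hold the finished jobs.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry that remembers when each job finished so stale entries can be purged. */
final class ExpiringJobResults {
    private final Map<String, Instant> finishedAt = new ConcurrentHashMap<>();
    private final Duration retention;

    ExpiringJobResults(Duration retention) {
        this.retention = retention;
    }

    void recordFinished(String jobId, Instant when) {
        finishedAt.put(jobId, when);
    }

    boolean contains(String jobId) {
        return finishedAt.containsKey(jobId);
    }

    /** Removes every entry that finished before (now - retention); returns how many were removed. */
    int purgeExpired(Instant now) {
        Instant cutoff = now.minus(retention);
        int removed = 0;
        Iterator<Map.Entry<String, Instant>> it = finishedAt.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue().isBefore(cutoff)) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }
}
```

Something would still have to call purgeExpired periodically, for example a ScheduledExecutorService or a Spring scheduled task; against a real table the same logic is a single delete with a date condition.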