If internal API calls are longer than the allowable client timeout, then you will need to find a WebSocket or WebSocket-like alternative. The alternative to WebSockets is to use long-polling (good example with source code here), which is essentially having the client repeatedly make requests until the original task is completed and the data can be sent. Either way, you'll need to use a pub/sub mechanism of some sort to handle multiple users, which is where things can get complicated.
Pub/sub can be complicated and I don't have an example on-hand, but essentially you must (1) have the client subscribe to a channel using a unique identifier (you can do this with CometD via a service channel), (2) evaluate the response, (3) publish the response to the client's subscribed channel, and finally (4) close the channel when it's no longer in use.
I've had some luck with CometD as a library to simplify pub/sub channel management, and it provides a good abstraction for asynchronous communication, although I haven't tried it with Spring and it might be heavy for what you want to do.
Another option that I'm less familiar with is to use Spring with STOMP. This seems to come recommended by others. Spring does provide ways to send messages to single users using STOMP: Sending message to specific user on Spring Websocket
I'd recommend following the above STOMP example.
Additional STOMP resources:
As a side note, throttling may be necessary here, too, as with any time that you can spawn long-running threads from clients.
If you choose not to use STOMP or CometD, then a lighter solution could be to roll your own pub/sub for this single case using Spring's DeferredResult
(as in Roger Hughes example), tying the subscription request to the long-poll request via a UUID token, which might also be the session ID if you choose to disallow concurrent requests per user. On subscription, the system can associate the request with a UUID and return this UUID to the client. The client can then make long-poll requests (again, as in Roger Hughes example) but with the UUID attached. The server can wait until the request for the given UUID has completed and then return the result to the client via the client's active long-poll request.
Channel management (i.e., request/UUID-tracking) can be done by clearing the channel on DeferredResult
result retrieval and removing orphan channels with a separate thread acting as a GC -- or, perhaps better yet, automatically clearing orphan channels by removing them on DeferredResult
completion/timeout if no active listeners exist. If you opt for the latter option, you will want to make sure that the client won't have any delay between its long-poll requests so that the DeferredResult
doesn't unintentionally complete with no listeners.