I am using RxJava to asynchronously handle servlet requests. During each request, a number of async calls to a remote service API are made using the flatMap operator.
Due to resource constraints, I need to limit the total number of concurrent requests against that API. This would be trivial for a single Rx stream that calls the API in a single flatMap: I could just use flatMap's concurrency parameter. But I have multiple independent streams in my application (basically one for each ServletRequest), and each stream has multiple flatMap calls to the API, so a per-flatMap limit does not give me a global one.
So I think I would have to funnel all my remote requests into a singleton stream that performs the actual calls and can then easily limit the concurrency. But getting the API responses back into the original stream does not seem trivial. It also seems complicated to maintain backpressure across such a construct.
The other option would be to use a traditional semaphore, but I'm not sure if its blocking behaviour is a good fit with Rx.
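To make the semaphore idea concrete, here is a minimal sketch of the kind of non-blocking limiter I have in mind, written in plain Java with CompletableFuture rather than Rx so that it is self-contained. The class `AsyncLimiter` and its methods are hypothetical names of my own, not part of RxJava. I assume the remote call can be expressed as a supplier of a future, and that the returned future could then be wrapped back into the calling stream. Callers that find no free permit are queued instead of blocked.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/** Hypothetical helper: caps in-flight async calls without blocking a thread. */
final class AsyncLimiter {
    private final int maxConcurrent;
    private final Queue<Runnable> waiting = new ArrayDeque<>();
    private int inFlight = 0;

    AsyncLimiter(int maxConcurrent) {
        this.maxConcurrent = maxConcurrent;
    }

    /** Starts the call now if a permit is free, otherwise queues it. */
    <T> CompletableFuture<T> submit(Supplier<CompletableFuture<T>> call) {
        CompletableFuture<T> result = new CompletableFuture<>();
        Runnable start = () -> {
            CompletableFuture<T> f;
            try {
                f = call.get();
            } catch (Throwable t) {
                // Treat a supplier that throws like a failed call.
                f = new CompletableFuture<>();
                f.completeExceptionally(t);
            }
            f.whenComplete((value, error) -> {
                if (error != null) {
                    result.completeExceptionally(error);
                } else {
                    result.complete(value);
                }
                release();
            });
        };
        synchronized (this) {
            if (inFlight >= maxConcurrent) {
                waiting.add(start); // no permit: park the call, do not block
                return result;
            }
            inFlight++;
        }
        start.run();
        return result;
    }

    /** Hands the permit to the next queued call, or frees it. */
    private void release() {
        Runnable next;
        synchronized (this) {
            next = waiting.poll();
            if (next == null) {
                inFlight--;
                return;
            }
        }
        next.run(); // permit passes directly to the queued call
    }

    synchronized int inFlight() {
        return inFlight;
    }

    synchronized int queued() {
        return waiting.size();
    }
}
```

A single shared instance would be used by every request stream. This sketch does nothing about backpressure, though: the waiting queue is unbounded, and calls that complete synchronously start the next queued call on the same stack, so it only illustrates the non-blocking permit handling.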
So is there an established pattern to implement something like this? Or am I missing a clever combination of operators that avoids these complications altogether?