
I'm working on a problem where a large number of operations have to be kicked off simultaneously based on an event. For example, a user types a destination and dates and wants the best offer from over 200 "travel partners".

To satisfy this, I'm planning an event-driven architecture: when the user provides the appropriate input, a message is published to a topic, and a worker subscribed to that topic in turn generates additional events, one for each travel partner to get offers from.

So essentially:

  • (1) publish a message to topic "TRAVEL_DESTINATION_REQUEST" when the user provides their input
  • (2) a worker is subscribed to this topic
  • (3) for each travel partner in the system, the worker at (2) publishes an event with data {date: ..., destination: ..., travel_partner_id: ..., etc.} to topic FIND_OFFER.
  • (4) workers subscribed to FIND_OFFER query the partner identified by travel_partner_id and persist the response somewhere.

So with 200 travel partners, the above would push 200 events to the FIND_OFFER topic for workers to handle per user query.
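To make step (3) concrete, here's roughly what I picture the fan-out worker doing with the google-cloud-pubsub Python client (the project id, topic name and partner lookup are placeholders):

```python
# Hypothetical fan-out worker for step (3): publish one FIND_OFFER event per
# travel partner. Project id, topic name and partner registry are placeholders.
import json
from concurrent import futures
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "FIND_OFFER")

def get_all_partner_ids():
    # Placeholder: in reality this comes from the partner registry.
    return [f"partner-{i}" for i in range(200)]

def fan_out(request):
    """Publish one FIND_OFFER event per travel partner for a user request."""
    publish_futures = []
    for partner_id in get_all_partner_ids():
        payload = json.dumps({
            "date": request["date"],
            "destination": request["destination"],
            "travel_partner_id": partner_id,
            "request_id": request["request_id"],  # to correlate responses later
        }).encode("utf-8")
        publish_futures.append(publisher.publish(topic_path, payload))
    # Wait until the broker has acknowledged all publishes before acking the
    # original TRAVEL_DESTINATION_REQUEST message.
    futures.wait(publish_futures, return_when=futures.ALL_COMPLETED)
```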

Is this how you would go about solving a problem like this? If not, how would you approach it? Doing it sequentially is obviously not possible, since we can't have the user sit there waiting, and travel partner API calls may differ in response times...

In the GKE world, is Pub/Sub a good candidate for such an approach? Does anyone know if pod load balancing would cause any issues with this model?
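For reference, this is roughly the step (4) worker I'd run in each pod (the subscription name, partner call and persistence are all placeholders):

```python
# Sketch of a step (4) worker pod: pulls FIND_OFFER events, queries the
# partner, and persists the offer. The two helpers below are placeholders
# for the real partner API call and datastore write.
import json
from google.cloud import pubsub_v1

def query_partner_api(partner_id, event):
    # Placeholder: call the partner's API here; response times will vary widely.
    return {"partner": partner_id, "price": 499}

def persist_offer(request_id, offer):
    # Placeholder: write to your datastore of choice.
    print(request_id, offer)

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path("my-project", "find-offer-workers")

def handle(message):
    event = json.loads(message.data.decode("utf-8"))
    offer = query_partner_api(event["travel_partner_id"], event)
    persist_offer(event["request_id"], offer)
    message.ack()  # unacked messages are redelivered, possibly to another pod

streaming_pull = subscriber.subscribe(subscription_path, callback=handle)
streaming_pull.result()  # block forever; each message goes to one worker only
```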

ke3pup

2 Answers


I'm responding to this because no-one has in the last 2 days - not because I'm an expert in this. So with that in mind...

Definitely keep the user experience in mind. Do you want to provide 200 results back to the user? Not sure I would ever look at 200 results, even if the UX is phenomenally slick.

Basically you need some sort of orchestration to coordinate steps 2, 3 & 4 - not just issuing the requests, but also dealing with the data coming back. A key aspect of this orchestration is to decide what to do in "rainy day" scenarios, specifically those involving errors or delays:

  • If partner 168 doesn't reply within X seconds, what do you do?
  • If 199 partners have responded but 168 has not (still within time) do you wait?
  • If you're about to time-out and you only have 30 responses back, what do you do?
  • If your previous request to partner 168 timed-out, do you try them again now in this current request? ...Or do you try them in 10 seconds? ...Does the UI care you've only got 199 partners working right now?

If you can map that out in your head (and in a diagram) then that thought process should help you.
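To make those decision points concrete, here's a rough, broker-agnostic sketch of the gather step (all names, counts and timings are made up): wait up to a deadline, return whatever you have, and treat the stragglers however your requirements dictate.

```python
# Rough "gather with a deadline" sketch: collect whatever responses arrive
# within the time budget, then decide what to do with the stragglers.
# Everything here (timings, partner count, payloads) is made up.
import asyncio
import random

DEADLINE = 5.0  # seconds before we answer the user with partial results

async def fetch_offer(partner_id):
    await asyncio.sleep(random.uniform(0.1, 10.0))  # stand-in for a partner API call
    return {"partner": partner_id, "price": random.randint(100, 999)}

async def gather_offers(partner_ids):
    tasks = [asyncio.create_task(fetch_offer(p)) for p in partner_ids]
    done, pending = await asyncio.wait(tasks, timeout=DEADLINE)
    for task in pending:
        task.cancel()  # the "rainy day" branch: drop, retry, or flag these partners
    return [task.result() for task in done]

offers = asyncio.run(gather_offers([f"partner-{i}" for i in range(200)]))
print(f"{len(offers)} of 200 offers arrived within {DEADLINE}s")
```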

Event-centric solutions & tools should be good at coordinating results - e.g. deciding when to return results back up the chain to the UI. Have a look at event / asynchronous design patterns in general, and if you already have a specific technology in mind, see what patterns / reference ideas it offers.

Adrian K

Disclaimer: I just joined StackOverflow and am a member of Solace, a pioneer in the EDA & Event-enablement space.

This is a classic pub/sub problem that is well served by any JMS broker, or by a Solace or Kafka broker for better QoS.

Making a few assumptions: the request is triggered from a UI with the expectation of presenting responses in near real time as they arrive from the partners. The UI refresh can be left to a good frontend framework/stack of your choice - the crux of the matter is how this is handled in the backend.

An event-driven design will serve this requirement well - the flow would look like this:

  • Publish a request message to topic TRAVEL_DESTINATION_REQUEST with "reply-to" set to a Queue TRAVEL_DESTINATION_RESPONSE
  • Subscribers (partners) subscribe to the topic TRAVEL_DESTINATION_REQUEST and send their response to the "reply-to" destination
  • The publisher, in parallel, runs a thread (or callback) that checks for the arrival of response messages on the TRAVEL_DESTINATION_RESPONSE queue and takes appropriate action (push them to the client, persist them in a DB, or something like that), ensuring that all responses are processed - see the sketch below
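For illustration, here is a rough sketch of the requestor side using the solace-pubsubplus Python API. To keep it short it uses direct (non-persistent) messaging and a response topic subscription rather than a durable reply-to queue, and all connection properties are placeholders:

```python
# Requestor sketch (solace-pubsubplus, simplified): publish a request, then
# process responses as they arrive. Direct messaging and a response topic are
# used here for brevity; a durable reply-to queue is the more robust choice.
import json
import time
from solace.messaging.messaging_service import MessagingService
from solace.messaging.receiver.message_receiver import MessageHandler
from solace.messaging.resources.topic import Topic
from solace.messaging.resources.topic_subscription import TopicSubscription

props = {  # placeholder connection details
    "solace.messaging.transport.host": "tcp://broker.example.com:55555",
    "solace.messaging.service.vpn-name": "default",
    "solace.messaging.authentication.scheme.basic.username": "requestor",
    "solace.messaging.authentication.scheme.basic.password": "secret",
}
service = MessagingService.builder().from_properties(props).build()
service.connect()

class ResponseHandler(MessageHandler):
    def on_message(self, message):
        offer = json.loads(message.get_payload_as_string())
        print("offer received:", offer)  # persist to a DB / push to the client here

# Start listening for responses before publishing so none are missed.
receiver = (service.create_direct_message_receiver_builder()
            .with_subscriptions([TopicSubscription.of("TRAVEL_DESTINATION_RESPONSE")])
            .build())
receiver.start()
receiver.receive_async(ResponseHandler())

publisher = service.create_direct_message_publisher_builder().build()
publisher.start()
publisher.publish(destination=Topic.of("TRAVEL_DESTINATION_REQUEST"),
                  message=json.dumps({"date": "2024-01-15", "destination": "LHR"}))

time.sleep(10)  # demo only: keep the process alive while responses arrive
```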

Almost any broker can handle this use case - the complexity arises when you want to handle several such requests simultaneously without mixing up responses, and without a proliferation of topics, queues and consuming services that results in resource overrun and management overhead.

Here is a possible solution using Solace as the EDA broker. Solace's topic scheme is unique and well suited to this requirement. A topic is not just a name, but a scheme that can encode dynamic details as levels in the topic name, which can be useful while processing the message. Solace topics are hierarchical, allowing subscribers to use wildcards to filter on different levels of a topic.

With Solace and its hierarchical topics - we can manage this as follows:

  1. Publish requests on topics TRAVEL_DESTINATION_REQUEST/<request-id> and set the reply-to destination as RESPONSE_QUEUE
  2. All the partners subscribe to the topic with wildcard TRAVEL_DESTINATION_REQUEST/* so that they receive all travel request messages (see the partner-side sketch below)
  3. Either the publisher itself or a separate service could connect to the RESPONSE_QUEUE and retrieve the responses
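A sketch of the partner-side worker mentioned in (2), with the same placeholder connection setup: it wildcard-subscribes to all requests, reads the request id from the last topic level, and replies on the matching per-request response topic. The partner id and offer lookup are made up.

```python
# Partner-side worker sketch (solace-pubsubplus): wildcard-subscribe to all
# travel requests and reply on the per-request response topic.
import json
import threading
from solace.messaging.messaging_service import MessagingService
from solace.messaging.receiver.message_receiver import MessageHandler
from solace.messaging.resources.topic import Topic
from solace.messaging.resources.topic_subscription import TopicSubscription

props = {  # placeholder connection details
    "solace.messaging.transport.host": "tcp://broker.example.com:55555",
    "solace.messaging.service.vpn-name": "default",
    "solace.messaging.authentication.scheme.basic.username": "partner-168",
    "solace.messaging.authentication.scheme.basic.password": "secret",
}
service = MessagingService.builder().from_properties(props).build()
service.connect()

publisher = service.create_direct_message_publisher_builder().build()
publisher.start()

class RequestHandler(MessageHandler):
    def on_message(self, message):
        # e.g. TRAVEL_DESTINATION_REQUEST/req-42 -> request id "req-42"
        request_id = message.get_destination_name().split("/")[-1]
        request = json.loads(message.get_payload_as_string())
        offer = {"partner": "partner-168",            # placeholder offer lookup
                 "destination": request["destination"], "price": 499}
        publisher.publish(
            destination=Topic.of(f"TRAVEL_DESTINATION_RESPONSE/{request_id}"),
            message=json.dumps(offer))

receiver = (service.create_direct_message_receiver_builder()
            .with_subscriptions([TopicSubscription.of("TRAVEL_DESTINATION_REQUEST/*")])
            .build())
receiver.start()
receiver.receive_async(RequestHandler())

threading.Event().wait()  # keep the worker alive; messages arrive on API threads
```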

The last step (3) is where the topic hierarchy delivers the most benefit. You can create multiple, simultaneous client connections to the queue RESPONSE_QUEUE, each with a distinct subscription - it is like spawning a consumer service for every single published request-id, which in turn connects to the queue and subscribes to the response topic TRAVEL_DESTINATION_RESPONSE/<request-id>.

After some time or on a logical condition, these consumer services can exit, marking the completion of request processing. What happens inside the service is business logic - persist to a DB, push to the frontend, or something else. A sketch of such a per-request consumer follows.
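This sketch of the per-request consumer is again simplified to a direct topic subscription instead of the RESPONSE_QUEUE, and it assumes receive_message takes a millisecond timeout and returns None when it expires; counts, timings and connection details are placeholders.

```python
# Per-request consumer sketch: lives only for the duration of one request and
# exits once all expected offers have arrived or the deadline passes.
import json
import time
from solace.messaging.messaging_service import MessagingService
from solace.messaging.resources.topic_subscription import TopicSubscription

props = {  # placeholder connection details
    "solace.messaging.transport.host": "tcp://broker.example.com:55555",
    "solace.messaging.service.vpn-name": "default",
    "solace.messaging.authentication.scheme.basic.username": "aggregator",
    "solace.messaging.authentication.scheme.basic.password": "secret",
}
service = MessagingService.builder().from_properties(props).build()
service.connect()

REQUEST_ID = "req-42"          # placeholder correlation id
EXPECTED, DEADLINE = 200, 5.0  # placeholder partner count and time budget (s)

receiver = (service.create_direct_message_receiver_builder()
            .with_subscriptions(
                [TopicSubscription.of(f"TRAVEL_DESTINATION_RESPONSE/{REQUEST_ID}")])
            .build())
receiver.start()

offers, start = [], time.monotonic()
while len(offers) < EXPECTED and time.monotonic() - start < DEADLINE:
    message = receiver.receive_message(timeout=500)  # assumed: ms, None on expiry
    if message is not None:
        offers.append(json.loads(message.get_payload_as_string()))

receiver.terminate()  # this consumer exits, marking the request as complete
service.disconnect()
print(f"{len(offers)} offers collected for {REQUEST_ID}")
```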

Hope this lays out an approach using Solace as the broker for your requirement. I am sure other options are available and valid; I am just sharing an efficient approach based on the Solace broker.

gvensan