
There are several messaging standards that are push-based (e.g., MQTT, STOMP). Are there any messaging protocols that are pull-based?

Use-case:

Company A currently offers an MQTT endpoint that supports several topics. When an event is published to a topic, the MQTT server pushes the event to all the subscribers.

However, there are some shortcomings that company A would like to address:

  • When a new subscriber registers to receive events published to a topic, she would like to receive all the messages ever published to that topic.
  • Whenever there is a "hot" topic, i.e., a lot of events are published to that topic in a short period of time, dispatching those events to the registered subscribers might overwhelm the subscribers. So, subscribers would like to pull events from the topic at their own pace.

Current Solution:

Company A exposed an HTTP endpoint with a get-events operation that retrieves a range of events (e.g., events numbered 1034567 to 1034578 published to the topic named some-hot-topic in the curl invocation below):

curl 'http://pull.company-a.com/get-events?topic=some-hot-topic&start-at=1034567&stop-at=1034578'

The Question:

Instead of building this one-off solution, where Company A has to define the format of the URL and the format of the response payload, can Company A use an existing standard that already defines these (the standard does not have to be HTTP based)?

A couple of things come to mind (e.g., the Kafka Consumer REST API, RSS) that address a similar problem in a different context, but nothing seems to have been defined with the intent of serving as a standard for a pull-based event notification protocol.
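For illustration, the pull model behind the get-events endpoint can be sketched as a minimal in-memory topic log. The names (TopicLog, get_events) and the event shape here are hypothetical, chosen to mirror the start-at/stop-at query parameters above; they are not from any standard:

```python
# Minimal sketch of a pull-based topic log. Events get monotonically
# increasing per-topic offsets, so subscribers can replay from the
# beginning or pull small ranges at their own pace.

class TopicLog:
    def __init__(self):
        self._topics = {}  # topic name -> list of events; index = offset

    def publish(self, topic, event):
        """Append an event; its offset is its position in the log."""
        log = self._topics.setdefault(topic, [])
        log.append(event)
        return len(log) - 1  # the event's offset

    def get_events(self, topic, start_at, stop_at):
        """Return events with offsets in [start_at, stop_at], inclusive,
        mirroring the start-at/stop-at query parameters above."""
        log = self._topics.get(topic, [])
        return log[start_at:stop_at + 1]

broker = TopicLog()
for i in range(5):
    broker.publish("some-hot-topic", {"seq": i})

# A new subscriber replays everything from offset 0; a slow
# subscriber pulls a narrow range and comes back later.
print(broker.get_events("some-hot-topic", 0, 4))
print(broker.get_events("some-hot-topic", 2, 3))
```

Because the server only answers explicit range requests and never pushes, both shortcomings above (full replay for new subscribers, consumer-paced delivery for hot topics) fall out of the same offset-addressed read operation.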

Raghu Dodda

1 Answer


I have some thoughts on how this can be achieved with Kafka. I do not have any experience with MQTT, so maybe the same can be achieved there.

  1. Kafka consumers have an offset - it controls the current position from which the consumer receives new messages from the queue. As a consumer in the group reads messages from the partitions assigned by the coordinator, it commits the offsets corresponding to the messages it has read. So, from my understanding, your new consumers can start reading from the very beginning and hence receive all the messages. You need to read more about Kafka partitions, consumer groups, and offsets to understand this better.
  2. I guess you can just stop consuming new events if, let's say, you consumed 100 in a short period of time. That gives you some time to process them, and then you can resume your consumer.
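The two points above can be illustrated with a small plain-Python simulation of a consumer that tracks its own offset. This mimics the behaviour of a Kafka consumer but does not use the real Kafka client API (with the real client, the corresponding calls would be along the lines of seek_to_beginning and pause/resume on the consumer):

```python
# Plain-Python simulation of points 1 and 2 above; names like
# SimulatedConsumer are illustrative, not the Kafka client API.

class SimulatedConsumer:
    def __init__(self, log):
        self.log = log          # the shared, append-only message log
        self.offset = 0         # committed position in the log
        self.paused = False

    def seek_to_beginning(self):
        """Point 1: a new consumer can start at offset 0 and replay all."""
        self.offset = 0

    def poll(self, max_records=100):
        """Point 2: pull at most max_records, so the consumer sets the pace."""
        if self.paused:
            return []
        batch = self.log[self.offset:self.offset + max_records]
        self.offset += len(batch)  # "commit" the offset after reading
        return batch

    def pause(self):
        self.paused = True

    def resume(self):
        self.paused = False

log = [f"event-{i}" for i in range(250)]
c = SimulatedConsumer(log)
c.seek_to_beginning()
first = c.poll(100)     # take 100 events, then go process them
c.pause()               # backpressure during a "hot" burst
stalled = c.poll()      # nothing is delivered while paused
c.resume()
second = c.poll(100)    # pick up exactly where we left off
print(len(first), len(stalled), len(second), c.offset)  # → 100 0 100 200
```

Because the offset is owned by the consumer, the broker never overwhelms it: delivery only happens when the consumer asks, which is the pull-based behaviour the question is after.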

More info about Kafka consumers: https://docs.confluent.io/current/clients/consumer.html

There is also some good discussion about Kafka offsets on SO: https://stackoverflow.com/a/32392174/574475

Rubycon
  • The question is not whether it can be done with Kafka, but whether there is a standard (perhaps a REST API) that all Kafka brokers must implement in order to be consumed. They do have one, but it does not seem to have gone through any standardization. – Raghu Dodda Aug 16 '18 at 21:20
  • No, there is no official Kafka HTTP API. Kafka itself provides its own protocol for producers and consumers: this is the main idea of the Kafka message queue. – Rubycon Aug 17 '18 at 08:01