
The Redis team introduced the new Streams data type in Redis 5.0. Since Streams look like Kafka topics at first glance, it seems difficult to find real-world examples of using them.

The Streams intro compares them with Kafka:

  1. Consumer groups are handled at runtime. For example, if one of three consumers fails permanently, Redis will continue to serve the first and second, because now we simply have two logical partitions (consumers).
  2. Redis Streams are much faster. They are stored in and served from memory, so this is to be expected.

We have some projects that use Kafka, RabbitMQ and NATS. Now we are taking a deep look at Redis Streams, trying to use them as a "pre-Kafka cache" and in some cases as a Kafka/NATS alternative. The most critical point right now is replication:

  1. All data is stored in memory, with AOF persistence.
  2. By default, asynchronous replication does not guarantee that XADD commands or consumer group state changes are replicated: after a failover something can be missing, depending on the ability of followers to receive the data from the master. This looks like the point that kills any interest in trying Streams under high load.
  3. The Redis failover process, as operated by Sentinel or Redis Cluster, performs only a best-effort check to fail over to the most up-to-date follower, and under certain specific failures may promote a follower that lacks some data.

Then there is the cap strategy. The real "capped resource" with Redis Streams is memory, so it does not really matter how many items you want to store or which capping strategy you use. Each time your consumer fails, you get either peak memory consumption (without a cap) or lost messages (with a cap).
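To make that trade-off concrete, here is a toy model in Python. It is only a sketch: `collections.deque` stands in for a stream trimmed with `XADD ... MAXLEN`, and the function name `backlog_after_outage` is made up for illustration. It does not talk to Redis at all.

```python
from collections import deque
from typing import Optional, Tuple


def backlog_after_outage(produced: int, maxlen: Optional[int]) -> Tuple[int, int]:
    """Simulate a consumer outage during which `produced` entries are appended.

    maxlen=None models an uncapped stream; an integer models a hard cap
    that evicts the oldest entries first.
    Returns (entries still in the stream, entries lost to trimming).
    """
    stream = deque(maxlen=maxlen)
    for entry_id in range(produced):
        stream.append(entry_id)  # oldest entry is dropped once the cap is hit
    retained = len(stream)
    return retained, produced - retained


# Uncapped: nothing is lost, but the backlog (memory) grows with the outage.
uncapped = backlog_after_outage(1000, None)
# Capped at 100 entries: memory is bounded, but older entries are gone.
capped = backlog_after_outage(1000, 100)
print(uncapped, capped)
```

Either the first number grows without bound or the second one becomes non-zero; the cap only chooses which of the two you get.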

We use Kafka as an RTB bidder frontend that handles ~1,100,000 messages per second with a ~120-byte payload. With Redis we see ~170 MB/s of memory consumption on write, so a server with 512 GB of RAM gives us a write "reserve" of ~50 minutes of data. If the processing system were offline for that long, we would crash.
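The arithmetic behind the ~50 minutes can be checked with a few lines of Python. This is a back-of-the-envelope sketch that assumes decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes) and that all 512 GB are available to the stream, which will not quite be true in practice.

```python
MESSAGES_PER_SECOND = 1_100_000
PAYLOAD_BYTES = 120

# Raw payload throughput, before any per-entry overhead.
payload_rate = MESSAGES_PER_SECOND * PAYLOAD_BYTES  # bytes per second

# Observed memory growth on write, as quoted above (~170 MB/s).
fill_rate = 170 * 10**6  # bytes per second

# Assumption: the whole 512 GB is usable for stream data.
ram_bytes = 512 * 10**9

seconds_until_full = ram_bytes / fill_rate
minutes_until_full = seconds_until_full / 60

print(f"payload rate: {payload_rate / 10**6:.0f} MB/s")
print(f"time until RAM is full: {minutes_until_full:.1f} minutes")
```

The payload alone accounts for about 132 MB/s; the rest of the ~170 MB/s is presumably per-entry overhead.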

Could you please tell more about how Redis Streams are used in the real world, and maybe about some cases where you have tried to use them yourself? Or maybe Redis Streams are better suited to smaller amounts of data?

OneCricketeer
Nick Bondarenko
  • I've been playing with Redis Streams, and yes, there are some cases where Kafka/RabbitMQ can be replaced, especially when performance is more relevant than data persistence. I'm maintaining a repository that implements a bus on Node.js backed by Redis Streams: [hfxbus](https://github.com/exocet-engineering/hfx-bus) – Victor França Apr 06 '19 at 20:49

1 Answer


This feels like a discussion that belongs in the redis-db mailing list, but the use case sounds fascinating.

Note that Redis Streams are not intended to be a Kafka replacement - they provide different properties and capabilities despite the similarities. You are of course correct with regard to the asynchronous nature of replication. As for scaling the amount of RAM available, you should consider using a cluster and partitioning your streams across period-based key names.
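As a rough sketch of what period-based key names could look like, the Python helper below derives the stream key from the start of the time bucket an entry falls into. The function name `period_stream_key`, the base name `bids` and the one-hour default period are all assumptions made for illustration, not something Redis prescribes.

```python
from datetime import datetime, timezone


def period_stream_key(base: str, ts: datetime, period_seconds: int = 3600) -> str:
    """Return a stream key such as 'bids:7200' for the period containing ts.

    The suffix is the Unix timestamp at which the period starts, so all
    entries written within the same period share one key.
    """
    epoch = int(ts.timestamp())
    bucket_start = epoch - epoch % period_seconds
    return f"{base}:{bucket_start}"


# Two timestamps on either side of an hour boundary map to different keys.
before = datetime.fromtimestamp(7199, tz=timezone.utc)
after = datetime.fromtimestamp(7265, tz=timezone.utc)
print(period_stream_key("bids", before))  # bids:3600
print(period_stream_key("bids", after))   # bids:7200
```

A producer would XADD to the key for the current period, and a fully processed period can be dropped as a whole key. In a cluster, different key names generally hash to different slots, so older periods end up spread across nodes; note that at any given moment all writes still go to a single key.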

Asclepius
Itamar Haber