We have a Kafka queue with two consumers, both read from the same partition (fan-out scenario). One of those consumers should be the canary and process 1% of the messages, while the other processes the 99% remaining ones.
The idea is to make the decision based on a property of the message, eg the message ID or timestamp (e.g. mod 100), and accept or drop based on that, just with a reversed logic for canary and non-canary.
Now we are facing the issue of how to do so robustly, e.g. reconfigure percentages while running and avoid loosing messages or processing them twice. It appears this escalates to a distributed consensus problem to keep the decision logic in sync, which we would very much like to avoid, even though we could just use ZooKeeper for that.
Is this a viable strategy, or are there better ways to do this? Possibly one that avoids consensus?
Update: Unfortunately the Kafka Cluster is not under our control, and we cannot make any changes.
Update 2 Latency of messages is not a huge issues, a few hundred 100ms added are okay and won't be noticed.