I have a stream of events that can be categorized by type and hourly timestamp. My initial thought was to put events into separate Kafka topics (one topic per category). However, that could easily end up with hundreds of topics. Plus, if they're not cleaned up properly (programmed dynamically[1] in my case), the system is likely to be left with thousands of them. From what I have read[2], that seems to cause significant overhead in ZooKeeper.
My second thought was to stream all events to a single topic and create multiple consumers. The downside is wasted bandwidth, because every consumer has to read through all events to find the ones it is interested in.
Another approach is to combine the first and second methods and find a tradeoff, i.e. create one topic with multiple partitions, where several categories of events share the same partition.
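To make the third option concrete, here is a rough sketch of what I have in mind, in plain Python with no Kafka client involved. The partition count, the category names, and the helper names (`partition_for`, `categories_by_partition`, `relevant`) are all made up for illustration. I use `zlib.crc32` only because it is stable across processes; as far as I know Kafka's default Java partitioner hashes the key bytes with murmur2, so the real assignment would differ.

```python
import zlib

NUM_PARTITIONS = 8  # hypothetical partition count for the single topic


def partition_for(category: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a category name to a partition index with a stable hash.

    zlib.crc32 is a stand-in here; it is deterministic across processes,
    unlike Python's built-in hash() for strings.
    """
    return zlib.crc32(category.encode("utf-8")) % num_partitions


def categories_by_partition(categories, num_partitions: int = NUM_PARTITIONS):
    """Group categories by the partition they would land in."""
    grouping = {}
    for category in categories:
        index = partition_for(category, num_partitions)
        grouping.setdefault(index, []).append(category)
    return grouping


def relevant(events, wanted):
    """Consumer-side filter: keep only the events of the wanted categories.

    A consumer reading one partition still sees every category that hashes
    to that partition, so it has to discard the rest.
    """
    return [event for event in events if event["category"] in wanted]


if __name__ == "__main__":
    # Hypothetical category names.
    names = ["login", "purchase", "click", "logout"]
    print(categories_by_partition(names))
```

With this layout a consumer only reads the partition(s) its categories map to, so it skips events from other partitions entirely, but it still has to filter out unrelated categories that happen to share its partition.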
I'd like to know what the sane approach is in this scenario.