I have a Kafka cluster in a data center. A number of clients, some of which may communicate across WANs (or even the internet), will send and receive real-time messages to and from the cluster.
I read in Kafka's documentation:
...It is possible to read from or write to a remote Kafka cluster over the WAN though TCP tuning will be necessary for high-latency links.
It is generally not advisable to run a single Kafka cluster that spans multiple datacenters as this will incur very high replication latency both for Kafka writes and Zookeeper writes and neither Kafka nor Zookeeper will remain available if the network partitions.
From what I understand from here and here:
- Producing over a WAN doesn't require ZK and is okay, as long as TCP is tuned for high-latency connections. Great! Check.
- The high-level consumer API requires ZK connections.
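Aren't clients that read from or write to Kafka over a WAN then subject to the same limitations the documentation describes above for clusters spanning datacenters, namely high latency on ZooKeeper traffic and loss of availability if the network partitions?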
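For context, here is how I currently read the "TCP tuning" part for the producer side. This is only a rough sketch: it assumes the tuning mostly means sizing socket buffers to the link's bandwidth-delay product (bandwidth times round-trip time) and applying that through the producer's `send.buffer.bytes` and `receive.buffer.bytes` settings. The helper names and the 100 Mbit/s, 80 ms figures are made up for illustration, and OS-level socket limits may still cap whatever the client requests.

```python
# Rough sketch: size producer socket buffers from the bandwidth-delay product.
# The function names and example link figures are hypothetical, not from Kafka.


def bandwidth_delay_product_bytes(bandwidth_mbps: int, rtt_ms: int) -> int:
    """Bytes that must be in flight to keep the link full: bandwidth * RTT."""
    # Mbit/s * ms = (1_000_000 bits/s) * (1/1000 s) = 1000 bits per unit.
    bits_in_flight = bandwidth_mbps * 1000 * rtt_ms
    # Round up to whole bytes using integer arithmetic.
    return (bits_in_flight + 7) // 8


def wan_producer_overrides(bandwidth_mbps: int, rtt_ms: int) -> dict:
    """Producer settings to override for a high-latency link (assumption:
    both socket buffers are simply set to the bandwidth-delay product)."""
    bdp = bandwidth_delay_product_bytes(bandwidth_mbps, rtt_ms)
    return {
        "send.buffer.bytes": bdp,
        "receive.buffer.bytes": bdp,
    }


if __name__ == "__main__":
    # Example link: 100 Mbit/s with an 80 ms round trip (made-up numbers).
    for key, value in wan_producer_overrides(100, 80).items():
        print(f"{key}={value}")
```

The printed `key=value` lines are in the form a producer properties file expects. This only covers the producing side; it says nothing about the ZK connections the high-level consumer needs, which is the part I am asking about.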