Kafka Connect source and sink connectors provide practically ideal feature set for configuring a data pipeline without writing any code. In my case I wanted to use it for integrating data from several DB servers (producers) located on the public Internet.
However some producers don't have direct access to Kafka brokers as their network/firewall configuration allows traffic to a specific host only (port 443). And unfortunately I cannot really change these settings.
My thought was to use Confluent REST Proxy but I learned that Kafka Connect uses KafkaProducer API so it needs direct access to brokers.
I found a couple possible workarounds but none is perfect:
- SSH Tunnel - as described in: Consume from a Kafka Cluster through SSH Tunnel
- Use REST Proxy but replace Kafka Connect with custom producers, mentioned in How can we configure kafka producer behind a firewall/proxy?
- Use SSHL demultiplexer to route the trafic to broker (but just one broker)
Has anyone faced similar challenge? How did you solve it?