I have been trying to set up a ZooKeeper cluster on Google Compute Engine (GCE) and have run into issues when using the external IPs of the machines. My cluster consists of 3 nodes, each on its own GCE instance.

Now, when I configure each node to use the external IP of its instance, the nodes seem to be unable to communicate with each other.

zoo.cfg

tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=externalIp1:2888:3888
server.2=externalIp2:2888:3888
server.3=externalIp3:2888:3888

If I configure them with their internal IPs, however, everything works perfectly fine. My guess is that when ZooKeeper starts up, it binds itself to the internal IP of the instance regardless of the configuration. Because of this, when each node tries to reach the other 2 using the external IPs they were configured with, it is unable to find them.
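One way to test the guess above is to check whether the instance can bind a socket to a given address at all. On GCE the external IP is normally mapped to the VM through NAT rather than assigned to its network interface, so a bind to it would be expected to fail. The following is a minimal sketch, not part of the original question; `can_bind` is a hypothetical helper name, and the addresses passed on the command line are whatever IPs you want to check.

```python
import socket
import sys


def can_bind(ip: str) -> bool:
    """Return True if a TCP socket can be bound to `ip` on this machine."""
    # Pick the address family from the literal form of the address.
    family = socket.AF_INET6 if ":" in ip else socket.AF_INET
    with socket.socket(family, socket.SOCK_STREAM) as sock:
        try:
            # Port 0 asks the OS for any free port, so only the address matters.
            sock.bind((ip, 0))
        except OSError:
            # Typically "Cannot assign requested address" when the IP
            # is not configured on any local interface.
            return False
    return True


if __name__ == "__main__":
    for address in sys.argv[1:]:
        state = "bindable" if can_bind(address) else "not bindable"
        print(f"{address}: {state}")
```

Running it on one of the instances with both its internal and external IP as arguments should show which of the two the machine actually owns.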

So my question is: is there any way to make ZooKeeper use the external IP of the machine instead of the internal one? I'm relatively new to the Google Cloud Platform and to setting up hardware in general, so I'm not sure whether something like IP forwarding, firewall rules, or something else would achieve what I'm trying to do (assuming it's even possible).

Luis Medina
  • Can you clarify why you would want them to talk to each other over external IPs? Are the users of this ZK cluster external to GCE? Do you maybe want to have them talk to each other using the internal network, and put an authenticating frontend to forward the connections to one of the Zookeeper instances? – Misha Brukman Dec 22 '14 at 20:48

1 Answer

According to the ZooKeeper 3.4.5 docs, you need to specify the following option:

clientPortAddress

New in 3.3.0: the address (ipv4, ipv6 or hostname) to listen for client connections; that is, the address that clients attempt to connect to. This is optional, by default we bind in such a way that any connection to the clientPort for any address/interface/nic on the server will be accepted.

By default, however, it appears to bind to all available IPs on the server, so in theory it should have worked as you have set it up.

Important note: if ZooKeeper instances talk to each other using external IPs rather than internal IPs, you will be charged for data egress, whereas if all communication goes over the internal network (using internal IPs) within the same zone, you won't.

Misha Brukman
  • Hi Misha - I must have looked through the entire list of ZooKeeper options 10 times, but I somehow managed to miss this one. After setting it to the external IP, ZooKeeper still didn't work. You bring up a good point about egress charges, which I agree is an important thing to keep in mind. Because of this I decided to use the instances' hostnames instead of their internal IPs, as both go through the internal network and hostnames are a bit easier to automate through Ansible. So for now I won't worry about getting the external IPs to work. Thank you very much for your help! – Luis Medina Dec 23 '14 at 03:48
  • @MishaBrukman Could you please explain how this issue got resolved? Following on from this, how can I get the hostname of a GCP Compute instance? – wandermonk Nov 03 '18 at 20:44
  • @wandermonk – Luis Medina (the OP) is the person who resolved their issue, not me; I'm just trying to help. :-) For more info on GCE VM hostnames and DNS, please see https://cloud.google.com/compute/docs/internal-dns and https://cloud.google.com/compute/docs/storing-retrieving-metadata . – Misha Brukman Nov 04 '18 at 00:17
  • @LuisMedina Could you please explain how this issue got resolved? Following on from this, how can I get the hostname of a GCP Compute instance? – wandermonk Nov 04 '18 at 04:32
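The hostname-based setup described in the comments above could be sketched roughly as follows. This is an illustration under stated assumptions, not the original poster's actual automation: `gce_hostname` and `render_server_lines` are hypothetical helper names, and the hostnames `zk-1`, `zk-2`, `zk-3` are placeholders. The metadata URL and the `Metadata-Flavor: Google` header follow the GCE metadata documentation linked above, and the lookup only works when run on a GCE VM.

```python
import urllib.request

# GCE metadata server endpoint that returns the VM's internal DNS hostname.
METADATA_HOSTNAME_URL = (
    "http://metadata.google.internal/computeMetadata/v1/instance/hostname"
)


def gce_hostname(timeout: float = 2.0) -> str:
    """Ask the metadata server for this VM's hostname (GCE VMs only)."""
    request = urllib.request.Request(
        METADATA_HOSTNAME_URL,
        headers={"Metadata-Flavor": "Google"},  # required by the metadata server
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def render_server_lines(hostnames, peer_port=2888, election_port=3888):
    """Build the server.N lines of zoo.cfg from an ordered list of hostnames."""
    return [
        f"server.{server_id}={host}:{peer_port}:{election_port}"
        for server_id, host in enumerate(hostnames, start=1)
    ]


if __name__ == "__main__":
    # Placeholder hostnames; substitute the instances' real internal names.
    for line in render_server_lines(["zk-1", "zk-2", "zk-3"]):
        print(line)
```

The generated lines replace the `server.N=externalIpN:2888:3888` entries in zoo.cfg. Each node still needs a `myid` file in its `dataDir` whose number matches its own `server.N` entry.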