
Requirement: The nameservice of a Hadoop NameNode HA setup should be discoverable across clusters.

Solution #1: One solution I found online is to add each cluster's nameservice configuration to the hdfs-site.xml files of all the other clusters involved.
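For reference, this is the kind of per-cluster wiring that approach implies. A minimal sketch of the client-side hdfs-site.xml entries needed to make one remote HA nameservice resolvable; the nameservice name `clusterB`, the NameNode IDs, and the hostnames are placeholders:

```xml
<!-- Hypothetical entries in a client cluster's hdfs-site.xml declaring
     the remote HA nameservice "clusterB"; hosts/ports are placeholders. -->
<property>
  <name>dfs.nameservices</name>
  <value>clusterA,clusterB</value>
</property>
<property>
  <name>dfs.ha.namenodes.clusterB</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.clusterB.nn1</name>
  <value>nn1.clusterB.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.clusterB.nn2</name>
  <value>nn2.clusterB.example.com:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.clusterB</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```

A block like this has to be replicated into every other cluster's hdfs-site.xml, which is what makes the maintenance effort grow quadratically with the number of clusters.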

Problem: We have 10 clusters and growing; we cannot add new cluster definitions to all the existing clusters every time a new cluster is deployed.

Pros: Manageable for a few clusters. Cons: Does not scale as the number of clusters grows.

Solution #2: We are planning to work on a second solution: a central service that resolves nameservices across clusters, combined with a custom class extending org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider.
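For what it's worth, here is a minimal sketch of what that custom provider could look like. `CentralNameserviceRegistry` is hypothetical (it stands in for whatever central lookup service gets built, e.g. ZooKeeper- or REST-backed), and the constructor signature matches the Hadoop 2.7-era ConfiguredFailoverProxyProvider; later Hadoop versions add an HAProxyFactory argument:

```java
import java.net.URI;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider;

public class CentrallyResolvedFailoverProxyProvider<T>
    extends ConfiguredFailoverProxyProvider<T> {

  public CentrallyResolvedFailoverProxyProvider(
      Configuration conf, URI uri, Class<T> xface) {
    // Resolve the nameservice via the central service *before* the
    // parent constructor reads the dfs.ha.namenodes.* keys.
    super(resolve(conf, uri), uri, xface);
  }

  private static Configuration resolve(Configuration conf, URI uri) {
    String nameservice = uri.getHost(); // e.g. "clusterB"
    // CentralNameserviceRegistry is hypothetical: it would return the
    // dfs.* keys for this nameservice from ZooKeeper, a REST endpoint, etc.
    Map<String, String> entries =
        CentralNameserviceRegistry.lookup(nameservice);
    Configuration enriched = new Configuration(conf);
    entries.forEach(enriched::set);
    return enriched;
  }
}
```

Each client cluster would then point `dfs.client.failover.proxy.provider.<nameservice>` at this class, so only the central service, not every hdfs-site.xml, has to learn about new clusters.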

Please provide any input on better solutions, or pointers to existing patches already filed for this issue.

  • Have you seen Twitter’s approach: https://blog.twitter.com/2015/hadoop-filesystem-at-twitter + http://www.slideshare.net/gerashegalov/t-235pvijaya-renuv2 - they have a wrapper for multiple clusters? Another way could be to use `xi:include` and fetch configuration from a central reliable source (zookeeper?) - see the sketch after these comments. – rav May 15 '16 at 19:13
  • Thanks for the information. I read the blog from Twitter. Please correct me if my understanding is wrong: I believe the solution discussed in the blog applies only to federated clusters. Correct? What if I don't have a federated cluster? Moreover, it was mentioned that Twitter has their own wrapper around Hadoop which loads the configs dynamically, which means we would have to reconfigure all our clusters. – Mahi jupalli May 17 '16 at 06:05
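To make the `xi:include` suggestion from the first comment concrete, here is a minimal sketch. The Hadoop 2.x Configuration loader understands XInclude and nested `<configuration>` elements; the URL below is a placeholder, and newer Hadoop releases may restrict remote includes:

```xml
<?xml version="1.0"?>
<!-- hdfs-site.xml on each cluster: local properties stay here, while
     shared nameservice definitions come from one central file.
     The URL is a placeholder. -->
<configuration xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="http://config.example.com/shared-nameservices.xml"/>
  <!-- cluster-specific properties below -->
</configuration>
```

The central file would itself carry a `<configuration>` root containing the `dfs.nameservices` and `dfs.namenode.rpc-address.*` properties shown earlier, so a new cluster only has to be registered in one place.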
