
I have the following use case:
I wanted to connect to a remote Hadoop cluster. So I collected all the Hadoop conf files (core-site.xml, hdfs-site.xml and others) and stored them in a directory on my local file system, and I obtained the correct keytab and krb5.conf file for Kerberos authentication. I installed Hadoop by untarring the release under a directory, say /User/xyz/hadoop, set the environment variables JAVA_HOME, HADOOP_HOME and HADOOP_CONF_DIR, and finally placed my krb5.conf file under /etc/. This setup let me authenticate successfully with kinit -kt <keytab> <principal> and run Hadoop commands like hadoop fs -ls / from my local terminal to access the cluster.

However, I want to perform the same actions without installing Hadoop locally. Is there a way? I am using Python and came across the hdfs Python library (hdfscli), but I had a hard time understanding and working with it.
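
As far as I can tell, the hdfs library talks to the cluster over the WebHDFS REST API, so it should not need a local Hadoop install at all, only a valid Kerberos ticket from kinit and WebHDFS (or HttpFS) enabled on the cluster. This is roughly what I was attempting, if I understand the library correctly; the hostname and port are placeholders for my cluster's WebHDFS endpoint, and the library is installed with its Kerberos extra (pip install hdfs[kerberos]):

    from hdfs.ext.kerberos import KerberosClient

    # Assumes a valid ticket is already in the cache, i.e. after running:
    #   kinit -kt <keytab> <principal>
    # namenode.example.com:50070 is a placeholder; Hadoop 3.x clusters
    # typically expose WebHDFS on port 9870 instead.
    client = KerberosClient('http://namenode.example.com:50070')

    # Equivalent of `hadoop fs -ls /`
    print(client.list('/'))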

  1. Is what I am trying to achieve possible?
  2. If so, what is the right way to do it?
  3. Can someone guide me on setting up the hdfscli library with the right configuration? (My attempt at a config file is sketched below.)
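
For reference, the hdfscli command-line tool reads its aliases from ~/.hdfscli.cfg, and this is roughly the configuration I was experimenting with (the alias name, host and port are placeholders for my cluster's WebHDFS endpoint):

    [global]
    default.alias = dev
    autoload.modules = hdfs.ext.kerberos

    [dev.alias]
    url = http://namenode.example.com:50070
    client = KerberosClient

My understanding is that, with a valid ticket from kinit, running hdfscli --alias=dev should then open an interactive session against the cluster, but I could not get this working.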
Neil
