
We have a remote Hadoop cluster running on RHEL, and we need to access HDFS files from a Windows desktop, so I have written Java programs to do this.

The thing is, we did not have Kerberos enabled earlier, so I could connect using the following code:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

Configuration conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://one.hdp:8020");
FileSystem fs = FileSystem.get(conf);
FileStatus[] fsStatus = fs.listStatus(new Path("/"));
for (FileStatus status : fsStatus) {
    System.out.println(status.getPath().toString());
}

Now that Kerberos has been enabled, I followed this site http://henning.kropponline.de/2016/02/14/a-secure-hdfs-client-example/ and, based on its "Providing Credentials from Login" section, wrote the following, which uses JAAS/GSS-API to do a kinit-style login.

The callback handler:

import java.io.IOException;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.auth.callback.UnsupportedCallbackException;
import javax.security.auth.login.LoginContext;
import javax.security.auth.login.LoginException;

private static String username = "hdfs-user";
private static char[] password = "hadoop".toCharArray();

public static LoginContext kinit() throws LoginException {
    LoginContext lc = new LoginContext(HdfsMain.class.getSimpleName(), new CallbackHandler() {
        public void handle(Callback[] callbacks) throws IOException, UnsupportedCallbackException {
            for (Callback c : callbacks) {
                if (c instanceof NameCallback)
                    ((NameCallback) c).setName(username);
                if (c instanceof PasswordCallback)
                    ((PasswordCallback) c).setPassword(password);
            }
        }
    });
    lc.login();
    return lc;
}
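For the Krb5LoginModule to locate the KDC, the JVM also needs a krb5.conf (pointed to via the java.security.krb5.conf system property, or found in a default location). A minimal sketch — the realm and host names below are assumptions, not values from this cluster; the real file should be copied from /etc/krb5.conf on a cluster node:

```
[libdefaults]
  default_realm = EXAMPLE.COM

[realms]
  EXAMPLE.COM = {
    kdc = kdc.example.com
    admin_server = kdc.example.com
  }
```

A mismatch between the realm/KDC here and what the cluster actually uses is one plausible source of login failures.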

The JAAS configuration, HdfsMain.conf:

HdfsMain {
  com.sun.security.auth.module.Krb5LoginModule required client=TRUE;
};
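The approach from the linked article also requires telling the JVM where this JAAS file (and krb5.conf) live before the login runs. A sketch of that setup — the Windows paths are placeholders, not from the original post:

```java
public class KerberosJvmSetup {
    public static void main(String[] args) {
        // Placeholder paths -- point these at your real files.
        System.setProperty("java.security.auth.login.config", "C:/kerberos/jaas.conf");
        System.setProperty("java.security.krb5.conf", "C:/kerberos/krb5.conf");
        // Optional: verbose Kerberos tracing, handy when debugging KrbException errors.
        System.setProperty("sun.security.krb5.debug", "true");

        System.out.println(System.getProperty("java.security.auth.login.config"));
        System.out.println(System.getProperty("java.security.krb5.conf"));
    }
}
```

The same properties can be passed on the command line instead, e.g. -Djava.security.krb5.conf=C:/kerberos/krb5.conf.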

Code to connect:

import org.apache.hadoop.security.UserGroupInformation;

Configuration conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://one.hdp:8020");
conf.set("hadoop.security.authentication", "kerberos");
UserGroupInformation.setConfiguration(conf);

LoginContext lc = kinit();
UserGroupInformation.loginUserFromSubject(lc.getSubject());

FileSystem fs = FileSystem.get(conf);
FileStatus[] fsStatus = fs.listStatus(new Path("/"));

for (FileStatus status : fsStatus) {
    System.out.println(status.getPath().toString());
}

Now I'm getting the following error:

Caused by: KrbException: null (68)
    at sun.security.krb5.KrbAsRep.<init>(KrbAsRep.java:76)
    at sun.security.krb5.KrbAsReqBuilder.send(KrbAsReqBuilder.java:316)
    at sun.security.krb5.KrbAsReqBuilder.action(KrbAsReqBuilder.java:361)
    at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:766)
    ... 15 more
Caused by: KrbException: Identifier doesn't match expected value (906)
    at sun.security.krb5.internal.KDCRep.init(KDCRep.java:140)
    at sun.security.krb5.internal.ASRep.init(ASRep.java:64)
    at sun.security.krb5.internal.ASRep.<init>(ASRep.java:59)
    at sun.security.krb5.KrbAsRep.<init>(KrbAsRep.java:60)
    ... 18 more

And I'm not able to log in.

NOTE: I do not have a keytab file to test out that approach.

Any help would be appreciated.

  • *"I do not have a Keytab file "* -- but you can connect to the Edge Node and get a Linux shell, right? Then you can *(a)* get a copy of `/etc/krb5.conf` because you need that configuration to contact the Kerberos service, and *(b)* use `ktutil` command to create a keytab file. – Samson Scharfrichter Sep 18 '16 at 19:29
  • OMG - I just realized that you are using the HDFS Java API to access WebHDFS. Why don't you use a plain HTTP connector and get rid of all the Hadoop libraries???? Cf. http://stackoverflow.com/questions/37459073/accessing-kerberos-secured-webhdfs-without-spnego/37480236#37480236 for the principles plus http://stackoverflow.com/questions/38768664/cannot-connect-to-hive-with-secured-kerberos-i-am-using-usergroupinformation-lo/38786116#38786116 for the details of Kerberos authentication -- then you have to parse the JSON returned by WebHDFS but that's not a big deal. – Samson Scharfrichter Sep 18 '16 at 19:49
  • ...plus http://stackoverflow.com/questions/8935083/java-ssl-connection-using-truststore to tell Java that the SSL certificate of the NameNode can be trusted *(I guess the Hadoop admins can provide the JKS file)* – Samson Scharfrichter Sep 18 '16 at 19:56
  • @SamsonScharfrichter .. Thanks for your replies. Was very helpful. So, I am able to authenticate and connect to HDFS. And also was able to check things like if file/directory exists etc. from a java program. But, When I try reading a file I get this error... **_sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target_** ... I am guessing this is because of SSL certificate issue. It'll be helpful if you can tell what exactly I must ask ? – abisheksampath Sep 30 '16 at 12:49
  • The NameNode stores only the location of the files, not the data. So it does a **HTTP redirect** towards one of the DataNodes. And that's a new connection, with a *different* SSL certificate to check -- if your certificates are signed by a common Certificate Authority then you just need that CA in the JKS; but if the certificates are self-signed then you must add each and every cert in the JKS. – Samson Scharfrichter Sep 30 '16 at 13:34
  • I might have to check that with the admins. Thanks a lot @SamsonScharfrichter :) – abisheksampath Sep 30 '16 at 14:40
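Following up on the truststore discussion in the comments above: one common way to make the JVM trust the NameNode and DataNode certificates is to point it at a JKS truststore before any connection is opened. A sketch — the path and password here are assumptions; the actual JKS file and password should come from the Hadoop admins:

```java
public class TruststoreSetup {
    public static void main(String[] args) {
        // Assumed path and password -- replace with the JKS provided by the cluster admins.
        System.setProperty("javax.net.ssl.trustStore", "C:/kerberos/hadoop-truststore.jks");
        System.setProperty("javax.net.ssl.trustStorePassword", "changeit");

        System.out.println(System.getProperty("javax.net.ssl.trustStore"));
    }
}
```

As noted in the comments, reads are redirected to a DataNode with its own certificate, so the truststore must cover the DataNode certs (or their common CA) as well, not just the NameNode's.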
