
I am trying to secure my HDP2 Hadoop cluster using Kerberos.

So far HDFS, Hive, HBase, Hue Beeswax, and Hue's job/task browsers work properly. However, Hue's File Browser does not; it responds with:

WebHdfsException at /filebrowser/
AccessControlException: SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS] (error 500)
Request Method: GET
Request URL:    http://bt1svlmy:8000/filebrowser/
Django Version: 1.2.3
Exception Type: WebHdfsException
Exception Value:    
AccessControlException: SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS] (error 500)
Exception Location: /usr/lib/hue/desktop/libs/hadoop/src/hadoop/fs/webhdfs.py in _stats, line 208
Python Executable:  /usr/bin/python2.6
Python Version: 2.6.6
(...)

My hue.ini file is configured with security_enabled=true in every relevant section, and the other related parameters are set.


I believe the problem is with WebHDFS.

I tried the curl commands given at http://hadoop.apache.org/docs/r1.0.4/webhdfs.html#Authentication

curl -i --negotiate -L -u : "http://172.19.115.50:14000/webhdfs/v1/filetoread?op=OPEN"

It answers:

HTTP/1.1 403 Forbidden
Server: Apache-Coyote/1.1
Set-Cookie: hadoop.auth=; Path=/; Expires=Thu, 01-Jan-1970 00:00:00 GMT; HttpOnly
Content-Type: text/html;charset=utf-8
Content-Length: 1027
Date: Wed, 08 Oct 2014 06:55:51 GMT

<html><head><title>Apache Tomcat/6.0.37 - Error report</title><style><!--H1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;} H2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;} H3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;} BODY {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;} B {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;} P {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;}A {color : black;}A.name {color : black;}HR {color : #525D76;}--></style> </head><body><h1>HTTP Status 403 - Anonymous requests are disallowed</h1><HR size="1" noshade="noshade"><p><b>type</b> Status report</p><p><b>message</b> <u>Anonymous requests are disallowed</u></p><p><b>description</b> <u>Access to the specified resource has been forbidden.</u></p><HR size="1" noshade="noshade"><h3>Apache Tomcat/6.0.37</h3></body></html>

I could reproduce Hue's error message by adding a user.name parameter to the curl request:

curl --negotiate -i -L -u: "http://172.19.115.50:14000/webhdfs/v1/filetoread?op=OPEN&user.name=theuser"

It answers:

HTTP/1.1 500 Internal Server Error
Server: Apache-Coyote/1.1
Set-Cookie: hadoop.auth=u=theuser&p=theuser&t=simple&e=1412735529027&s=rQAfgMdExsQjx6N8cQ10JKWb2kM=; Path=/; Expires=Wed, 08-Oct-2014 02:32:09 GMT; HttpOnly
Content-Type: application/json
Transfer-Encoding: chunked
Date: Tue, 07 Oct 2014 16:32:09 GMT
Connection: close

{"RemoteException":{"message":"SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS]","exception":"AccessControlException","javaClassName":"org.apache.hadoop.security.AccessControlException"}}

It seems that there is no Kerberos negotiation between WebHDFS and curl.

I was expecting something like:

HTTP/1.1 401 Unauthorized
Content-Type: text/html; charset=utf-8
WWW-Authenticate: Negotiate
Content-Length: 0
Server: Jetty(6.1.26)
HTTP/1.1 307 TEMPORARY_REDIRECT
Content-Type: application/octet-stream
Expires: Thu, 01-Jan-1970 00:00:00 GMT
Set-Cookie: hadoop.auth="u=exampleuser&p=exampleuser@MYCOMPANY.COM&t=kerberos&e=1375144834763&s=iY52iRvjuuoZ5iYG8G5g12O2Vwo=";Path=/
Location: http://hadoopnamenode.mycompany.com:1006/webhdfs/v1/user/release/docexample/test.txt?op=OPEN&delegation=JAAHcmVsZWFzZQdyZWxlYXNlAIoBQCrfpdGKAUBO7CnRju3TbBSlID_osB658jfGfRpEt8-u9WHymRJXRUJIREZTIGRlbGVnYXRpb24SMTAuMjAuMTAwLjkxOjUwMDcw&offset=0
Content-Length: 0
Server: Jetty(6.1.26)
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 16
Server: Jetty(6.1.26)
A|1|2|3
B|4|5|6
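
One way to compare the responses above is to look at the `t=` field of the `hadoop.auth` cookie: the failing request carries `t=simple`, while the expected exchange carries `t=kerberos`. The following is only a minimal sketch; the helper name `parse_auth_cookie` is hypothetical, and it assumes the cookie value is a plain `&`-separated list of `key=value` pairs, as it appears in the outputs quoted here.

```python
def parse_auth_cookie(value):
    """Split a hadoop.auth cookie value into its key/value fields.

    Hypothetical helper: assumes the '&'-separated key=value layout
    seen in the curl outputs quoted in this question.
    """
    fields = {}
    for part in value.strip('"').split("&"):
        if not part:
            continue
        # Split on the first '=' only, so a trailing '=' in the
        # signature field is kept as part of the value.
        key, _, val = part.partition("=")
        fields[key] = val
    return fields


# Cookie returned by the request with user.name=theuser
simple = parse_auth_cookie(
    "u=theuser&p=theuser&t=simple&e=1412735529027&s=rQAfgMdExsQjx6N8cQ10JKWb2kM="
)
# Cookie from the response I was expecting
kerberos = parse_auth_cookie(
    '"u=exampleuser&p=exampleuser@MYCOMPANY.COM&t=kerberos'
    '&e=1375144834763&s=iY52iRvjuuoZ5iYG8G5g12O2Vwo="'
)

print(simple["t"], kerberos["t"])  # prints "simple kerberos"
```

A `t=simple` cookie suggests the server accepted the `user.name` parameter as pseudo authentication instead of negotiating Kerberos.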

Any idea what could have gone wrong?

I do have the following in hdfs-site.xml on every node:

<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>

<property>
  <name>dfs.web.authentication.kerberos.principal</name>
  <value>HTTP/_HOST@MY-REALM.COM</value>
</property>

<property>
  <name>dfs.web.authentication.kerberos.keytab</name>
  <value>/etc/hadoop/conf/HTTP.keytab</value> <!-- path to the HTTP keytab -->
</property>
Arnaud
  • Hi Arnaud, that's cool; I am trying to secure a Hadoop environment using Kerberos. Can you please describe in detail the process you chose? – Dinesh Kumar P Nov 24 '14 at 11:45
  • Well, that would be a long answer, too long for this forum. In a nutshell, you have to install a KDC somewhere, create Hadoop system superusers and credentials, and think about how to connect your KDC to your legacy enterprise LDAP system for user identification. I used Ambari plus some hand-made shell scripts to actually configure the Hadoop cluster; http://ambari.apache.org/current/installing-hadoop-using-ambari/content/ambari-kerb.html gives you details. – Arnaud Nov 25 '14 at 12:56
  • OK Arnaud, let me take a look; I am starting with a working Hadoop cluster on 2 Ubuntu machines and will begin with the Ambari installation. – Dinesh Kumar P Nov 26 '14 at 09:06

1 Answer

It looks like you are not accessing WebHDFS (default port 50070) but HttpFS (default port 14000), which is a "plain" webapp that is not secured the same way.

A WebHDFS URL usually looks like http://namenode:50070/webhdfs/v1; try setting that URL in hue.ini (WebHDFS is recommended over HttpFS).
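
As a quick sanity check, the port of the URL in use hints at which service it targets. This is a minimal sketch that relies only on the two default ports mentioned above; the function name `guess_service` is hypothetical, and a cluster that overrides the defaults will not match.

```python
from urllib.parse import urlsplit

# Default ports as stated above; a given cluster may use different ones.
DEFAULT_PORTS = {50070: "WebHDFS", 14000: "HttpFS"}


def guess_service(url):
    """Guess which REST endpoint a URL targets from its port alone."""
    port = urlsplit(url).port  # None when the URL has no explicit port
    return DEFAULT_PORTS.get(port, "unknown")


# URL used in the curl commands from the question
print(guess_service("http://172.19.115.50:14000/webhdfs/v1/filetoread?op=OPEN"))
# Typical WebHDFS URL
print(guess_service("http://namenode:50070/webhdfs/v1"))
```

Both services expose the same `/webhdfs/v1` path prefix, so the path alone does not tell them apart.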

BillLoot
  • However, I have an active/standby configuration, and I cannot tell for sure which node is the active NameNode. How should I handle that? – Arnaud Oct 08 '14 at 07:39
  • From the Cloudera doc: "Both WebHDFS and HttpFS use the HTTP REST API so they are fully interoperable, but Hue must be configured to use one or the other. For HDFS HA deployments, you must use HttpFS." (http://www.cloudera.com/content/cloudera/en/documentation/cdh4/v4-2-0/CDH4-Installation-Guide/cdh4ig_topic_15_4.html) – Arnaud Oct 08 '14 at 08:03