
I'm working on a set of EC2 machines running a standalone Spark cluster, Hive, and Apache Ranger. Hive is integrated with Ranger.

Since Ranger doesn't support the Spark SQL JDBC endpoint (port 10015), I tried the open source project https://github.com/yaooqinn/spark-authorizer for Spark authorization. It didn't work, apparently because it relies on the YARN resource manager.

Is there any way to achieve authorization on Spark SQL with Apache Ranger?

We are not using a vendor distribution, so features like SPARK-LLAP in Hortonworks are not an option.

I have already tried what is explained in http://mail-archives.apache.org/mod_mbox/ranger-user/201601.mbox/%3CCAC1CY9P7iek6U6VDwLEXvLdCNRTcJzk5UWg3sei1MuUMCGrtWA@mail.gmail.com%3E, but that didn't work either.

I raised a Spark JIRA for this last year, but it doesn't seem to have been picked up yet: https://issues.apache.org/jira/browse/SPARK-24503

We are using Spark 2.3, Hive 2.3, Ranger 1.0.

Marsi

1 Answer


Build a simple custom authentication class in Java for the Spark SQL JDBC endpoint on port 10015.

package hive.test;

import java.util.HashMap;
import java.util.Map;
import javax.security.sasl.AuthenticationException;
import org.apache.hive.service.auth.PasswdAuthenticationProvider;

/*
 Build and install (adjust the hive-service jar name to your installation):

 javac -cp $HIVE_HOME/lib/hive-service-0.11-mapr.jar SampleAuthenticator.java -d .
 jar cf sampleauth.jar hive
 cp sampleauth.jar $HIVE_HOME/lib/.
*/

public class SampleAuthenticator implements PasswdAuthenticationProvider {

  // Hard-coded demo credentials (user -> password); not suitable for production use.
  private final Map<String, String> store = new HashMap<String, String>();

  public SampleAuthenticator() {
    store.put("user1", "passwd1");
    store.put("user2", "passwd2");
  }

  // The capitalized method name is required by the PasswdAuthenticationProvider interface.
  @Override
  public void Authenticate(String user, String password)
      throws AuthenticationException {

    String storedPasswd = store.get(user);

    // Accept the login only if the user exists and the password matches exactly.
    if (storedPasswd != null && storedPasswd.equals(password)) {
      return;
    }

    throw new AuthenticationException("SampleAuthenticator: Error validating user");
  }

}

Configure the following properties in the hive-site.xml file on each node where HiveServer2 is installed:

hive.server2.authentication: CUSTOM
hive.server2.custom.authentication.class: The authentication class name.

<property>
<name>hive.server2.authentication</name>
<value>CUSTOM</value>
</property>

<property>
<name>hive.server2.custom.authentication.class</name>
<value>hive.test.SampleAuthenticator</value>
</property>

Then restart HiveServer2 to apply the changes.

Reference: https://mapr.com/docs/52/Hive/HiveServer2-CustomAuth.html
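
Once the server has been restarted, the custom authenticator can be checked from a JDBC client. The following is only a minimal sketch, not part of the MapR reference: the class name JdbcAuthCheck, the host localhost and the database default are assumptions for illustration, and it assumes the hive-jdbc driver jar is on the client classpath. It tries one login that should be accepted and one that should be rejected.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper for illustration; not part of Hive or Spark.
class JdbcAuthCheck {

  // Builds a HiveServer2-style JDBC URL (the scheme used by the Hive JDBC driver).
  static String buildUrl(String host, int port, String database) {
    return "jdbc:hive2://" + host + ":" + port + "/" + database;
  }

  // Returns true if a connection can be opened with the given credentials
  // and a trivial query returns a row; returns false on any SQLException
  // (rejected login, unreachable server, or missing driver).
  static boolean canConnect(String url, String user, String password) {
    try (Connection conn = DriverManager.getConnection(url, user, password);
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery("SELECT 1")) {
      return rs.next();
    } catch (SQLException e) {
      System.err.println("Connection rejected or failed: " + e.getMessage());
      return false;
    }
  }

  public static void main(String[] args) {
    // Assumed host and database; 10015 is the port mentioned in the question.
    String url = buildUrl("localhost", 10015, "default");
    System.out.println("user1/passwd1 -> " + canConnect(url, "user1", "passwd1"));
    System.out.println("user1/wrong   -> " + canConnect(url, "user1", "wrong"));
  }
}
```

If the authenticator is active, the first attempt should succeed and the second should fail. A failure on both usually points to a connectivity or classpath problem rather than to the authenticator itself.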

Marsi
Authentication and authorization are not the same thing. What you show is basic authentication, which does not cover access control needs. – Sanan Guliyev Oct 27 '21 at 16:09