
I'm trying to connect to Hive (in the Hortonworks sandbox) and I'm receiving the message below:

Exception in thread "main" java.sql.SQLException: No suitable driver found for jdbc:hive2://sandbox.hortonworks.com:10000/default

Maven dependencies:

<dependencies>
    <dependency>                                                      
        <groupId>org.apache.spark</groupId>                       
        <artifactId>spark-core_2.10</artifactId>                  
        <version>${spark.version}</version>                       
        <scope>provided</scope>                                   
    </dependency>                                                     
    <dependency>                                                      
        <groupId>org.apache.spark</groupId>                       
        <artifactId>spark-sql_2.10</artifactId>                   
        <version>${spark.version}</version>                       
        <scope>provided</scope>                                   
    </dependency>                                                     
    <dependency>                                                      
        <groupId>org.apache.spark</groupId>                       
        <artifactId>spark-hive_2.10</artifactId>                  
        <version>${spark.version}</version>                       
        <scope>provided</scope>                                   
    </dependency>                                                     
</dependencies>  

Code:

    import org.apache.hadoop.fs.FileSystem
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.deploy.SparkHadoopUtil
    import org.apache.spark.sql.hive.HiveContext

    // **** setMaster("local") is only for testing ****
    // Set context
    val sparkConf = new SparkConf().setAppName("process").setMaster("local")
    val sc = new SparkContext(sparkConf)
    val hiveContext = new HiveContext(sc)

    // Set HDFS
    System.setProperty("HADOOP_USER_NAME", "hdfs")
    val hdfsconf = SparkHadoopUtil.get.newConfiguration(sc.getConf)
    hdfsconf.set("fs.defaultFS", "hdfs://sandbox.hortonworks.com:8020")
    val hdfs = FileSystem.get(hdfsconf)

    // Set Hive connector
    val url = "jdbc:hive2://sandbox.hortonworks.com:10000/default"
    val user = "username"
    val password = "password"

    hiveContext.read.format("jdbc").options(Map(
      "url" -> url,
      "user" -> user,
      "password" -> password,
      "dbtable" -> "tablename")).load()

1 Answer
You need to have the Hive JDBC driver on your application classpath:

<dependency>                                                      
    <groupId>org.apache.hive</groupId>                       
    <artifactId>hive-jdbc</artifactId>                  
    <version>1.2.1</version>                       
    <scope>provided</scope>                                   
</dependency> 

Also, specify the driver explicitly in the options:

"driver" -> "org.apache.hive.jdbc.HiveDriver"
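Put together, the options map from the question might look like this (a sketch reusing the question's placeholder url/user/password values, with only the `driver` entry added):

```scala
// Sketch: the same JDBC options as in the question, plus "driver" so that
// java.sql.DriverManager can resolve the jdbc:hive2 URL.
val jdbcOptions = Map(
  "url"      -> "jdbc:hive2://sandbox.hortonworks.com:10000/default",
  "driver"   -> "org.apache.hive.jdbc.HiveDriver", // explicit driver class
  "user"     -> "username",
  "password" -> "password",
  "dbtable"  -> "tablename")
```

Then pass it to the existing reader: `hiveContext.read.format("jdbc").options(jdbcOptions).load()`.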

However, it's better to skip JDBC and use Spark's native Hive integration, since it makes it possible to use the Hive metastore. See http://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables
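A minimal sketch of that native route (assuming a Spark build with Hive support, `hive-site.xml` on the classpath, `sc` being the question's SparkContext, and `default.tablename` as a placeholder table):

```scala
import org.apache.spark.sql.hive.HiveContext

// HiveContext reads table metadata straight from the Hive metastore,
// so no JDBC driver or jdbc:hive2 URL is involved.
val query = "SELECT * FROM default.tablename"
val hiveContext = new HiveContext(sc)
val df = hiveContext.sql(query)
df.show()
```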

Vitalii Kotliarenko
  • Good deal, I'm trying to use it. Any ideas how to connect to the sandbox from my local machine? I mean that I'm working on a local machine and I need to connect to the sandbox; if I just open a context, it assumes that I'm working in the sandbox. – Rodrigo Rondena May 10 '16 at 13:46
  • I suggest starting from the Hive config [hive-site.xml](https://github.com/apache/spark/blob/master/sql/hive/src/test/resources/data/conf/hive-site.xml), assuming you already have the Hadoop configuration in place. Also, make sure that your version of Spark is built with Hive support (the default binary is not). – Vitalii Kotliarenko May 10 '16 at 14:51