I am looking for documentation or a description of the approach for this; please help. I have Hadoop 2.2.0 from Hortonworks installed, with some existing Hive tables I need to query. Hive SQL is extremely and unreasonably slow, both on a single node and on a cluster. I hope Shark will be faster.

From the Spark/Shark docs I cannot figure out how to make Shark work with existing Hive tables. Any ideas on how to achieve this? Thanks!

Jigar Parekh
DarqMoth

1 Answer

You need to configure the Hive metastore within the Shark-specific Hive directory. Details are provided in a similar question I answered here.

In summary, you will need to copy hive-default.xml to hive-site.xml, then ensure the metastore properties are set.

Here is the basic info in hive-site.xml:

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://myhost/metastore</value>
  <description>the URL of the MySQL database</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>mypassword</value>
</property>
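
For context, here is a minimal sketch of how such properties sit inside a complete hive-site.xml, assuming the standard Hadoop-style layout in which every `<property>` is nested in a single `<configuration>` root element. The host name, database name, and credentials are the same placeholders used above, not real values, and should match whatever your existing Hive installation already uses.

```xml
<?xml version="1.0"?>
<!-- Sketch only: placeholder values, adjust to your existing Hive metastore -->
<configuration>

  <!-- JDBC URL of the database backing the existing Hive metastore -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://myhost/metastore</value>
  </property>

  <!-- The remaining javax.jdo.option.* properties (driver name,
       user name, password) go here as sibling <property> elements -->

</configuration>
```

If Shark reads the same metastore database as your existing Hive installation, the tables already defined in Hive should be visible from Shark as well.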

You can get more details here: configuring hive metastore

Community
WestCoastProjects