Ok, I've been dealing with this problem for a couple of days and it's driving me nuts. I need to use the Hive database with transactions to perform 'update' and 'delete' operations.

I have installed Hadoop and Hive on my machine in pseudo-distributed mode. I have followed this tutorial for the installation. I'm using Java 1.8.0_31, Hadoop 2.6.0, Hive 1.0.0 and there were also a couple of details I changed, but these shouldn't be relevant.

Now, to start my environment (after a reboot, for example), I run the following:

start-dfs.sh
start-yarn.sh
java -jar /usr/local/derby/lib/derbyrun.jar server start &
hive

And everything seems to work fine. Although the tutorial doesn't mention starting Derby, if I don't start it, the metastore isn't available (which seems logical) and Hive doesn't start.

From here, I can create tables, show tables, connect with my JDBC client, and so on; everything works great. Now I need to enable transactions. Following this link and this link, I arrived at the following command:

hive --hiveconf hive.root.logger=info,console \
    --hiveconf hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager \
    --hiveconf hive.compactor.initiator.on=true \
    --hiveconf hive.compactor.worker.threads=1 \
    --hiveconf "hive.txn.driver=jdbc:derby://localhost:1527/metastore_db;create=true"

Sidenote: I'm changing the command and not hive-site.xml just because it's easier to change between commands when trying what works and what doesn't work instead of repeatedly changing the XML file.

I have also tried changing the driver URL to jdbc:derby://localhost:1527/metastore_db;create=true;user=APP;password=mine just in case it was needed, but there's no change. When I issue a command (like show tables), I get an error:

15/03/04 23:26:17 [main]: ERROR metastore.RetryingHMSHandler: 
    MetaException(message:Unable to select from transaction database, 
    java.sql.SQLSyntaxErrorException: Table/View 'TXNS' does not exist.

According to this and one of the previous links, it seems like the hive.in.test property must be set to true. So, my launch command becomes:

hive --hiveconf hive.root.logger=info,console \
    --hiveconf hive.in.test=true \
    --hiveconf hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager \
    --hiveconf hive.compactor.initiator.on=true \
    --hiveconf hive.compactor.worker.threads=1 \
    --hiveconf "hive.txn.driver=jdbc:derby://localhost:1527/metastore_db;create=true;"

With this command, I get a new error:

ERROR metastore.RetryingHMSHandler: java.lang.NullPointerException
    at org.apache.hadoop.hive.metastore.txn.TxnHandler.checkQFileTestHack(TxnHandler.java:1146)

And this error doesn't show up anywhere; I feel like I'm the only person on the internet with it. Anyway, because I couldn't find any solution, I dug into the source code:

private void checkQFileTestHack() {
  boolean hackOn = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_IN_TEST) ||
    HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_IN_TEZ_TEST);

  LOG.info("Before if");
  if (hackOn) {
      LOG.info("Hacking in canned values for transaction manager");
      // Set up the transaction/locking db in the derby metastore
      TxnDbUtil.setConfValues(conf);
      try {
          TxnDbUtil.prepDb();
      } catch (Exception e) {
          // We may have already created the tables and thus don't need to redo it.
          if (!e.getMessage().contains("already exists")) {
              throw new RuntimeException("Unable to set up transaction database for" +
                " testing: " + e.getMessage());
          }
      }
  }
}

Line 1146 is the if (!e.getMessage().contains("already exists")) line, which doesn't seem to make much sense unless "e" is null, which is strange. Anyway, I thought I could debug this further by adding a few more logging messages, building the project, and replacing the original metastore jar (which is where this TxnHandler class lives) with my modified one. For that, I downloaded the source code and followed this to build it. I tried maven2 and it didn't work, because some plug-in only worked with maven3, so I got maven3 from here and built the project.
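One possible reading of that stack trace, offered as an assumption rather than a confirmed diagnosis: inside a catch block "e" itself cannot be null, but e.getMessage() can be, because an exception created without a message returns null there. Calling contains() on that null would then raise the NullPointerException and hide the real failure from prepDb(). The class and method names below are made up for illustration and are not part of Hive:

```java
// Minimal sketch: an exception constructed without a message returns null from
// getMessage(), so calling a String method on the result throws a
// NullPointerException. NullMessageDemo is a hypothetical name, not Hive code.
class NullMessageDemo {

    // Mirrors the shape of the check in checkQFileTestHack(): unsafe when the
    // message is null.
    static boolean unsafeCheck(Exception e) {
        return e.getMessage().contains("already exists");
    }

    // Null-safe variant (a hypothetical fix, not the actual Hive source).
    static boolean safeCheck(Exception e) {
        String msg = e.getMessage();
        return msg != null && msg.contains("already exists");
    }

    public static void main(String[] args) {
        Exception noMessage = new RuntimeException(); // getMessage() is null here
        System.out.println("safeCheck: " + safeCheck(noMessage));
        try {
            unsafeCheck(noMessage);
        } catch (NullPointerException npe) {
            System.out.println("unsafeCheck threw NullPointerException");
        }
    }
}
```

If this is what happens, logging the caught exception itself (not just its message) before the check would show what prepDb() actually failed on.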

If I build it with the mvn clean install -Phadoop-2,dist command, not only does it take forever, but it fails during the test phase. Because it doesn't fail on the metastore (on the metastore, it skips 1 test; I'm not sure that's supposed to happen), I thought I could just build it without testing. So, we get to this:

mvn clean install -DskipTests -Phadoop-2,dist
rm /usr/local/hive/lib/hive-metastore-1.0.0.jar
cp packaging/target/apache-hive-1.0.0-bin/apache-hive-1.0.0-bin/lib/hive-metastore-1.0.0.jar /usr/local/hive/lib/

Sidenote: in the interest of time, I also tried the -pl metastore -am arguments, but while maven says that metastore has been built, the jar in the lib folder does not change, so I'm guessing I'm doing something wrong.

Anyway, this should build my modified jar, replace the one in Hive and, when I start Hive again, it should load mine. However, even after I change the code, the error stays the same: my new logging messages don't appear, and even the line number in the error is unchanged. It's like I changed nothing in my new jar.

It's strange. I know maven is compiling my code because it reports compile errors, and I can see in the jar properties that it's a new file, so why don't the rest of my changes show up? Hive notices when I delete the original jar, but when I replace it with my modified version, it's like I changed nothing.
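One way to narrow this down is to ask the JVM which file a class was actually loaded from. This is only a diagnostic sketch: it assumes, without confirmation from the post, that a second copy of the class may sit in another jar on the classpath and win over the rebuilt one. WhichJar is a hypothetical helper name:

```java
import java.security.CodeSource;

// Diagnostic sketch: print where a class was loaded from. WhichJar is a
// hypothetical helper, not part of Hive.
class WhichJar {

    // Returns the location a class was loaded from, or a note when the JVM
    // reports no code source (as it does for core JDK classes).
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(no code source reported)";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        String name = args.length > 0 ? args[0] : "WhichJar";
        // initialize=false: load the class without running its static initializers.
        Class<?> cls = Class.forName(name, false, WhichJar.class.getClassLoader());
        System.out.println(name + " -> " + locationOf(cls));
    }
}
```

Run with the Hive lib directory on the classpath, for example java -cp "/usr/local/hive/lib/*:." WhichJar org.apache.hadoop.hive.metastore.txn.TxnHandler. If the printed location is not hive-metastore-1.0.0.jar, that would explain why the rebuilt jar has no visible effect.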

Anyway, as you can see, I've had many troubles and I've tried to fix most of them. But now I'm stuck on this one, unable to use a damn "delete" command because I can't enable transactions. Can anyone point me in the right direction? Tyvm!

... and sorry for the long post.

BlueMoon93

1 Answer

I followed Srinivas' advice and the error disappeared. I no longer need the "hive.in.test" property set to "true" and everything works fine.

I still don't know why changing the source wouldn't affect the rest of the Hive program, but I have transactions working.

Edit: in case the link goes down, here's a quote:

After extracting Hive version, you have to create Hive meta store

sudo apt-get install mysql-server
sudo service mysql start
sudo apt-get install libmysql-java
ln -s /usr/share/java/libmysql-java.jar /usr/lib/hive/lib/libmysql-java.jar
sudo chkconfig mysql on

mysql -u root -p
Enter password:
mysql> CREATE DATABASE metastore;
mysql> USE metastore;
mysql> SOURCE /usr/lib/hive/scripts/metastore/upgrade/mysql/hive-schema-0.12.0.mysql.sql;

mysql> CREATE USER 'hive'@'metastorehost' IDENTIFIED BY 'mypassword';
...
mysql> REVOKE ALL PRIVILEGES, GRANT OPTION FROM 'hive'@'metastorehost';
mysql> GRANT SELECT,INSERT,UPDATE,DELETE,LOCK TABLES,EXECUTE ON metastore.* TO 'hive'@'metastorehost';
mysql> FLUSH PRIVILEGES;
mysql> quit;

Then in hive-site.xml, you need set the new parameters like

javax.jdo.option.ConnectionURL - jdbc:mysql://myhost/metastore
javax.jdo.option.ConnectionDriverName - com.mysql.jdbc.Driver
javax.jdo.option.ConnectionUserName - hive
javax.jdo.option.ConnectionPassword - mypassword
datanucleus.autoCreateSchema - false
datanucleus.fixedDatastore - true
datanucleus.autoStartMechanism - SchemaTable
hive.metastore.uris - thrift://<n.n.n.n>:9083

hive.support.concurrency - true
hive.enforce.bucketing - true
hive.exec.dynamic.partition.mode - nonstrict
hive.txn.manager - org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
hive.compactor.initiator.on - true
hive.compactor.worker.threads - 1
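The quote lists the settings as name/value pairs, while hive-site.xml expects each one wrapped in a property element. As a rough sketch, the snippet below renders the six transaction-related settings from the list above in that form. HiveSiteSketch is a hypothetical helper name; the setting names and values come from the quote:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: render the transaction-related settings quoted above as
// hive-site.xml <property> elements. HiveSiteSketch is a hypothetical name.
class HiveSiteSketch {

    // The six transaction settings from the quoted list, in the same order.
    static Map<String, String> txnSettings() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("hive.support.concurrency", "true");
        m.put("hive.enforce.bucketing", "true");
        m.put("hive.exec.dynamic.partition.mode", "nonstrict");
        m.put("hive.txn.manager", "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager");
        m.put("hive.compactor.initiator.on", "true");
        m.put("hive.compactor.worker.threads", "1");
        return m;
    }

    // No XML escaping is done; the values above contain no special characters.
    static String toXml(Map<String, String> props) {
        StringBuilder sb = new StringBuilder("<configuration>\n");
        for (Map.Entry<String, String> e : props.entrySet()) {
            sb.append("  <property>\n")
              .append("    <name>").append(e.getKey()).append("</name>\n")
              .append("    <value>").append(e.getValue()).append("</value>\n")
              .append("  </property>\n");
        }
        return sb.append("</configuration>\n").toString();
    }

    public static void main(String[] args) {
        System.out.print(toXml(txnSettings()));
    }
}
```

The metastore connection settings (javax.jdo.option.* and the others) would go into the same file in the same property form.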

Then restart Hive-server and Metastore. Now create one normal table and one external table with orc format and load from normal to orc table. Now you can update and delete records.
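For readers who want to try that last step, here is a rough sketch of the statements involved, wrapped in plain JDBC. Several things are assumptions and not from the quote: the table and column names are invented, and the ORC table is declared as a managed, bucketed table with 'transactional'='true', which is what the Hive transactions documentation asks for (the quote speaks of an external table, so adjust as needed). Executing the statements requires the Hive JDBC driver on the classpath and a running HiveServer2; without arguments the program only prints them:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

// Sketch only: table and column names are hypothetical.
class AcidTableSketch {

    static List<String> statements() {
        return Arrays.asList(
            // Plain staging table holding the rows to load.
            "CREATE TABLE staging (id INT, name STRING)",
            // Bucketed ORC table marked transactional (assumed requirement).
            "CREATE TABLE people_orc (id INT, name STRING) "
                + "CLUSTERED BY (id) INTO 2 BUCKETS STORED AS ORC "
                + "TBLPROPERTIES ('transactional'='true')",
            // Load from the normal table into the ORC table.
            "INSERT INTO TABLE people_orc SELECT id, name FROM staging",
            "UPDATE people_orc SET name = 'changed' WHERE id = 1",
            "DELETE FROM people_orc WHERE id = 2");
    }

    // Executes every statement in order on an already-open connection.
    static void run(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement()) {
            for (String sql : statements()) {
                st.execute(sql);
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length == 0) {
            // No JDBC URL given: just print the statements.
            for (String sql : statements()) {
                System.out.println(sql + ";");
            }
            return;
        }
        // args[0] is a JDBC URL, e.g. jdbc:hive2://localhost:10000/default
        try (Connection conn = DriverManager.getConnection(args[0])) {
            run(conn);
        }
    }
}
```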

BlueMoon93
  • Can you please tell me what you actually did? I am getting **Lock Exception: Error communicating with the metastore**. – Kumar Jun 18 '15 at 10:25
  • Follow the link I put up ( http://mail-archives.apache.org/mod_mbox/hive-user/201503.mbox/%3CCAM52XzLhK5_qOXLBUL-b42L1RA2LrnLXAqM4gxyUrQCCKSwW4w%40mail.gmail.com%3E ). It shows how to create the hive metastore and configure hive. – BlueMoon93 Jun 22 '15 at 16:39