I am trying to create a Hive sink in my Flume configuration, and when I run flume-ng I get the dependency error shown below. How can I resolve it? It looks like a runtime dependency is missing from the classpath. I have installed Hive properly and set the required environment variables, including HIVE_HOME. Any help is appreciated. Thanks.

2016-01-15 14:41:37,757 (conf-file-poller-0) [INFO - org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:42)] Creating instance of sink: hiveSink, type: hive
2016-01-15 14:41:37,763 (conf-file-poller-0) [ERROR - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:145)] Failed to start agent because dependencies were not found in classpath. Error follows.

java.lang.NoClassDefFoundError: org/apache/hive/hcatalog/streaming/RecordWriter
at org.apache.flume.sink.hive.HiveSink.createSerializer(HiveSink.java:223)
at org.apache.flume.sink.hive.HiveSink.configure(HiveSink.java:203)
at org.apache.flume.conf.Configurables.configure(Configurables.java:41)
at org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:413)
at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:98)
at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: org.apache.hive.hcatalog.streaming.RecordWriter
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 13 more
F. Aydemir

2 Answers

I am learning Flume on the CDH 5.7 distribution and faced the same issue. The flume-env.sh script did not seem to work for me, so I used the --classpath command-line argument to reference all the libraries related to hive and hive-hcatalog. I arrived at the right set of lib folders by trial and error, since I could not find much documentation on this argument.

For example:

flume-ng agent --conf /home/cloudera/flume/ --conf-file /home/cloudera/flume/netcat_memchannel_hivesink.conf --name agent1 --classpath "/usr/lib/hive-hcatalog/share/hcatalog/*":"/usr/lib/hive/lib/*"
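If the list of lib folders grows, the classpath string may be easier to maintain when it is assembled from a list of directories. The following is only a sketch: `build_classpath` is a hypothetical helper name, and the two directories are the CDH paths from the command above, which may differ on other installs.

```shell
#!/bin/sh
# Hypothetical helper: join library directories into a Java classpath
# made of wildcard entries ("<dir>/*" matches every jar in <dir>).
build_classpath() {
    result=""
    for dir in "$@"; do
        # Strip one trailing slash, append "/*", and separate entries with ":"
        result="${result:+$result:}${dir%/}/*"
    done
    printf '%s\n' "$result"
}

# Assumed CDH locations, taken from the command above
HIVE_CP=$(build_classpath /usr/lib/hive-hcatalog/share/hcatalog /usr/lib/hive/lib)
printf 'classpath: %s\n' "$HIVE_CP"
```

The value can then be passed as --classpath "$HIVE_CP" in place of the two quoted paths. Keeping the variable quoted matters, because otherwise the shell would expand the wildcards before the JVM sees them.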

Dhirendra Khanka

You should set both the HIVE_HOME and HCAT_HOME environment variables, either in flume-env.sh or in the profile of the user that runs Flume.
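As a rough sketch, the lines below could go into flume-env.sh or the user's profile. The two default paths are assumptions based on a typical CDH layout and should be adjusted to the actual install. The directory check is an addition for illustration only: it assumes the Hive jars live under $HIVE_HOME/lib and the HCatalog jars under $HCAT_HOME/share/hcatalog.

```shell
# Keep any value already set in the environment; otherwise fall back to
# assumed CDH-style locations (adjust these to your install).
export HIVE_HOME="${HIVE_HOME:-/usr/lib/hive}"
export HCAT_HOME="${HCAT_HOME:-/usr/lib/hive-hcatalog}"

# Warn early if the directories expected to hold the jars are missing.
for dir in "$HIVE_HOME/lib" "$HCAT_HOME/share/hcatalog"; do
    [ -d "$dir" ] || echo "warning: $dir not found" >&2
done
```

If a warning is printed, the NoClassDefFoundError for org/apache/hive/hcatalog/streaming/RecordWriter is likely to persist, since the HCatalog streaming classes would still be missing from the classpath.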

Dilip Kari