
I want to use libhdfs for writing into and reading from HDFS. I have the release version Hadoop 2.5.0. What I am trying to do is compile and run the code they provide as a test. The code compiles fine; here is what I do:

gcc -I/usr/lib/jvm/java-7-openjdk-amd64/include test/test_libhdfs_ops.c -o test.o -lhdfs -L .
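For reference, the kind of write/read cycle that test exercises looks roughly like the sketch below (this is not the bundled test_libhdfs_ops.c itself; the file path, message, and connection settings are placeholders):

/* Minimal libhdfs write/read sketch. */
#include <hdfs.h>
#include <stdio.h>
#include <string.h>
#include <fcntl.h>

int main(void) {
    /* Connect to the NameNode configured in core-site.xml. */
    hdfsFS fs = hdfsConnect("default", 0);
    if (!fs) {
        fprintf(stderr, "hdfsConnect failed\n");
        return 1;
    }

    const char *path = "/tmp/libhdfs_test.txt";
    const char *msg = "hello from libhdfs\n";

    /* Write the buffer to a new file. */
    hdfsFile out = hdfsOpenFile(fs, path, O_WRONLY | O_CREAT, 0, 0, 0);
    if (!out) {
        fprintf(stderr, "failed to open %s for writing\n", path);
        return 1;
    }
    hdfsWrite(fs, out, msg, strlen(msg));
    hdfsFlush(fs, out);
    hdfsCloseFile(fs, out);

    /* Read it back. */
    char buf[256];
    hdfsFile in = hdfsOpenFile(fs, path, O_RDONLY, 0, 0, 0);
    tSize n = hdfsRead(fs, in, buf, sizeof(buf) - 1);
    buf[n > 0 ? n : 0] = '\0';
    printf("read back: %s", buf);
    hdfsCloseFile(fs, in);

    hdfsDisconnect(fs);
    return 0;
}

It is the JVM started by hdfsConnect (via JNI) that needs the Hadoop classes on the CLASSPATH, which is where the error below comes from.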

But whenever I try to run it, I get the following error:

unable to get stack trace for java.lang.NoClassDefFoundError exception: ExceptionUtils::getStackTrace error.

I realized that it is unable to find some jar files for the Java classes. I found similar issues here (Writing files in hdfs in C++ (libhdfs)) and here (Hadoop 2.0 JAR files) and tried to resolve them, but with no success. Here is what I have set as my CLASSPATH env variable:

CLASSPATH=$HADOOP_HOME/share/hadoop/common/:$HADOOP_HOME/share/hadoop/hdfs/:$HADOOP_HOME/share/hadoop/yarn/:$HADOOP_HOME/share/hadoop/mapreduce/:$HADOOP_HOME/share/hadoop/httpfs/:$HADOOP_HOME/share/hadoop/tools/

What am I missing here?


1 Answer


Well, I found out that the jar files were not getting picked up simply by specifying their root directory in the CLASSPATH. I had to explicitly add the paths of all the jar files in $HADOOP_HOME/share/hadoop/common/lib, and of the others present in the hdfs folder, to the CLASSPATH env variable.

I wrote a simple Python script to do that, and everything works. Here's how the script looks:

#!/usr/bin/python
import os

# Root directory that holds the Hadoop jar files.
PATH = 'path/to/your/jar/files/dir/'

# Walk the directory tree and collect every .jar file.
list_file = ''
for root, dirs, file_names in os.walk(PATH):
    for name in file_names:
        if name.endswith('.jar'):
            path = os.path.join(root, name)
            list_file += ':' + path

print list_file

# Append the collected jars to whatever is already in CLASSPATH.
os.environ['CLASSPATH'] = os.environ.get('CLASSPATH', '') + list_file

print os.environ['CLASSPATH']
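One thing worth noting: assignments to os.environ are only visible to the script's own process and anything it launches, so the printed value has to be exported in the shell (or the libhdfs test started from within the script) before the JVM will see it.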
