I've got a method that is supposed to append lines, on each call, to an already existing CSV file in HDFS. I'm using CDH 5.0.
My code looks like this:
else if (Config.getStorageMethod().equals("hdfs")) {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://10.9.0.108:8020");
    conf.set("hadoop.job.ugi", "hdfs");
    conf.set("dfs.support.append", "true");
    conf.set("hadoop.home.dir", "/usr/lib/hadoop");

    // Append one CSV line to the existing file on HDFS
    Path pt = new Path("/user/logger/test.csv");
    FileSystem fs = FileSystem.get(conf);
    FSDataOutputStream fsout = fs.append(pt);
    writer = new PrintWriter(fsout);
    writer.append(sl.getIp());
    writer.append(',');
    writer.append(sl.getTerm());
    writer.append(',');
    writer.append(sl.getFrom());
    writer.append('\n');
    writer.flush();
    writer.close();
}
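For reference (and not related to the error below), this is roughly how I could rewrite the write itself with try-with-resources so the stream is always closed even if fs.append() throws; Config, sl, and the paths are the same as in my code above, and this assumes Java 7+ (untested sketch):

    import java.io.PrintWriter;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://10.9.0.108:8020");
    conf.set("dfs.support.append", "true");

    Path pt = new Path("/user/logger/test.csv");
    FileSystem fs = FileSystem.get(conf);
    // try-with-resources closes the PrintWriter (and the underlying FSDataOutputStream) automatically
    try (PrintWriter writer = new PrintWriter(fs.append(pt))) {
        writer.append(sl.getIp()).append(',')
              .append(sl.getTerm()).append(',')
              .append(sl.getFrom()).append('\n');
    }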
The client is remote, running on a web server that does not have Hadoop installed. I am getting the following exception:
2014-05-28/12:53:11.976 WARN: util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2014-05-28/12:53:11.977 DEBUG: security.JniBasedUnixGroupsMappingWithFallback - Falling back to shell based
2014-05-28/12:53:11.977 DEBUG: security.JniBasedUnixGroupsMappingWithFallback - Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
2014-05-28/12:53:11.983 DEBUG: util.Shell - Failed to detect a valid hadoop home directory
java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set.
at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:265)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:290)
at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)
at org.apache.hadoop.security.Groups.<init>(Groups.java:77)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:240)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:232)
at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:718)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:703)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:605)
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2554)
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2546)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2412)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:167)
at CSVWriter.generateCsvFile(CSVWriter.java:23)
at LoggerExecutor.run(LoggerExecutor.java:13)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
2014-05-28/12:53:11.986 ERROR: util.Shell - Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:326)
at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)
at org.apache.hadoop.security.Groups.<init>(Groups.java:77)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:240)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:232)
at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:718)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:703)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:605)
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2554)
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2546)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2412)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:167)
at CSVWriter.generateCsvFile(CSVWriter.java:23)
at LoggerExecutor.run(LoggerExecutor.java:13)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
I get the feeling that it is looking for the Hadoop home environment variable on my local client (where it doesn't exist), while it should only need it, if at all, on the remote cluster.
Any ideas?
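In case it helps: from the stack trace it looks like the check happens in org.apache.hadoop.util.Shell on the client machine, and the message "HADOOP_HOME or hadoop.home.dir are not set" suggests it reads hadoop.home.dir as a JVM system property rather than from the Configuration (where I set it). This is an untested sketch of what I'm considering, assuming that pointing the system property at a dummy local directory (with a bin\winutils.exe stub, since my client seems to be hitting the Windows code path) is enough to quiet the lookup; "C:\\hadoop-dummy" is just a hypothetical path:

    // Assumption: a local dummy directory containing bin\winutils.exe is enough to
    // satisfy the client-side Hadoop home check; the HDFS cluster is still reached over RPC.
    System.setProperty("hadoop.home.dir", "C:\\hadoop-dummy");   // hypothetical local path

    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://10.9.0.108:8020");
    FileSystem fs = FileSystem.get(conf);

Is that the right direction, or is there a cleaner way to run a remote HDFS client without a local Hadoop install?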