I'm currently working on a project that uses Hadoop (2.7.0). I have a two-node cluster configured and working (for the most part), and I can run mapper/reducer jobs manually without any problems. But when I try to start a job with hadoopy I get an error. After debugging, I see it originates from the following command, which hadoopy executes:

hadoop fs -mkdir _hadoopy_tmp

This yields the error:

mkdir: '_hadoopy_tmp': No such file or directory

When doing it manually, mkdir works fine if I start the directory name with a '/'. Without the leading '/', I get the same error as above. The same goes for the ls command (ls / gives me a result; ls . gives an error that there is no such file or directory). I'm guessing I screwed up the Hadoop configuration somewhere; I just can't figure out where.
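
To illustrate, roughly what I see in the terminal (output paraphrased from the behaviour described above):

hadoop fs -ls /          # lists the HDFS root without complaint
hadoop fs -ls .          # ls: `.': No such file or directory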

EDIT: To clarify: I'm aware that you should use the mkdir command with an absolute path (i.e., one starting with '/'). When interacting with Hadoop through the terminal, I do this. However, the hadoopy framework doesn't seem to (it throws the error shown above). My question is: is there a fix/workaround for this in hadoopy, or do I have to rewrite its source code?

Nick Otten

1 Answer

I don't understand what 'manually' means for you, but the errors you are seeing make perfect sense to me: if you want to create a directory in the Hadoop FS, you should give the exact path to it. There isn't a problem there, and you didn't screw anything up. I recommend doing it this way:

$HADOOP_HOME/bin/hdfs dfs -mkdir /name_of_new_folder/

PS: I don't know anything about hadoopy; I'm just speaking from my experience with Hadoop (and some things should be handled the same way in both, which is why I'm answering here; please correct me if I'm wrong).
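
If hadoopy insists on a relative path, one possible workaround (a guess from the Hadoop side, not hadoopy-specific): HDFS resolves relative paths against a home directory, /user/<username>, which is not created automatically. Creating it once should let relative paths like _hadoopy_tmp resolve, assuming the shell user below matches the user hadoopy runs as:

$HADOOP_HOME/bin/hdfs dfs -mkdir -p /user/$(whoami)
$HADOOP_HOME/bin/hdfs dfs -ls .

After that, the second command should list the new home directory instead of erroring out.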

chomp
  • By manually starting a job I mean using the streaming jar in the lib folder and giving it a Python -mapper and -reducer. Also, when I make folders and such using the command line, I give full paths. The problem, however, is that hadoopy is not doing that (bug?). Hadoopy is a sort of wrapper framework that you can use with Python to easily access HDFS and manage your batch jobs. Reading its source, it seems to make temp folders in HDFS to store certain things in, but when doing this it doesn't use exact paths (it uses the raw command as I posted above). – Nick Otten May 18 '15 at 17:37