There is a small amount of metadata that I obtain by looking up the current file the mapper is working on (and a few other things). I need to send this metadata over to the reducer. Sure, I can have the mapper embed it in the <Key, Value> pair it generates, as <Key, Value + Meta-Data>, but I want to avoid that.
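For concreteness, here is roughly the pattern I want to avoid (old mapred API; the class name and the tab-separated record layout are just for illustration):

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class TaggingMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {

    private String metadata;

    @Override
    public void configure(JobConf job) {
        // The per-file metadata: here simply the file this map task is reading.
        metadata = job.get("map.input.file");
    }

    @Override
    public void map(LongWritable offset, Text line,
                    OutputCollector<Text, Text> out, Reporter reporter) throws IOException {
        // Hypothetical record layout: key and value separated by a tab.
        String[] parts = line.toString().split("\t", 2);
        String key = parts[0];
        String value = parts.length > 1 ? parts[1] : "";
        // This is the pattern I want to avoid: tagging every emitted value
        // with the metadata just so the reducer can see it.
        out.collect(new Text(key), new Text(value + "\t" + metadata));
    }
}
```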
Also, to constrain myself a bit further, I do not want to use DistributedCache. Do I still have some options left? More precisely, my question is twofold:
(1) I tried setting some parameters by calling job.set(prop, value) in my mapper's configure(JobConf) and reading them back with job.get(prop) in my reducer's configure(JobConf). Sadly, I found that this does not work (a sketch of what I tried is shown below, after question (2)). As an aside, I am interested in knowing why it behaves this way. My main question is:
(2) How can I send the value from the mapper to the reducer in a "clean" way (if possible, within the constraints above)?
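To be concrete about (1), here is a stripped-down version of what I tried; the property name my.meta.data is just a placeholder:

```java
import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class ConfPassingAttempt {

    public static class MyMapper extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, Text> {

        @Override
        public void configure(JobConf job) {
            // Attempt: stash the metadata in the JobConf on the mapper side.
            job.set("my.meta.data", job.get("map.input.file"));
        }

        @Override
        public void map(LongWritable offset, Text line,
                        OutputCollector<Text, Text> out, Reporter reporter) throws IOException {
            out.collect(new Text("k"), line);
        }
    }

    public static class MyReducer extends MapReduceBase
            implements Reducer<Text, Text, Text, Text> {

        @Override
        public void configure(JobConf job) {
            // Attempt: read it back on the reducer side -- this comes back null,
            // i.e. the value set in the mapper's configure() never shows up here.
            String meta = job.get("my.meta.data");
            System.out.println("my.meta.data = " + meta);
        }

        @Override
        public void reduce(Text key, Iterator<Text> values,
                           OutputCollector<Text, Text> out, Reporter reporter) throws IOException {
            while (values.hasNext()) {
                out.collect(key, values.next());
            }
        }
    }
}
```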
EDIT (In view of response by Praveen Sripati)
To make it more concrete, here is what I want. Based on the type of data emitted, it should be stored under different files (say data d1 ends up in D1 and data d2 ends up in D2).
The values D1 and D2 can be read from a config file, and figuring out what goes where depends on the value of map.input.file. That is, the pair <k1, d1> should, after some processing, go to D1, and <k2, d2> should go to D2. I do not want to emit things like <k1, d1 + D1>. Can I somehow figure out the association without emitting D1 or D2, perhaps by cleverly using the config file? The input source (i.e., input directory) for <k1, d1> and <k2, d2> is the same, which, again, can be seen only through map.input.file.
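To illustrate, this is roughly the configure() I have in mind; the property names dest.d1 and dest.d2 are made up and simply stand for whatever keys my config file uses for D1 and D2:

```java
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;

public class DestinationAwareBase extends MapReduceBase {

    protected String inputFile; // the common input directory/file for <k1, d1> and <k2, d2>
    protected String d1Target;  // where d1-type records should end up (D1)
    protected String d2Target;  // where d2-type records should end up (D2)

    @Override
    public void configure(JobConf job) {
        inputFile = job.get("map.input.file"); // only meaningful on the map side
        d1Target  = job.get("dest.d1");        // hypothetical property read from my config file
        d2Target  = job.get("dest.d2");        // hypothetical property read from my config file
        // What I am after: <k1, d1> should eventually land under d1Target and
        // <k2, d2> under d2Target, without ever emitting <k1, d1 + D1> from map().
    }
}
```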
Please let me know when you get time.