I am working on a map-reduce job consisting of multiple steps. With mrjob, every step receives the output of the previous step. The problem is that I don't want it to.
What I want is to extract some information in the first step and use it in the second step against the entire original input, and so on for later steps. Is it possible to do this using mrjob?
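To make the data flow concrete, here is a plain-Python sketch of what I am after. It is not mrjob code: the function names and the choice of the mean as the "extracted information" are made up purely for illustration.

```python
def extract_info(records):
    """Pass 1: reduce the whole input to one piece of information (here, the mean)."""
    return sum(records) / len(records)


def apply_info(records, info):
    """Pass 2: go over the ORIGINAL input again, using the result of pass 1."""
    return [(value, value - info) for value in records]


def run_two_passes(records):
    # Pass 2 receives the original records, not the output of pass 1.
    # This is the part that differs from chaining steps, where each step
    # only sees the previous step's output.
    info = extract_info(records)
    return apply_info(records, info)


if __name__ == "__main__":
    print(run_two_passes([1, 2, 3, 6]))
```

The key point is that the second pass needs two things at once: the small result of the first pass and the full original input.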
Note: Since I don't want to use EMR, this question is not much help to me.
UPDATE: If this is not possible in a single job, I will need to do it in two separate jobs. In that case, is there any way to wrap these two jobs and manage the intermediate outputs, etc.?