I have a 4 GB file with about 16 million lines. The map phase runs distributed: 15 map tasks in total, 6 of them in parallel. The job generates 35,000 keys. I am using MultipleTextOutputFormat, so each reducer writes its output independently of the other reducers.
I have configured the job with anywhere from 25 to 50 reducers, but it always runs only 1 reducer at a time.
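For reference, the reducer count is requested on the job configuration roughly as follows. This is a minimal sketch, assuming the old `org.apache.hadoop.mapred` API and the Hadoop client libraries on the classpath; the class name `ReducerCountSketch` and the value 25 are placeholders for illustration, not the exact code of the job.

```java
import org.apache.hadoop.mapred.JobConf;

// Hypothetical driver fragment; needs the Hadoop client libraries on the classpath.
public class ReducerCountSketch {
    public static void main(String[] args) {
        JobConf conf = new JobConf(ReducerCountSketch.class);

        // Request 25 reduce tasks for the job. This sets the total number of
        // reduce tasks (mapred.reduce.tasks, named mapreduce.job.reduces in Hadoop 2).
        conf.setNumReduceTasks(25);

        // Read the value back to confirm what the job configuration holds.
        System.out.println("Requested reducers: " + conf.getNumReduceTasks());
    }
}
```

Note that this setting controls how many reduce tasks the job has in total, not how many of them run at the same time.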
Machine: a single machine with 4 cores and 32 GB RAM, running the Hortonworks stack.
How do I get more than one reduce task to run in parallel?