I was wondering if there was any way to turn off shuffling/sorting in the Map phase of a job? My job doesn't require a Reduce phase so I don't need the shuffle and sort.

I'm using Hadoop version 2.2.0.

Thanks

Mo.

1 Answer

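You can set the number of reduce tasks to 0 with ``setNumReduceTasks(0)``. The job then runs only the map phase, with no shuffle or sort, and each mapper's output is written directly to the output path.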
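A minimal sketch of a map-only driver, in case it helps. The class names `MapOnlyDriver` and `PassThroughMapper`, the job name, and taking the input and output paths from `args` are all assumptions made for illustration; only the `setNumReduceTasks(0)` call is the part that matters here.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MapOnlyDriver {

    // Hypothetical mapper: emits each input line unchanged.
    public static class PassThroughMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {
        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            context.write(NullWritable.get(), line);
        }
    }

    // Builds the job without submitting it, so the settings can be inspected.
    public static Job buildJob(Configuration conf, Path input, Path output)
            throws IOException {
        Job job = Job.getInstance(conf, "map-only example");
        job.setJarByClass(MapOnlyDriver.class);
        job.setMapperClass(PassThroughMapper.class);

        // Zero reducers: no shuffle, no sort; mapper output goes
        // straight to the output format.
        job.setNumReduceTasks(0);

        // With no reduce phase, these describe the mapper's output types.
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, input);
        FileOutputFormat.setOutputPath(job, output);
        return job;
    }

    public static void main(String[] args) throws Exception {
        // Assumes args[0] is the input path and args[1] the output path.
        Job job = buildJob(new Configuration(), new Path(args[0]), new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

With zero reducers the output files are named `part-m-*` rather than `part-r-*`, one per map task.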

Ajay Gupta
  • Thanks. Just to help other people coming to this post: you set the number of reduce tasks to zero by calling ``yourJob.setNumReduceTasks(0);`` in the main method of your MR job. – Mo. Jul 15 '14 at 18:51
  • In effect, does this mean that each job will only access local data? I'm trying to answer [this question](http://stackoverflow.com/q/31789176/320399). – blong Aug 03 '15 at 15:06