I set the number of reducers to 0, so only the map phase runs. Let's say 10 nodes are executing the map tasks. I understand that 10 mappers produce 10 files in HDFS. But how many files are produced as the final output of the job?
... You answered your own question, no? – OneCricketeer Mar 24 '18 at 13:11
No. Say reducers are enabled: regardless of how many reducers perform the aggregation, the result will be one file. So my question is: how many files are produced without reducers? – Sats Mar 25 '18 at 14:41
But reducers aren't enabled... You said you set it to zero. You also said 10 mappers make 10 files, which is correct, so what is your actual question? – OneCricketeer Mar 25 '18 at 19:22
1 Answer
A job without reducers is a map-only job, and every mapper produces its own output file.
Please check this answer out: `
When you have a map-only job, there is no shuffling at all, which means that the mappers write the final output directly to HDFS.
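
To make the file count concrete, here is a small Python sketch. It is a model of the output directory, not Hadoop code, and the function name `output_files` is hypothetical. It assumes the default file-based output format of the newer `mapreduce` API, where each map task writes `part-m-NNNNN` and each reduce task writes `part-r-NNNNN`, and it ignores the `_SUCCESS` marker file. Note that the number of map tasks normally follows the number of input splits rather than the number of nodes.

```python
def output_files(num_map_tasks, num_reduce_tasks):
    """Model the part files a job leaves in its output directory.

    Assumption: default file-based output, one file per task that
    writes the final output (map tasks if there are no reducers,
    reduce tasks otherwise).
    """
    if num_map_tasks < 0 or num_reduce_tasks < 0:
        raise ValueError("task counts must be non-negative")

    if num_reduce_tasks == 0:
        # Map-only job: no shuffle, each mapper writes its own file.
        return [f"part-m-{i:05d}" for i in range(num_map_tasks)]

    # With reducers: each reducer writes one file; map output is
    # intermediate data and does not appear in the output directory.
    return [f"part-r-{i:05d}" for i in range(num_reduce_tasks)]


if __name__ == "__main__":
    print(len(output_files(10, 0)))  # map-only job
    print(len(output_files(10, 5)))  # five reducers
    print(output_files(10, 1))       # single reducer
```

Under these assumptions, 10 map tasks with 0 reducers leave 10 files, 5 reducers leave 5 files, and only a single reducer leaves exactly one file.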

dbustosp
@Sathish Yes. Unless you have 1 reducer; that way you will have to check only one file. However, that reducer could become a bottleneck. – dbustosp Mar 25 '18 at 15:18
Agreed. So if 5 reducers are enabled, the end result will be one file, won't it? – Sats Mar 26 '18 at 15:45