I think my question confused everyone, so let me make it a little clearer. I am trying to number my records in order. Say my data (a few records) looks like this:
0 1 2 3 4
1 3 8 9 2
2 8 7 9 7
My block size is 128 MB and the file size is 380 MB (3 blocks). I am trying to give each record an order number, like this:
1,0 1 2 3 4
2,1 3 8 9 2
3,2 8 7 9 7
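To make the intent concrete, here is a minimal plain-Java sketch of the numbering I am after. It does not use Hadoop at all; the class name `RecordNumbering` and the method `numberRecords` are hypothetical names made up for illustration, and it assumes all records are available in their original order in one place.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RecordNumbering {
    // Hypothetical helper: prefix each record with a 1-based
    // sequence number followed by a comma, keeping the input order.
    static List<String> numberRecords(List<String> records) {
        List<String> numbered = new ArrayList<>();
        long n = 1;
        for (String record : records) {
            numbered.add(n + "," + record);
            n++;
        }
        return numbered;
    }

    public static void main(String[] args) {
        // The three sample records from the question.
        List<String> records = Arrays.asList("0 1 2 3 4", "1 3 8 9 2", "2 8 7 9 7");
        for (String line : numberRecords(records)) {
            System.out.println(line);
        }
    }
}
```

The counter only gives the right result if a single pass sees every record; if the same loop ran separately over each of the 3 blocks, each pass would start again from 1.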
To assign the correct numbers I need all the data to go to a single map task; if I get 3 map tasks, my numbering won't be correct.
If I do that, I will get the whole data set as it is, right? The data that reaches my mapper class will not be changed in any way; it will be my original data, won't it?
When I set the number of mappers to 1 using
-D mapreduce.job.maps=1
or
conf.setInt("mapreduce.job.running.map.limit", 1);
my output still contains 3 part-m-000* files.
I am using Hadoop 2.6.0-cdh5.4.7 (the Cloudera distribution).
Am I doing anything wrong? Please advise.