2

I have a text input file delimited by line breaks. In each mapper, I need to read the line that follows the current key/value as well. For example, given this data:

L1

L2

L3

I need something like this:

L1

L2

and in the next mapper:

L2

L3

thanks in advance.

Masoud
  • 1
    Hi, SO is not a code-writing service. It's best if you have a go and then post the code you've written, with the specific issues you are experiencing – Tim Rutter Jan 26 '17 at 11:49
  • I did not request any code. It's best for you to read questions more carefully. – Masoud Jan 26 '17 at 12:00
  • 1
    Please do read the how-to-ask page: http://stackoverflow.com/help/how-to-ask You need to show some evidence that you've actually had a go at this and aren't just posting the problem without thinking about it. It's just courtesy. – Tim Rutter Jan 26 '17 at 14:11

2 Answers

2

In addition to a custom InputFormat, you can store the previous line in a field or a collection such as a Map, and access it on each subsequent map call.
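
A minimal sketch of that idea, written as plain Java so it runs without Hadoop on the classpath. The class name `PreviousLinePairer` and the tab-separated output are assumptions for illustration only. In a real job the `previousLine` field would live on your `Mapper` subclass and the pair would be written to the context instead of a list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a mapper: one instance sees the lines of one input split, in order.
final class PreviousLinePairer {
    // Line seen by the previous map() call; null before the first call.
    private String previousLine;
    // Collected output; a real mapper would write to its context instead.
    private final List<String> output = new ArrayList<>();

    // Mirrors a per-record map() call: emits "previous<TAB>current" once a previous line exists.
    void map(String currentLine) {
        if (previousLine != null) {
            output.add(previousLine + "\t" + currentLine);
        }
        previousLine = currentLine;
    }

    List<String> getOutput() {
        return output;
    }
}
```

Calling `map` with L1, L2 and L3 in turn yields the pairs (L1, L2) and (L2, L3). Note that each mapper instance only sees its own split, so a pair that straddles a split boundary would be missed with this approach.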


Ronak Patel
0

You need to write a custom InputFormat class that reads your file and splits it into two-line records. The standard TextInputFormat reads one line at a time and hands each line to the mapper as a separate record, and the map output is then passed to the sorter. So your file loses its line ordering very early in the process.

Here is more information about this.
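
As a rough sketch of the record-reading logic only, here is a plain-Java reader that produces overlapping two-line records, which is what the question asks for. The class `TwoLineRecordReader` and the helper `readAll` are hypothetical names; the method names `nextKeyValue` and `getCurrentValue` merely mirror those of `org.apache.hadoop.mapreduce.RecordReader`. Wiring this into an actual InputFormat, and handling split boundaries, is not shown.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader: each record is the current line plus the line after it.
final class TwoLineRecordReader {
    private final BufferedReader in;
    private String lookahead;     // first line of the next record, already read
    private String currentValue;  // the record returned by getCurrentValue()

    TwoLineRecordReader(BufferedReader in) throws IOException {
        this.in = in;
        this.lookahead = in.readLine();
    }

    // Advances to the next record; returns false once fewer than two lines remain.
    boolean nextKeyValue() throws IOException {
        if (lookahead == null) {
            return false;
        }
        String next = in.readLine();
        if (next == null) {
            lookahead = null;
            return false;
        }
        currentValue = lookahead + "\n" + next;
        lookahead = next; // the second line becomes the first line of the next record
        return true;
    }

    String getCurrentValue() {
        return currentValue;
    }

    // Convenience helper for trying the reader on an in-memory string.
    static List<String> readAll(String text) {
        try {
            TwoLineRecordReader reader = new TwoLineRecordReader(new BufferedReader(new StringReader(text)));
            List<String> records = new ArrayList<>();
            while (reader.nextKeyValue()) {
                records.add(reader.getCurrentValue());
            }
            return records;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For the three-line input from the question, `readAll` returns two records: L1 with L2, then L2 with L3.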

Vlad
  • Thanks Vlad. With your help I tried creating a custom InputFormat class and found this article: http://analyticspro.org/2012/08/01/wordcount-with-custom-record-reader-of-textinputformat/ . Here's another problem: when I use the code, it says "The type RecordReader cannot be the superclass of NLinesRecordReader; a superclass must be a class" – Masoud Jan 26 '17 at 13:03
  • @Masoud, I have the same requirement that you've described. Were you able to resolve the issues you mentioned? I'd appreciate any reference to a successful implementation of this scenario. – Bhanuka Withana Apr 26 '18 at 19:48
  • Dear @BhanukaWithana, I used the same method that Ronak proposed in the accepted answer. – Masoud Jun 02 '18 at 09:58