I need to perform some operations on my input data and write the result to HDFS using a MapReduce program. My input data looks like this:
abc
some data
some data
some data
def
other data
other data
other data
and it continues in the same way, where abc and def are the headers and the "some data" lines are tab-separated records.
My task is to remove the header lines and append each header to the records below it, like this:
some data abc
some data abc
some data abc
other data def
other data def
other data def
Each header is followed by 50 records.
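To make the goal concrete, here is a minimal sketch in plain Java (no Hadoop involved) of the transformation I am after. The names HeaderAppender and appendHeaders are made up for illustration, and the sketch assumes that all lines arrive in order in a single list, that every header is followed by exactly the same number of records, and that the header is appended after a tab. Note that the header itself counts as a line, so the period is the number of records plus one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: applies the desired
// transformation to an in-memory, ordered list of lines.
class HeaderAppender {

    // Assumes each header line is followed by exactly recordsPerHeader records.
    static List<String> appendHeaders(List<String> lines, int recordsPerHeader) {
        List<String> out = new ArrayList<>();
        String header = null;
        int counter = 0; // number of lines seen so far

        for (String line : lines) {
            if (counter % (recordsPerHeader + 1) == 0) {
                // Header line: remember it and emit nothing.
                header = line;
            } else {
                // Record line: append the current header (tab separator is an assumption).
                out.add(line + "\t" + header);
            }
            counter++;
        }
        return out;
    }
}
```

With my real data this would be called with recordsPerHeader set to 50. What I do not know is how to keep an equivalent counter across calls to the map function.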
I am using the default record reader, so it reads one line at a time.
My problem is: how do I know that the map function is being called for the nth time? Is there a counter I can use for that, so that I can append the header to the string, something like
if (counter % 50 ==0 )
*some code*
Or are static variables the only way?