
I am trying to process a file. Its current contents are shown below.

Input file:

    c=1,2,3
    a,b,c,d,a
    d,e,f
    g,h,i,i
    c=2,3,4
    j,k,l
    m,n,a,h
    c=3,2,5
    d,g,a
    s,fs,a


Expected output:

    c=1,2,3,a,b,c,d,a
    c=1,2,3,d,e,f
    c=1,2,3,g,h,i,i
    c=2,3,4,j,k,l
    c=2,3,4,m,n,a,h
    c=3,2,5,d,g,a
    c=3,2,5,s,fs,a

Is there also a way to get the output in a grouped format like this?

    c=1,2,3,{(a,b,c,d,a),(d,e,f),(g,h,i,i)}
    c=2,3,4,{(j,k,l),(m,n,a,h)}
    c=3,2,5,{(d,g,a),(s,fs,a)}

Could someone help me? I am trying to solve this with Pig to get some practice, but I am nowhere close.

Thanks & Regards, Ankush Reddy

ankush reddy

1 Answer


I don't think this is possible with Pig. Pig processes data in parallel, so it cannot rely on the order of records in the file. I suggest pre-processing the file with a bash script or another tool before loading it into Pig.
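As one possible form of that pre-processing step, here is a minimal Python sketch that reads the file line by line in a single pass. It assumes that every header line starts with `c=` and that each record belongs to the nearest header above it. The function names `flatten` and `group_blocks` are made up for illustration and are not part of Pig or any library.

```python
from typing import Iterable, Iterator, List


def flatten(lines: Iterable[str], prefix: str = "c=") -> Iterator[str]:
    """Yield each record prefixed with the most recent header line.

    Assumption: a header is any line starting with `prefix`.
    Only the current header is kept in memory.
    """
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(prefix):
            header = line  # remember the header for the records that follow
        elif header is not None:
            yield header + "," + line


def _format_block(header: str, records: List[str]) -> str:
    """Render one block as header,{(rec1),(rec2),...}."""
    body = ",".join("(" + record + ")" for record in records)
    return header + ",{" + body + "}"


def group_blocks(lines: Iterable[str], prefix: str = "c=") -> Iterator[str]:
    """Yield one line per header with all of its records grouped together.

    Only one block is held in memory at a time. Headers with no records
    are skipped (a choice made for this sketch).
    """
    header = None
    records: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(prefix):
            if header is not None and records:
                yield _format_block(header, records)
            header, records = line, []
        elif header is not None:
            records.append(line)
    # emit the final block once the input is exhausted
    if header is not None and records:
        yield _format_block(header, records)
```

For example, `for row in flatten(open("input.txt")): print(row)` would print the first output format, where `input.txt` is a placeholder path. Because the file object is consumed lazily, memory use does not grow with file size, and the flattened result can then be loaded into Pig as ordinary rows.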

dltu
If the file is too big, we cannot process it with a bash script either; that would take hours. Any other suggestions, @Duc LT? Thank you. – ankush reddy Jul 07 '16 at 16:32