I have many XML files on HDFS, which I extracted from sequence files using a Java program.
Initially there were only a few files, so I copied the extracted XML files to my local machine and ran the Unix zip command to pack them into a single .zip file.
The number of XML files has now grown, and I can no longer copy them to my local machine because I would run out of memory.
What I need is to zip all of those XML files (on HDFS) into a single zip file (also on HDFS) without copying them to the local machine.
I couldn't find any lead to start from. Can anyone provide a starting point or any code (even Java MapReduce) that I can build on? I can see that this could be done with MapReduce, but I have never programmed in it, which is why I am looking at other approaches.
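To make the goal concrete, here is a rough sketch of the shape I have in mind, written with plain java.util.zip only. The class StreamZipper, the Opener interface and the method zipStreams are hypothetical names I made up for illustration. My assumption (untested on a cluster) is that each Opener could wrap Hadoop's FileSystem.open(Path) and that the target stream could come from FileSystem.create(Path), so the data would stream through the client without being stored on local disk.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of Hadoop: zips named input streams into one archive.
class StreamZipper {

    // Opens a source lazily, so only one input file is open at any time.
    interface Opener {
        InputStream open() throws IOException;
    }

    // Writes every source as one zip entry; the map key becomes the entry name.
    // Closing the ZipOutputStream also closes the target stream.
    static void zipStreams(Map<String, Opener> sources, OutputStream target) throws IOException {
        byte[] buffer = new byte[8192];
        try (ZipOutputStream zip = new ZipOutputStream(target)) {
            for (Map.Entry<String, Opener> source : sources.entrySet()) {
                zip.putNextEntry(new ZipEntry(source.getKey()));
                try (InputStream in = source.getValue().open()) {
                    int read;
                    // Copy in fixed-size chunks so a whole file is never held in memory.
                    while ((read = in.read(buffer)) != -1) {
                        zip.write(buffer, 0, read);
                    }
                }
                zip.closeEntry();
            }
        }
    }
}
```

This runs in a single process, so I don't know whether it would be fast enough for a large number of files, which is partly why I am asking whether MapReduce or some other approach is the better route.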
Thanks in advance.