
I want to process password-protected zip files using Hadoop MapReduce. I was able to process unprotected zip files using ZipFileInputFormat, but it doesn't support password-protected zips. Is there a Java library that provides stream access to password-protected zip files, or that can extract a zip file if I make its byte content available? Thanks in advance.

Petros Koutsolampros
InfamousCoconut
  • If you take the core part of the question, see http://stackoverflow.com/questions/166340/write-a-password-protected-zip-file-in-java. (Please remove unwanted tags. How is MapReduce related to zip-with-password?) – Jayan Nov 02 '13 at 04:28
  • @Jayan, thanks for the link. I have referred to it before, and most of the libraries require a File object with the path of the zip file. I think a File object cannot be used in a MapReduce context. I was looking for libraries that can work if I make the InputStream or byte content of the zip available. – InfamousCoconut Nov 02 '13 at 16:34

1 Answer


Assuming you can find a Java library that can read password-protected zip files (see this blog article for an example), you should be able to modify the current ZipFileInputFormat to use this library. You will then just need to supply the password for each zip file via a configuration option (hopefully you don't have too many files, or all the files are protected with the same password).

It should be easy enough. Give it a try, and if you run into problems, post another question, or ask the author of the input format (https://github.com/cotdp/com-cotdp-hadoop is one possible implementation of ZipFileInputFormat I found via Google) whether they can roll the update for you.
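
If the library you settle on only accepts a File, one possible workaround is to spool the stream to a temporary file on the task's local disk first and hand that file to the library. The sketch below uses only the JDK; the class name ZipSpooler and the method spoolToTempFile are hypothetical names chosen for illustration, and it assumes the task has enough local scratch space to hold the whole archive. It does not decrypt anything itself, it only bridges an InputStream to a File.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: bridges an InputStream to a local File so that a
// File-only zip library can open it. Not part of Hadoop or any zip library.
final class ZipSpooler {

    private ZipSpooler() {
    }

    /**
     * Copies the whole stream into a fresh temporary file and returns it.
     * The caller is responsible for deleting the file when done.
     */
    static File spoolToTempFile(InputStream in) throws IOException {
        // createTempFile creates an empty file, so the copy must replace it.
        Path tmp = Files.createTempFile("spooled-", ".zip");
        try {
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            // Do not leave a partial file behind if the copy fails.
            Files.deleteIfExists(tmp);
            throw e;
        }
        return tmp.toFile();
    }
}
```

Inside a record reader you would call `ZipSpooler.spoolToTempFile(stream)` with the stream opened for the current split, pass the returned File and the configured password to the zip library, and delete the file in a `finally` block. The trade-off is an extra local copy per archive, which is usually acceptable when each zip is read whole anyway.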

Chris White
  • What I'm stuck on is that most of the libraries I found require a File object for extracting contents. I was looking for libraries that would work with an InputStream or the byte content of the zip. – InfamousCoconut Nov 03 '13 at 12:42