Read S3 zip file and extract the content as separate file

Question

My Question is not related to reading S3 file in stream.

My question is more specific to reading zipfile(multiple files) and unzip & extract content with file name and save within S3 folder.

I am trying to read S3 zip file and extract the content as separate small individual file using Scala. I am able to achieve the functionality for smaller files using code below but larger files(above 10mb) unable extract data. Unable to extend existing Buffer size. please let me know if there is any alternative approach.

def pr_extract(object_Key :String, bucket_Name:String, folder:String,s3Client: com.amazonaws.services.s3.AmazonS3Client) : Unit =
   {
      val buffer = new Array[Byte](12582912)
      val awsCreds = new BasicAWSCredentials(access_Key, secret_Key)
      val s3Client = new AmazonS3Client(awsCreds)
      s3Client.setEndpoint("xxxxxxxxxxxxxx")
      val s3object = s3Client.getObject(new GetObjectRequest(bucket_Name, object_Key))
      val zis = new ZipInputStream(s3object.getObjectContent())
      var entry = zis.getNextEntry()
      while (entry != null) {
        val fileName = entry.getName
        val outputStream = new ByteArrayOutputStream()
        val len = zis.read(buffer)
        while (zis.read(buffer) > 0) {
          outputStream.write(buffer, 0, len)
        }
        val is = new ByteArrayInputStream(outputStream.toByteArray())
        val meta = new ObjectMetadata()
        meta.setContentLength(outputStream.size())
        meta.setContentType("application/csv")
        s3Client.putObject(bucket_Name, FilenameUtils.getFullPath(folder) + fileName, is, meta)
        is.close
        outputStream.close()
        entry = zis.getNextEntry()
      }
    }

Read S3 zip file and extract the content as separate file

0 Answers0