
We have lots of files in S3 (>1B), and I'd like to compress them to reduce storage costs. What would be a simple and efficient way to do this?

Thank you

Alex

AlexV
  • S3 Batch is probably what you're looking for. – Dunedan Feb 21 '21 at 14:55
  • Is your aim simply to reduce costs? Are you familiar with [Object Storage Classes – Amazon S3](https://aws.amazon.com/s3/storage-classes/)? You can reduce the cost of storage, but it is a trade-off with durability and access speed. For example, the Glacier Deep Archive storage class can reduce storage costs by 95%, but data is not immediately accessible. This would be a LOT simpler than compressing files. Can you tell us more about how these files are used? – John Rotenstein Feb 21 '21 at 21:43
  • @JohnRotenstein Yes, storage costs are the main driver here, but I need to keep the data accessible. – AlexV Feb 23 '21 at 08:57
  • @Dunedan You mean use S3 Batch with a custom Lambda that will compress the files? – AlexV Feb 23 '21 at 08:58

1 Answer


Amazon S3 cannot compress your data.

You would need to write a program to run on an Amazon EC2 instance that would (see the sketch after this list):

  • Download the objects
  • Compress them
  • Upload the compressed objects back to Amazon S3
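
A minimal sketch of such a job in Python with boto3, assuming each object fits in memory and that writing the compressed copy back under a new `.gz` key is acceptable; the bucket and key names here are placeholders:

```python
import gzip
import boto3

s3 = boto3.client("s3")

def compress_object(bucket: str, key: str) -> None:
    """Download an S3 object, gzip it in memory, and upload the compressed copy."""
    # Download the original object into memory
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()

    # Compress it with gzip
    compressed = gzip.compress(body)

    # Upload the compressed copy under a new key (placeholder naming scheme)
    s3.put_object(
        Bucket=bucket,
        Key=key + ".gz",
        Body=compressed,
        ContentEncoding="gzip",
    )

    # Optionally delete the original once the upload has succeeded
    # s3.delete_object(Bucket=bucket, Key=key)
```

With more than a billion objects you would likely drive this from an object listing (for example S3 Inventory, or S3 Batch Operations invoking a Lambda function as suggested in the comments) rather than a single loop on one instance.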

An alternative is to use Storage Classes (a lifecycle-rule sketch follows this list):

  • If the data is infrequently accessed, use S3 Standard - Infrequent Access -- this is available immediately and is cheaper as long as data is accessed less than once per month
  • Glacier is substantially cheaper but takes some time to restore (speed of restore is related to cost)
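
If you go the storage-class route, a lifecycle rule can transition existing objects automatically, with no re-upload required. A minimal sketch with boto3, using a hypothetical bucket name and an assumed schedule (Standard-IA after 30 days, Glacier after 90):

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket name and transition schedule -- adjust to your requirements.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "transition-to-cheaper-storage",
                "Filter": {"Prefix": ""},  # apply to every object in the bucket
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
            }
        ]
    },
)
```
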
John Rotenstein