
I store many (>10M) small objects in S3. Each object is rather small (~5 KB). While the storage costs are acceptable, I realised that the request costs are becoming very high: uploading the 10M objects costs $50, which is a bit expensive.
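For reference, that figure lines up with per-request pricing; a quick sanity check, assuming S3 Standard PUT pricing of $0.005 per 1,000 requests (pricing varies by region, so verify against the current price list):

```python
# Back-of-the-envelope PUT cost, assuming $0.005 per 1,000 requests.
objects = 10_000_000
put_price_per_1000 = 0.005
print(f"PUT request cost: ${objects / 1000 * put_price_per_1000:.2f}")  # ~$50.00
```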

Is there a way to upload new data more cost-effectively? I'm open to using other services as well. The usage is roughly: upload new versions once a year, download potentially once a month.

I found this question while researching, but it is from 10 years ago, so I was wondering whether anything has changed.

Rizhiy

1 Answer


An average size of 5 KB is rather small for S3. You'll face latency, performance, and cost issues down the road. Consider one of two solutions:

  • Consolidate the data into bigger chunks (S3 works best with objects in the hundreds-of-megabytes range). You can keep publishing small packets of data, but regularly aggregate them into bigger objects for later use (a sketch follows this list).
  • Use a different storage engine that suits your needs better. Consider DynamoDB, Cassandra, Couchbase, or even PostgreSQL or MySQL, depending on your requirements. 10M × 5 KB = 50 GB, which is tiny; you could even store it in Redis if you don't need persistence.
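A minimal sketch of the first option, assuming the small objects are JSON-serialisable records; the bucket and key names are hypothetical. It packs many records into one gzipped JSONL object so that a single PUT replaces thousands of small ones:

```python
import gzip
import io
import json

import boto3  # assumes AWS credentials are configured

s3 = boto3.client("s3")

def upload_batch(records, bucket, key):
    """Pack many small records into one gzipped JSONL object,
    so one PUT request replaces thousands of individual uploads."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
        for rec in records:
            gz.write((json.dumps(rec) + "\n").encode("utf-8"))
    s3.put_object(Bucket=bucket, Key=key, Body=buf.getvalue())

# Hypothetical usage: ~20,000 records of ~5 KB each make one ~100 MB object
# (before compression), i.e. one request instead of 20,000.
# upload_batch(my_records, "records-bucket", "batches/2023/batch-0001.jsonl.gz")
```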
Sergey Romanovsky
  • I need this to be as cost-efficient as possible. As I said, most of the time the data just sits there. DynamoDB IA costs are $0.10/GB-month, which is 8× S3 IA. Similar for Aurora. I don't care about latency and performance, since this data is read infrequently by jobs that can wait. – Rizhiy Jul 14 '23 at 12:49
  • Then the first option I proposed should fit you. Batch your data and query it with Athena or Lambda. – Sergey Romanovsky Jul 15 '23 at 18:33
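A rough illustration of that suggestion, assuming the batched objects live under a prefix in S3 and an external Athena table has already been defined over them; the database, table, bucket, and column names here are hypothetical:

```python
import boto3

athena = boto3.client("athena")

# Kick off a query against the batched data; Athena scans the large
# aggregated objects directly in S3 rather than issuing one GET per record.
response = athena.start_query_execution(
    QueryString="SELECT * FROM records_batches WHERE record_id = 'abc123'",
    QueryExecutionContext={"Database": "my_database"},
    ResultConfiguration={"OutputLocation": "s3://records-bucket/athena-results/"},
)
print(response["QueryExecutionId"])
```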