
I want to upload an HDF5 file created with h5py to an S3 bucket using boto3, without saving it locally first.

The solutions I have found either use pickle.dumps and pickle.loads, or they store the file locally, which I would like to avoid.

SaTa
  • `pickle.dump` dumps it as a file, but you can use `pickle.dumps` to dump it as bytes. So then you can upload those bytes to S3 (see the sketch after these comments). – Sraw Dec 15 '18 at 01:19
  • Thanks, but then when reading the uploaded file, one needs to use `pickle.loads`, right? I want to upload the file as an HDF5 file so that it can be read as an HDF5 file without the need for `pickle.loads`. – SaTa Dec 15 '18 at 04:35
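
A minimal sketch of the pickle.dumps route described in the comments above (the bucket and key names are placeholders, and boto3 credentials are assumed to be configured):

```python
import pickle

import boto3

# Any picklable object; pickle.dumps serialises it to bytes in memory,
# so nothing is written to local disk.
obj = {"numbers": [1, 2, 3]}

s3 = boto3.client("s3")
s3.put_object(Bucket="my-bucket", Key="obj.pkl", Body=pickle.dumps(obj))
# The downside, as noted above: a reader must call pickle.loads,
# so the object is not stored as a plain HDF5 file.
```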

1 Answer


You can use io.BytesIO() and put_object, as illustrated here and sketched below. Hope this helps. Even in this case, you'd have to 'store' the data locally (though 'in memory'). You could also create a tempfile.TemporaryFile and then upload your file with put_object. I don't think you can stream to an S3 bucket in the sense that the local data would be discarded as it is uploaded to the bucket.
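A minimal sketch of the io.BytesIO() approach, assuming h5py >= 2.9 (which can write directly to a Python file-like object) and configured boto3 credentials; the bucket and key names are placeholders:

```python
import io

import boto3
import h5py
import numpy as np

# Build the HDF5 file entirely in memory: h5py >= 2.9 accepts a
# file-like object in place of a filename.
buf = io.BytesIO()
with h5py.File(buf, "w") as f:
    f.create_dataset("data", data=np.arange(100))

buf.seek(0)  # rewind so put_object reads from the beginning

s3 = boto3.client("s3")
s3.put_object(Bucket="my-bucket", Key="data.h5", Body=buf)
```

Because the buffer holds genuine HDF5 bytes, the uploaded object can later be downloaded and opened directly with h5py, with no pickle step involved.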

  • What is the appropriate x for `put_object(h5bytestream, ContentType=x)`? The example linked uses `ContentType='image/png'` but I'm not sure how to find the appropriate MIME type descriptor – captaincapsaicin Nov 12 '20 at 23:21
  • You don't actually need to specify `ContentType`; it's optional. – duff18 Mar 27 '22 at 16:49
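
For completeness, a sketch of passing `ContentType` explicitly. As far as I know HDF5 has no official IANA media type; `application/x-hdf5` is commonly used but unofficial, and the bucket and key names below are placeholders:

```python
import io

import boto3

s3 = boto3.client("s3")
payload = io.BytesIO(b"stand-in for real HDF5 bytes")

s3.put_object(
    Bucket="my-bucket",                # placeholder bucket name
    Key="data.h5",                     # placeholder key
    Body=payload,
    ContentType="application/x-hdf5",  # optional; common but unofficial MIME type
)
```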