Heroku + ephemeral filesystem + AWS S3

Question

I have developed an application that generates large files (700 MB+) using data from MySQL and then serves them to users.

I am now migrating the app to Heroku.

Heroku's filesystem is ephemeral, so it can't guarantee that a generated file will still be there (the dyno might restart or fail for whatever reason). As I understand it, to upload a file to Amazon S3 I either have to generate it on the filesystem first or upload it as a string.

The files are going to be pretty big, so I will use multipart upload (I am not sure whether a string can be uploaded in parts).

I don't know if my plan is going to work correctly, or if there is a better way of doing this. What if something goes wrong and the dyno fails during the request?

How I think it should work: Let's assume the app starts fetching data from the database, generates a 5 MB string, sends it to AWS, and loops through the dataset until the complete file has been sent.
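
For illustration, the loop described above might look roughly like the sketch below. It is not a tested production implementation. It assumes `client` behaves like a boto3 S3 client (the four multipart calls used are real boto3 method names), and that `chunks` is a hypothetical iterable yielding the encoded database rows as bytes. The function name `upload_in_parts` and the bucket and key values are made up for the example. S3 requires every part except the last to be at least 5 MiB, which is why data is buffered before each part is sent.

```python
MIN_PART_SIZE = 5 * 1024 * 1024  # S3 minimum for every part except the last


def upload_in_parts(chunks, client, bucket, key, part_size=MIN_PART_SIZE):
    """Stream an iterable of byte chunks to S3 as one multipart upload.

    `client` is assumed to expose the boto3 S3 client multipart methods.
    Assumes `chunks` yields at least one byte in total.
    Returns the number of parts uploaded.
    """
    upload_id = client.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]
    parts = []            # PartNumber/ETag pairs needed to complete the upload
    buffer = bytearray()  # only one part is held in memory at a time

    def flush():
        number = len(parts) + 1  # part numbers start at 1
        response = client.upload_part(
            Bucket=bucket, Key=key, UploadId=upload_id,
            PartNumber=number, Body=bytes(buffer),
        )
        parts.append({"PartNumber": number, "ETag": response["ETag"]})
        buffer.clear()

    try:
        for chunk in chunks:
            buffer.extend(chunk)
            if len(buffer) >= part_size:
                flush()
        if buffer:
            flush()  # the final part may be smaller than part_size
        client.complete_multipart_upload(
            Bucket=bucket, Key=key, UploadId=upload_id,
            MultipartUpload={"Parts": parts},
        )
    except Exception:
        # Without an abort, the parts uploaded so far stay stored in S3.
        client.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
        raise
    return len(parts)
```

With real boto3 this would be called as something like `upload_in_parts(rows_as_bytes(), boto3.client("s3"), "my-bucket", "export.csv")`, where `rows_as_bytes` is a hypothetical generator over the MySQL result set. Nothing is written to the dyno's filesystem. If the dyno is killed outright, the `except` block never runs, so the object is never completed and the uploaded parts linger. A bucket lifecycle rule that aborts incomplete multipart uploads is one way to clean those up.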

score 1 · Accepted Answer · answered May 09 '14 at 20:20

"I don't know if my plan is going to work correctly"

If the experience of others is any indication, the answer is nope. In this post dated May 20, 2013, a developer documented his experience with Heroku versus AWS: "Why I left Heroku, and notes on my new AWS setup" http://www.holovaty.com/writing/aws-notes/

I would suggest using an Amazon EC2 reserved instance. To get started, you can buy a second-hand reserved instance reservation ("Third Party"). I recently bought three reservations for $20 down plus subscription of under $15 per month for the remainder of the tenancy (two years) to run them in three regions, and I couldn't be any happier.

Amazon can write faster to S3 than anyone else, so if this is an issue, EC2 has an advantage right there. Plus, Amazon will not make upgrades for you that would break compatibility the way Heroku has been known to. I hope this helps.

NightKnight on Cloudinsidr.com

So the difference is mainly that in AWS you can write to EC2 filesystem and then send it to S3 / EBS volume, right? – ggirtsou May 10 '14 at 06:38

Yes. You would want to create an EBS volume and write to that volume. EBS is persistent network-attached storage. Amazon's so-called "instance store", on the other hand, is ephemeral (it will vanish when you stop the instance and is not recommended except for things like caches or swap files). – NightKnight on Cloudinsidr.com May 10 '14 at 10:10

EBS is only accessible to a running EC2 instance (S3 is accessible independently but much slower). If something happens to the instance, your EBS volumes will persist (unless you asked Amazon to destroy them along with the instance). Then you can re-attach your EBS volume to a new instance and access your data as if nothing had happened. To back up an EBS volume, create snapshots. Amazon stores them on S3 for you. If something happens to your EBS volume, you can recover the last state you backed up by creating a new volume from a snapshot and attaching it to an instance. [Please vote if I am being helpful] – NightKnight on Cloudinsidr.com May 10 '14 at 10:21
Thank you. Do you have a recommended book on AWS? – ggirtsou May 10 '14 at 10:56
Writing one... I will send you a free copy when it's finished. – NightKnight on Cloudinsidr.com May 10 '14 at 11:24
