
I am facing an architecture-related problem:

I have created a new environment in Elastic Beanstalk and pushed my app there. All good so far. I have set it to auto-scale up/down.

My app depends on filesystem storage (it creates files and then serves them to users). I am using an EBS volume (5 GB) to create the files, then push them to S3 and delete them from EBS. The reason I'm using EBS is the ephemeral filesystem on EC2 instances.
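Roughly, that cycle looks like the sketch below (simplified, using the AWS SDK for PHP; the bucket name, key and local path are placeholders, not my real values):

```php
<?php
// Simplified sketch of the current create -> upload -> delete cycle.
// Assumes the AWS SDK for PHP; bucket, key and path are placeholders.
require 'vendor/autoload.php';

use Aws\S3\S3Client;

$client = S3Client::factory(array('region' => 'us-east-1'));

$localPath = '/data/export.xml';           // file generated on the EBS volume
// ... application code writes the file to $localPath ...

// Push the finished file to S3, then free the space on the volume.
$client->putObject(array(
    'Bucket'     => 'my-app-exports',
    'Key'        => 'exports/export.xml',
    'SourceFile' => $localPath,
));
unlink($localPath);
```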

When AWS scales up, new instances don't have the EBS volume attached, because an EBS volume can only be attached to one instance at a time.

When it scales down, it shuts down the instance that had the EBS volume attached, which totally messes things up.

I have added a special line to /etc/fstab that automatically mounts the EBS volume to /data, but that only applies to the instance whose /etc/fstab I edited. I guess the solution here would be to create a custom AMI with that line baked in. But again, EBS can't be attached to more than one instance at a time, so it seems like a dead end.
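For reference, the fstab entry is along these lines (illustrative only; the device name, filesystem and options are assumptions that depend on how the volume is attached and formatted):

```
# hypothetical example: mount the attached EBS volume at /data on boot
/dev/xvdf  /data  ext4  defaults,nofail  0  2
```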

Where is my thinking wrong? What would be a possible solution, or the proper way of doing this?

For some reason, I believe that using S3 is not the right way of doing it.

ggirtsou
  • You have to have one EBS volume per instance. Not sure how to do this in EB, though. On EC2 you just change the AMI to start with a mounted EBS volume and update the autoscaling config to use that AMI. – Dmitry Mukhin Jun 14 '14 at 08:31

1 Answer


S3 is a fine way to do it: your application creates the file, uploads it to S3, removes the file from the local filesystem, and hands back to the client a URL for accessing the file. Totally reasonable.

Why can't you use ephemeral storage for this? Instance store-backed instances have additional storage available, mounted at /mnt by default. Why can't the application create the file there? If the files don't need to persist across instance start/stop/reboot, then there's no great reason to use EBS (unless you want faster boot times for your autoscale instances, I suppose).
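For example, the hand-off to the client could be a time-limited (presigned) URL. A minimal sketch with the AWS SDK for PHP (v2-era `getObjectUrl`; the bucket and key are placeholders, not anything from your setup):

```php
<?php
// Sketch only: after the upload, return a time-limited URL so the client
// downloads directly from S3. Bucket/key are placeholders; assumes the
// AWS SDK for PHP v2 and credentials from the environment or instance role.
require 'vendor/autoload.php';

use Aws\S3\S3Client;

$client = S3Client::factory(array('region' => 'us-east-1'));

// Presigned URL valid for 10 minutes.
$url = $client->getObjectUrl('my-app-exports', 'exports/export.xml', '+10 minutes');
echo $url;
```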

Ben Whaley
  • File creation takes about 30 minutes. It retrieves almost a million rows from the database and generates XML files (takes time). I am worried that if a failure occurs during file generation, the application will have to start over and make the user wait even longer. File creation is done in portions (i.e. 1000 rows at a time). – ggirtsou Jun 12 '14 at 17:06
  • What about using an S3-based file system? [Here is a comparison of a few options](https://code.google.com/p/s3ql/wiki/other_s3_filesystems). – Ben Whaley Jun 12 '14 at 17:37
  • I'm sorry, I don't know how this helps. Is it possible to do this using AWS tools? – ggirtsou Jun 12 '14 at 17:48
  • I'm suggesting that you could mount the S3 bucket on your instances and write the file there. If an instance fails, or if a new instance comes online due to autoscale, they'll all have access to the bucket. You have to make sure the instances aren't accessing the same file at the same time. This way you wouldn't have to write to a local filesystem at all. – Ben Whaley Jun 12 '14 at 17:53
  • That's very interesting. Since I am dealing with large files, though, when creating them I would have to re-upload the entire file. That's not efficient or cheap. How would you mount it to the filesystem? What's the difference compared to sending the file to an S3 bucket via the AWS API using their SDK? – ggirtsou Jun 12 '14 at 18:34
  • I see you accepted my answer - just curious, did you end up using an S3-mounted filesystem? Or did you end up with another solution? – Ben Whaley Jun 16 '14 at 20:31
  • Indeed I did. I'm using PHP and there's a way to stream data to S3 so it works like a filesystem: http://docs.aws.amazon.com/aws-sdk-php/guide/latest/feature-s3-stream-wrapper.html (a sketch follows after these comments). Thank you so much for your help. It took me a while to accept it as a solution (because I thought S3 shouldn't be used that way). – ggirtsou Jun 16 '14 at 20:33
  • Perfect! Glad to hear you sorted it out. – Ben Whaley Jun 16 '14 at 20:34
  • I would like to point out that using the s3 wrapper (`s3://`) is terribly slow on a t1.micro instance. It's better to fully create the file locally, upload it to S3 and delete the local file. Huge difference in performance. – ggirtsou Jul 01 '14 at 07:29
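For completeness, the stream-wrapper approach mentioned above looks roughly like this (a sketch assuming the AWS SDK for PHP; the bucket name is a placeholder). As the last comment notes, streaming many small writes can be slow on small instance types, so generating the file locally and uploading it once may perform better:

```php
<?php
// Sketch of the s3:// stream wrapper mentioned in the comments above.
// Assumes the AWS SDK for PHP; the bucket name is a placeholder.
require 'vendor/autoload.php';

use Aws\S3\S3Client;

$client = S3Client::factory(array('region' => 'us-east-1'));
$client->registerStreamWrapper();   // makes s3://bucket/key usable with fopen(), file_put_contents(), etc.

// Stream rows into the object instead of building the whole file locally first.
$fh = fopen('s3://my-app-exports/exports/export.xml', 'w');
fwrite($fh, '<?xml version="1.0"?><rows>');
// ... write row batches here ...
fwrite($fh, '</rows>');
fclose($fh);
```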