I've been hunting around and can't seem to find a good solution for this. My Rails app stores its files in Amazon S3. I now need to send them to a remote (3rd party) service.

I'm using RestClient to post to the 3rd party server like this:

send_file = RestClient::Request.execute(
    :method => :post,
    :url => "http://remote-server-url.com",
    :payload => File.new("some_local_file.avi", 'rb'),
    :multipart => true,
    etc.... )

It works for local files, but how can I send a remote file from S3 directly to this 3rd party service?

I found an answer here where someone was using open-uri: ruby reading files from S3 with open-URI

I tested that for myself, and it worked.

:payload => open(URI.parse("http://amazon-s3-example.com/some_file.avi"))

But, I've read a comment here that says open-uri simply loads the remote file into memory. See last comment on this answer: https://stackoverflow.com/a/264239/2785592

This wouldn't be ideal, as I'm handling potentially large video files. I've also read somewhere that RestClient loads even local files into memory; again, this isn't ideal. Does anyone know if that's true?

Surely I can't be the only one who has this problem. I know I could download the S3 file locally before sending it, but I was hoping to save on time & bandwidth. Also, if RestClient truly does load even local files into memory, then downloading it locally doesn't save me anything. Heh heh.

Any advice would be much appreciated. Thanks :)

Update: The remote server is just an API that responds to post requests. I don't have the ability to change anything on their end.

BoomShadow
  • Could the `remote-server-url.com` just download the file itself? It would be much easier for you and you will be able to get rid of the extra network round-trip "download the file to my server and send it again to a remote one". – Alexey Shein Sep 26 '15 at 20:40
  • An awesome answer to your question is here: http://stackoverflow.com/a/12282709/1426097, with 0-responsibility on your server :) – dimakura Sep 26 '15 at 20:42
  • @AlexeyShein, unfortunately I don't have any control over what the remote server does. It's just an API that's listening for a file payload. – BoomShadow Sep 26 '15 at 20:53
  • @dimakura I'll give that a try, but I'm not sure that's right. I can already get a public-facing URL of the files in S3 from inside the S3 dashboard. I can even set permissions to make them public. Is that answer you linked doing it differently? I wasn't sure from reading. Thanks for the advice. – BoomShadow Sep 26 '15 at 20:53
  • @BoomShadow it's an ideal answer for a number of reasons. (1) It allows you to keep your files private (no need to make them public from the dashboard). (2) It removes load from your server. (3) It also gives a recipe for masking the Amazon server, so it looks like the user is downloading the link from you. – dimakura Sep 26 '15 at 20:56
  • @dimakura How does it help with sending a file to a remote server? From the link above, it just masks the Amazon endpoint to look like your server. – Alexey Shein Sep 26 '15 at 21:22
  • @AlexeyShein masking the actual domain is only of minor importance in that answer. The whole point was how to share files (even private ones) without overloading the Rails app. – dimakura Sep 26 '15 at 21:28
  • @dimakura Sorry, but I still don't get how it helps to **copy** the file stored on amazon s3 to a remote api that accepts only POST requests (i.e. accepts uploading files and stores them itself somehow). Could you please show a code example? – Alexey Shein Sep 26 '15 at 21:35
  • If you give them a link, isn't that enough for them to download the file? – dimakura Sep 26 '15 at 21:39
  • @dimakura Unfortunately not. The remote API doesn't let me just give them a link. It doesn't have the ability to go fetch remote files. It expects that I'm presenting a file directly in the payload. Basically, the question should be: **how to POST a file to a remote API from S3?** – BoomShadow Sep 26 '15 at 22:47

1 Answer

Take a look at: https://github.com/rest-client/rest-client/blob/master/lib/restclient/payload.rb

RestClient definitely supports streamed uploads. The condition is that the payload you pass is neither a string nor a hash, and that it responds to read and size (so, basically, a stream).

On the S3 side, you need to grab a stream rather than read the whole object before sending it. Use http://docs.aws.amazon.com/sdkforruby/api/Aws/S3/Client.html#get_object-instance_method and ask for the response to be delivered to an IO object (the response target) rather than a string. For this purpose you may use an IO.pipe:
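To make that condition concrete, here is a minimal sketch of the check as described above. The helper name `streamable_payload?` is hypothetical and is not part of RestClient; it only mirrors the rule stated in this answer, so treat it as an illustration rather than the library's actual logic.

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical helper, not part of RestClient: returns true when +obj+
# meets the condition described above (not a String, not a Hash, and it
# responds to both #read and #size).
def streamable_payload?(obj)
  return false if obj.is_a?(String) || obj.is_a?(Hash)
  obj.respond_to?(:read) && obj.respond_to?(:size)
end

results = {
  string:   streamable_payload?('raw string body'),
  hash:     streamable_payload?({ file: 'some_file.avi' }),
  stringio: streamable_payload?(StringIO.new('video data')),
  # Tempfile.create yields a regular File object and returns the block's value.
  tempfile: Tempfile.create('payload') { |f| streamable_payload?(f) }
}

p results
```

Strings and hashes fail the check, while file-like objects such as a File or StringIO pass it.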

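require 'aws-sdk-s3'  # on SDK v2 this was: require 'aws-sdk'

s3 = Aws::S3::Client.new  # assumes region and credentials are configured in the environment
reader, writer = IO.pipe

# Child process: stream the S3 object into the write end of the pipe, chunk by chunk.
fork do
    reader.close
    s3.get_object(bucket: 'bucket-name', key: 'object-key') do |chunk|
      writer.write(chunk)
    end
    writer.close  # signals end-of-file to the reader
end

# Parent process: keep only the read end open.
writer.close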

Pass the reader to RestClient::Payload.generate and use the result as your payload. If the reading side is slower than the writing side, you may still end up holding a lot in memory, so when writing you want to accept only as much as you are willing to buffer. You can read the size of the stream with writer.stat.size (inside the fork) and spin on it once it gets past a certain size.
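The pipe hand-off can be tried out without S3 or RestClient. The sketch below makes several assumptions: a `StringIO` stands in for the S3 object body, a `Thread` replaces `fork` to keep it portable, and `pipe_from` and `CHUNK_SIZE` are hypothetical names chosen for illustration. It relies on the fact that writes to a pipe block once the operating system's pipe buffer is full, which keeps the producer from running far ahead of a slow consumer.

```ruby
require 'stringio'

CHUNK_SIZE = 16 * 1024 # bytes moved per read/write; an arbitrary choice

# Copies +source+ (anything that responds to #read) into a pipe on a
# background thread and returns the read end of that pipe.
def pipe_from(source, chunk_size = CHUNK_SIZE)
  reader, writer = IO.pipe
  reader.binmode
  writer.binmode

  Thread.new do
    begin
      # IO#write blocks while the OS pipe buffer is full, so only a
      # bounded amount of data is in flight at any time.
      while (chunk = source.read(chunk_size))
        writer.write(chunk)
      end
    ensure
      writer.close # signals end-of-file to the reader
    end
  end

  reader
end

# Hypothetical stand-in for the S3 object body.
fake_s3_body = StringIO.new('x' * 100_000)
reader = pipe_from(fake_s3_body)

# Consumer side: this is the role the HTTP client would play.
received = 0
while (chunk = reader.read(CHUNK_SIZE))
  received += chunk.bytesize
end
reader.close

puts received
```

In a real setup the consumer loop would be replaced by handing `reader` to the HTTP client as the payload, and the producer loop by the `get_object` block shown earlier.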

Mircea
  • Wow. This makes a lot of sense. I'm going to give this a try. I'm marking it as the accepted answer as it looks like exactly what I need. Thanks so much! You are truly fantastic. – BoomShadow Sep 29 '15 at 00:03
  • Sure. Let me know if this works. If you run into any snags along the way, I can try to put together a complete code sample. – Mircea Sep 29 '15 at 03:33