
I am working on a streaming CSV download from Rails 3.2 and have run into a problem: the initial page request takes a long time. The following controller code illustrates the issue:

      self.response_body = Enumerator.new do |y|
        10_000_000.times do
          y << "Hello World"
        end
      end 

With the above, the response does seem to stream (from a server that can support it; Unicorn, in my case). However, it hangs for much longer than I'd like before streaming starts. If I change it to the following, it starts much faster:

      self.response_body = Enumerator.new do |y|
        1000.times do
          y << "Hello World"
        end
      end

My understanding is that the response should begin with the first iteration of the loop, but larger loops seem to lengthen the initial load time. If each iteration is output as it happens, shouldn't streaming take the same amount of time to start, regardless of the total number of iterations?
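To check that expectation outside Rails, here is a minimal plain-Ruby sketch. The `events` array and the consuming loop are assumptions added for illustration; the loop merely stands in for a server calling `each` on the response body. It suggests that the Enumerator itself produces chunks one at a time, so a long startup delay likely comes from something that drains the whole body before sending.

```ruby
# Records the order in which chunks are produced and consumed.
events = []

body = Enumerator.new do |y|
  3.times do |i|
    events << "produce #{i}"
    y << "Hello World #{i}"
  end
end

# Creating the Enumerator runs nothing; the block only executes on iteration.
events_before = events.dup

# Stand-in for the server: Rack-style bodies are consumed through #each.
body.each do |chunk|
  events << "consume #{chunk}"
end
```

After this runs, `events_before` is empty and `events` alternates between "produce" and "consume" entries, which is the interleaving that streaming relies on.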

Thanks for any insight you may have!

EDIT:

Here is an explanation of the technique I am attempting. Maybe I am misinterpreting it or missing a step? http://facebook.stackoverflow.com/questions/3507594/ruby-on-rails-3-streaming-data-through-rails-to-client/4320399#4320399

EDIT:

I think Rack-Cache might be causing my problem... can I turn it off for an individual request?

EDIT and SOLVED:

I was wrong about Rack-Cache. I just needed to add self.response.headers['Last-Modified'] = Time.now.ctime.to_s to my response.

Matt Fordham
  • I don't get it. The code creates one enumerator from another enumerator and assigns it to the response_body variable. The right-hand side will be executed first (unless you have some magic meta stuff going on) and will take longer the larger the number you put in. You need something more to do streaming, but I have no suggestion myself. – froderik Mar 29 '12 at 13:32
  • You probably already checked out http://api.rubyonrails.org/classes/ActionController/Streaming.html – froderik Mar 29 '12 at 13:34
  • See the link I added above for an explanation of the technique. – Matt Fordham Mar 29 '12 at 15:45
  • You aren't yielding. Yield returns the flow of execution every time it's called, so the streamer can then send that bit. In your technique, you aren't yielding, so it's evaluating the entire block (the 100000 times bit) before it can then stream it. – Joe Pym Mar 29 '12 at 15:48
  • Is this happening in development or production? Are you using a reverse-proxy or caching reverse-proxy in front of the application, such as apache, nginx, varnish, squid? – yfeldblum Mar 29 '12 at 15:50
  • @JoePym: The "y <<" is an alias of Yield, I believe: http://www.ruby-doc.org/core-1.9.3/Enumerator.html#method-c-new – Matt Fordham Mar 29 '12 at 16:06
  • @yfeldblum: I am on Heroku using Unicorn. But, I am having the problem both in Production and Development. I am now trying to see if some middleware is messing things up. Rack-Cache could be getting in the way, but I don't think that is running in development (using Foreman start). – Matt Fordham Mar 29 '12 at 16:08
  • I *think* Rack-Cache might be causing my problem... can I turn it off for an individual request? – Matt Fordham Mar 29 '12 at 17:16

1 Answer


The edited question turned out to contain exactly the answer I needed. Posting it here as an answer.

The answer to getting the Rack handler to stream properly is apparently to add a Last-Modified header to the response:

      self.response.headers['Last-Modified'] = Time.now.ctime.to_s
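Why the header helps is not explained in this thread. One plausible explanation, offered here as an assumption, is that a middleware in the stack reads the entire body (for example to compute a digest) unless the response already carries a Last-Modified header. The sketch below simulates that behaviour in plain Ruby; `BufferingMiddleware` and `streaming_app` are hypothetical names written for illustration, not Rack's or Rails' actual code.

```ruby
# Hypothetical stand-in for a middleware that must read the whole body.
# This is an illustration only, NOT the real Rack or Rails implementation.
class BufferingMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    # Assumed rule: leave the body alone if Last-Modified is already set.
    return [status, headers, body] if headers.key?('Last-Modified')

    # Draining the body runs the entire Enumerator before anything is sent.
    chunks = []
    body.each { |chunk| chunks << chunk }
    [status, headers, chunks]
  end
end

# Hypothetical Rack-style app that returns an Enumerator as its body.
def streaming_app(extra_headers)
  lambda do |_env|
    body = Enumerator.new { |y| 3.times { y << "Hello World" } }
    [200, { 'Content-Type' => 'text/csv' }.merge(extra_headers), body]
  end
end

_, _, buffered = BufferingMiddleware.new(streaming_app({})).call({})
_, _, streamed = BufferingMiddleware.new(
  streaming_app('Last-Modified' => Time.now.ctime.to_s)
).call({})
```

In this simulation `buffered` comes back as a fully materialized Array, while `streamed` is still the lazy Enumerator, which would match the faster start observed once the header is set.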
William Denniss
mbklein