
One of the responsibilities of my Rails application is to create and serve signed XML files. Once created, a signed XML never changes, so I store every file in the public folder and redirect the client appropriately to avoid unnecessary processing in the controller.

Now I want a new feature: every XML is associated with a date, and I'd like to serve a compressed file containing every XML whose date lies in a period specified by the client. However, for the feature to be useful, the period cannot be restricted to less than one month, which means some of the zip files served will be as large as 50 MB.

My application is deployed as a Passenger module of Apache, so it's totally unacceptable to serve the file with send_data: the client would have to wait for the entire compressed file to be generated before the actual download begins. Although I have an idea of how to implement this in Rails so the compressed file is produced while being served, I worry my server will run short of resources once several long-lived Ruby/Passenger processes are tied up serving big zip files.

I've read about a better solution to serve static files through Apache, but not dynamic ones.

So, what's the solution to the problem? Do I need something like a custom Apache handler? How do I inform Apache, from my application, how to handle the request, compressing the files and streaming the result simultaneously?

Rômulo Ceccon
  • The ZIP file format keeps its index at the end of the file. I also glanced quickly through RFC 2616 (HTTP 1.1), and a variable-length response like that should work, although usually the content length should be announced. Technically this should be possible as far as I can see. – erloewe Feb 08 '11 at 16:04
  • There's no HTTP problem with not knowing the length in advance; this is what chunked transfer encoding is for. You can write bytes that look like a zip file in any language; just be sure to flush your output periodically. – covener Feb 09 '11 at 12:59
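The streaming idea in the comment above can be sketched as follows, assuming a Rack-based stack (the class name is illustrative, not from the post): Rack writes each chunk to the socket as the body's `each` yields it, so the server can fall back to chunked transfer encoding when no Content-Length header is set.

```ruby
# A minimal streaming response body. Rack only requires that the body
# object respond to #each, yielding strings; each yielded chunk is
# written to the client immediately.
class StreamedBody
  def initialize(chunk_source)
    @chunk_source = chunk_source # any Enumerable of strings
  end

  def each(&block)
    @chunk_source.each(&block)
  end
end

# Returned from a Rack app as:
#   [200, { 'Content-Type' => 'application/zip' }, StreamedBody.new(chunks)]
```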

3 Answers


Check out my mod_zip module for Nginx:

http://wiki.nginx.org/NgxZip

You can have a backend script tell Nginx which URL locations to include in the archive, and Nginx will dynamically stream a ZIP file to the client containing those files. The module leverages Nginx's single-threaded proxy code and is extremely lightweight.

The module was first released in 2008 and is fairly mature at this point. From your description I think it will suit your needs.
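The backend's side of that contract can be sketched like this, per the mod_zip documentation (the controller and model names are my assumptions, not from the answer): the app replies with an `X-Archive-Files: zip` header and a plain-text body listing one file per line.

```ruby
# Builds one mod_zip manifest line: "<crc32> <size> <url> <archive-name>".
# A "-" CRC asks mod_zip to compute the checksum itself.
def mod_zip_line(size, url, name, crc = '-')
  "#{crc} #{size} #{url} #{name}"
end

# In a Rails controller (illustrative names):
#
#   response.headers['X-Archive-Files'] = 'zip'
#   lines = xmls.map { |x| mod_zip_line(x.size, "/xmls/#{x.name}", x.name) }
#   render plain: lines.join("\n")
```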

Emiller
0

It's tricky to do, but I've made a gem called zipline ( http://github.com/fringd/zipline ) that gets things working for me. I want to update it so that it can support plain file handles or paths; right now it assumes you're using carrierwave...

Also, you probably can't stream the response with Passenger... I had to use Unicorn to make streaming work properly... and certain Rack middleware can even break it (calling response.to_s breaks it).

If anybody still needs this, bother me on the GitHub page.
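For reference, a hedged sketch of what using the gem looks like — the API details here are my reading of zipline's README, not something stated in the answer, and the controller/model names are invented:

```ruby
# Controller sketch: zipline streams a zip built from
# [io-or-attachment, name-in-archive] pairs.
#
#   class XmlsController < ApplicationController
#     include Zipline
#     def archive
#       xmls = SignedXml.where(date: params[:from]..params[:to])
#       zipline(xmls.map { |x| [File.open(x.path), x.filename] }, 'xmls.zip')
#     end
#   end
#
# Building the [io, name] pairs is plain Ruby:
def archive_entries(paths)
  paths.map { |p| [File.open(p), File.basename(p)] }
end
```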

fringd
0

You simply need to use whatever API is available to you to create a zip file and write it to the response, flushing the output periodically. If you're serving large zip files, or expect frequent requests, consider running the compression in a separate process with a high nice/ionice value (low priority).

Worst case, you could run a command-line zip in a low-priority process and pass its output along periodically.
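One way to realize that worst case (a sketch; the class name and exact command line are mine, not the answer's): shell out at low priority with the archive written to stdout, and yield chunks to the response body as they arrive.

```ruby
# Streams a command's stdout in chunks; usable as a Rack response body
# because it responds to #each. For the zip case you would pass
# %w[nice -n 19 zip -q -r - .] with chdir: pointing at the XML folder;
# "nice -n 19" lowers CPU priority and "zip ... -" writes to stdout.
class CommandStreamBody
  def initialize(argv, chdir: '.')
    @argv = argv
    @chdir = chdir
  end

  def each
    IO.popen(@argv, chdir: @chdir) do |io|
      while (chunk = io.read(64 * 1024)) # pass the output along periodically
        yield chunk
      end
    end
  end
end
```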

covener