I have a Sinatra app that needs to provide downloadable reports in Microsoft Word format. My approach is to generate the content with ERB and then convert the resulting HTML to docx. Pandoc seems to be the best tool for this, but my implementation involves generating temporary files, which feels kludgy.

Is there a more direct way to generate the docx file and send it to the user?

I know that PandocRuby exists, but I couldn't quite get it working for my purposes. Here is an example of my current implementation:

  # Register the docx MIME type
  configure do
    mime_type :docx, 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'
  end

  # Route that generates the report
  get '/report/:name' do
    content_type :docx

    input = erb :report, :layout => false # render the HTML that pandoc will convert
    now = Time.now.to_i.to_s # timestamp used as the file name
    input_path = File.join('tmp', now + '.txt')
    File.open(input_path, 'w') { |f| f.write(input.to_s) } # write the HTML to the input file

    output_path = File.join('tmp', now + '.docx') # output file that pandoc will create
    system "pandoc -f html -t docx -o #{output_path} #{input_path}" # convert the input file to docx
    send_file output_path
  end
Eric Levine
  • Looks to me like `/report/:name` will always render an erb with the latest values of some dataset and you want to be able to hit it to download docx on demand at that point in time. Since you can get Sinatra to [stream binary data](http://sinatra.rubyforge.org/api/classes/Sinatra/Streaming.html), I think the real question you want to ask is how to get Pandoc to write to the streaming buffer. That'd be the cleanest solution imo. Taking the generation out of the request and into a background job or a prerender would be better, if applicable. – danneu Nov 07 '12 at 22:03
  • Duplicate of http://stackoverflow.com/questions/697505/creating-microsoft-word-docx-documents-in-ruby ? – JasonPlutext Nov 07 '12 at 22:38
  • @JasonPlutext I don't think it is a duplicate, since I'm not asking how to create the docx files. I already have it working with pandoc (which isn't even mentioned in that question), but I'm looking for a way to improve upon it. I'd like to get rid of the temporary files my code creates. – Eric Levine Nov 07 '12 at 23:46
  • @danneu I think you are right, that doing this as a background job or pre-building the files is a better long term solution. – Eric Levine Nov 07 '12 at 23:54
  • Right, I misunderstood. When you said "more direct way to generate the docx file" I thought you meant more direct, as in getting rid of pandoc entirely. – JasonPlutext Nov 08 '12 at 00:43
  • Have you tried sending the file to pandoc's stdin, and reading from its stdout? Seems like the most straightforward approach. – Catnapper Nov 08 '12 at 02:17
  • @Catnapper Pandoc won't send docx files to stdout – Josh Voigts Nov 08 '12 at 14:20
  • Is this on Linux? If so, it's possible to use pipes to circumvent Pandoc not using its stdout for docx. – Catnapper Nov 08 '12 at 19:13
  • @Catnapper This is actually running on a Windows server (not my decision or preference), but I would still accept an answer that uses pipes. – Eric Levine Nov 09 '12 at 17:35
  • On Linux, you can get pandoc to send a docx to stdout this way: `pandoc -t docx -o /dev/stdout`. Maybe there's a Windows equivalent? – John MacFarlane May 16 '13 at 20:20
  • @JohnMacFarlane Windows aside, how would I capture stdout and send it to the browser using Ruby/Sinatra? Would it be send_file "/dev/stdout"? – Eric Levine May 17 '13 at 01:44

1 Answer

A recent update to pandoc-ruby added support for piping binary output to standard output. Does that solve your problem?

I don't have any experience with Sinatra, and I have not tried to use pandoc-ruby to pipe binary output, but something like

  puts PandocRuby.convert(input, :from => :html, :to => :docx)

might do the trick.
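
On the Sinatra side, note that `puts` writes to the server's standard output rather than to the HTTP response, so the converted bytes would need to be returned from the route as its body. The same piping idea can also be sketched with only Ruby's standard library (`Open3`) instead of pandoc-ruby. This is a minimal sketch, not a tested solution: it assumes a `pandoc` executable on the PATH and relies on pandoc writing to stdout when the output file is given as `-`. The names `convert_via_pipe` and `PANDOC_DOCX` are hypothetical, and the approach is untested on Windows.

```ruby
require 'open3'

# Assumed pandoc invocation: read HTML from stdin and write docx to stdout
# ("-o -" asks pandoc to use stdout even for binary formats).
PANDOC_DOCX = %w[pandoc -f html -t docx -o -].freeze

# Hypothetical helper: feeds +html+ to the command's stdin and returns the
# bytes the command writes to stdout, so no temporary files are created.
def convert_via_pipe(html, command = PANDOC_DOCX)
  # binmode keeps the docx bytes intact (important on Windows)
  output, status = Open3.capture2(*command, stdin_data: html, binmode: true)
  raise "#{command.first} failed with status #{status.exitstatus}" unless status.success?

  output
end

# Possible use inside the Sinatra route from the question:
#   get '/report/:name' do
#     content_type :docx
#     convert_via_pipe(erb(:report, :layout => false))
#   end
```

Because the route returns the docx bytes as a string, Sinatra sends them as the response body and `send_file` is no longer needed. The command is passed as an array rather than a shell string, which avoids quoting problems.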

David Sanson