0

In my Rails application, I need to upload some .doc/.xls files, parse their structure, and extract information from them. How can I get the data from *.doc or *.xls files as XML, or in any other format that I can read and parse?

sawa
itdxer

3 Answers

1

You can parse different types of spreadsheets using the Roo gem. It supports:

  • OpenOffice
  • Excel
  • Google spreadsheets
  • Excelx
  • LibreOffice
  • CSV

In my experience, it has some issues parsing .xls files, but it parses .xlsx files well.
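As a rough illustration of reading an uploaded spreadsheet with Roo, here is a minimal sketch. It assumes the first non-empty row holds column headers. The names `rows_as_hashes` and `load_first_sheet` are hypothetical helpers, not part of Roo; the only Roo calls used are `Roo::Spreadsheet.open`, `sheet`, `first_row`, `last_row`, and `row`. In Roo 2.x, legacy .xls support may require the separate roo-xls gem.

```ruby
# Convert a Roo-style sheet into an array of hashes keyed by the header row.
# Relies only on `first_row`, `last_row`, and `row(n)` (1-based row numbers).
def rows_as_hashes(sheet)
  first = sheet.first_row
  return [] if first.nil? # empty sheet

  headers = sheet.row(first).map { |h| h.to_s.strip }
  ((first + 1)..sheet.last_row).map do |i|
    headers.zip(sheet.row(i)).to_h
  end
end

# Hypothetical helper: open an uploaded file and return its first sheet.
# The gem is required lazily so the conversion above works without it.
def load_first_sheet(path)
  require "roo" # gem install roo
  Roo::Spreadsheet.open(path).sheet(0)
end
```

In a controller this might be called as `rows_as_hashes(load_first_sheet(params[:file].path))`, assuming the upload arrives as `params[:file]`.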

As for .doc files, you can try the msworddoc-extractor gem or one of the solutions proposed here.

Update: for working with *.docx files, see the docx and docx-html gems.

trushkevich
0

Have you seen the Nokogiri gem? http://nokogiri.org/

It is very useful for XML parsing.

grenierm5
0

The spreadsheet gem is nice for Excel and CSV files: https://github.com/zdavatz/spreadsheet
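A minimal sketch of pulling one column out of a worksheet with this gem, assuming the first row is a header. `column_values` and `open_first_worksheet` are hypothetical helper names; the gem calls used are `Spreadsheet.open`, `worksheet(0)`, and iterating rows with `each`, where each row behaves like an array.

```ruby
# Collect the values of one column (0-based index) from a worksheet-like
# object whose `each` yields array-like rows.
def column_values(sheet, col, skip_header: true)
  values = []
  sheet.each { |row| values << row[col] }
  skip_header ? values.drop(1) : values
end

# Hypothetical helper: open an .xls file and return its first worksheet.
# The gem is required lazily so the helper above works without it.
def open_first_worksheet(path)
  require "spreadsheet" # gem install spreadsheet
  Spreadsheet.open(path).worksheet(0)
end
```

For example, `column_values(open_first_worksheet("report.xls"), 1)` would return the second column without its header, where `report.xls` is a placeholder path.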

aarti
  • I used it and ran into this problem: http://stackoverflow.com/questions/19915887/ruby-roo-loaderror-cannot-load-such-file-spreadsheet-note – itdxer Nov 11 '13 at 23:41