3

Is there a way to read Excel 97-2003 files from Ruby?

Background

I'm currently using the Ruby gem parseexcel (http://raa.ruby-lang.org/project/parseexcel/), but it is an old port of the Perl module. It works fine, but the latest format it parses is Excel 95. And guess what? Excel 2007 will not produce the Excel 95 format.

John McNamara has taken over as maintainer of the Perl Excel parser (see http://metacpan.org/pod/Spreadsheet::ParseExcel). The current version parses Excel 95-2003 files. But is there a port to Ruby?

My other thought is to build some Ruby-to-Perl glue code so that the Perl library itself can be used from Ruby. E.g., see "What's the best way to export UTF8 data into Excel?"

(I think it would be much faster to write the glue code than to port the parser.)

Thanks,

Larry

Larry K

6 Answers

8

I'm using spreadsheet, give it a shot.
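For the parsing side, a minimal sketch of reading an .xls file with the spreadsheet gem might look like the following. The method names `read_xls_rows` and `rows_matching` and the file name in the usage comment are hypothetical, chosen only for illustration; `Spreadsheet.open` and `worksheet` are the gem's own calls.

```ruby
# Pure helper (hypothetical name): keep the rows in which any cell,
# converted to a string, matches the given pattern.
def rows_matching(rows, pattern)
  rows.select { |row| row.any? { |cell| cell.to_s =~ pattern } }
end

# Hypothetical helper: read every row of one worksheet into plain arrays.
# Needs the "spreadsheet" gem; the require is deferred so that the helper
# above stays usable without the gem installed.
def read_xls_rows(path, sheet_index = 0)
  require 'spreadsheet'
  book  = Spreadsheet.open(path)        # opens an Excel 97-2003 (.xls) workbook
  sheet = book.worksheet(sheet_index)   # worksheets are indexed from 0
  sheet.map { |row| row.to_a }          # each row behaves like an array of cell values
end

# Usage (assumes a file named report.xls exists):
#   rows = read_xls_rows('report.xls')
#   rows_matching(rows, /total/i).each { |row| puts row.inspect }
```

Empty cells come back as nil, so converting with `to_s` before matching avoids errors on sparse rows.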

khelll
  • I'm using spreadsheet for Excel generation and it works great. Haven't had much exposure to the parsing side. – Michael Sepcot Oct 16 '09 at 19:59
  • I use the same - especially after switching from Windows Server to Ubuntu Server 8) – Reuben Mallaby Nov 26 '09 at 18:24
  • 1
    The "spreadsheet" gem is licensed under the terms of the GNU GPL v3. It might not matter for your particular use case, but it's easy to overlook. (ParseExcel is LGPL, and "roo" is licensed under the same terms as Ruby.) – RJHunter Sep 18 '12 at 06:42
3

There is also roo: http://roo.rubyforge.org/

Tom Huras
3

In my experience, spreadsheet works much faster than roo; however, roo supports the .xlsx format, which spreadsheet does not.

1

As khelll mentioned, spreadsheet is a great tool. See the code below, which I used to build a crawler.

require 'find'
require 'spreadsheet'
Spreadsheet.client_encoding = 'UTF-8'

count = 0

Find.find('/Users/toor/crawler/') do |file|             # begin iteration of each file of a specified directory
  if file =~ /\.xls\z/i                                 # check for an .xls extension (dot escaped, case-insensitive)
    worksheets = Spreadsheet.open(file).worksheets      # array of all worksheets in the Excel workbook
    worksheets.each do |worksheet|                      # begin iteration over each worksheet
      worksheet.each do |row|                           # begin iteration over each row of a worksheet
        if row.to_s =~ /regex/                          # rows must be converted to strings in order to match the regex
          puts file
          count += 1
        end
      end
    end
  end
end

puts "#{count} pieces of information were found"
Anconia
0

I've not tried to parse Excel files before, but I know FasterCSV is a great library for parsing CSV files (which Excel can produce).
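As a rough sketch of the CSV route: FasterCSV became Ruby's standard `csv` library in Ruby 1.9, so on current Rubies the equivalent code uses `require 'csv'`. The sample data below is made up and stands in for a single sheet saved from Excel as CSV.

```ruby
require 'csv'

# Hypothetical sample standing in for one worksheet exported from Excel as CSV.
csv_text = <<~CSV
  name,qty
  widget,3
  "gadget, large",5
CSV

# headers: true treats the first line as column names, so cells can be
# looked up by header instead of by position.
rows = CSV.parse(csv_text, headers: true)

rows.each do |row|
  puts "#{row['name']}: #{row['qty']}"
end
```

Note that every value arrives as a string, and a CSV export covers only one worksheet, so this only helps when the files can be re-saved from Excel first.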

zgchurch
0

If you are on Windows,
you can always use WIN32OLE.

Have a look at http://rubyonwindows.blogspot.com/search/label/excel