6

I'm trying to upload and parse an .XLS file. I've tried with roo-xls and Spreadsheet, but with both, I'm getting this error:

Ole::Storage::FormatError in UploadController#upload
OLE2 signature is invalid

I found a few resources about this error, and the only answer ever given is to re-save the document as .XLS, because although the original is labeled as .XLS, it actually isn't.

Unfortunately this isn't really an option, because I have users uploading files, so it's important that it work without requiring a re-save.

For the record, I tried re-saving the file and now it works, but I'm at a loss as to why it works, because the file's format before and after the re-save looks to be the exact same. This is what was listed under "type of file" in Excel, before and after:

Microsoft Excel 97-2003 Worksheet (.xls)"

And this is what is listed under "Type" in Libreoffice, before and after:

Microsoft Excel Worksheet (application/vnd.ms-excel)

What's going on?

Also, here's my simple upload code:

form

<%= form_tag(upload_path, multipart: true) do %>
  <%= file_field_tag :file %>
<% end %>

controller

file = params[:file].path

#Roo Attempt
doc = Roo::Excel.new(file)

#Spreadsheet Attempt
require 'spreadsheet'
Spreadsheet.client_encoding = 'UTF-8'
doc = Spreadsheet.open(file).worksheets
Joe Morano
  • 1,715
  • 10
  • 50
  • 114

2 Answers2

2

The issue is that roo reads only:

  • Excel 2007 - 2013 formats (xlsx, xlsm)
  • LibreOffice / OpenOffice.org formats (ods)
  • CSV

For xls only you need to use roo-xls gem.

If you need any details further, I would need a copy of the excel file.

you should have:

require 'roo'
require 'roo-xls'

Then it will work.

tukan
  • 17,050
  • 1
  • 20
  • 48
  • My mistake, I'm using roo-xls but I neglected to say that in the question. Unfortunately the excel file is proprietary, but I'll try to recreate the error with a non-proprietary file. – Joe Morano Sep 20 '18 at 16:18
  • @JoeMorano please do so, nobody has crystal ball :). – tukan Sep 20 '18 at 17:44
  • @JoeMorano Also it would help to have minimal working example - https://stackoverflow.com/help/mcve – tukan Sep 21 '18 at 11:39
  • @JoeMorano do you have both gems in the require? Only the `roo-xls` gem is not enough! – tukan Sep 24 '18 at 16:35
  • I do have both gems required. I've been trying to create another .xls file that causes the same error, but I can't...the original file was generated by third-party POS software. I can't share it in its present form because it's not my intellectual property, but if I make any changes and save it, it no longer causes the error. If you had access to it, how would you analyze it? Would you check the metadata or binary or something like that? – Joe Morano Sep 27 '18 at 05:32
  • @JoeMorano I know it can be hard to reproduce the error (could you do an extraction with some dummy data, from some testing platform?). First I would check if the file is actually a correctly created `*.xls` file (perhaps a `*.csv` file named `*.xls`?). From your post it appears that it is not. I would view it with a text file editor to see what is actually inside. Then I would try to extract meta-data via some extractor like - http://meta-extractor.sourceforge.net/ to see if there are any discrepancies. – tukan Sep 27 '18 at 08:27
  • @JoeMorano I have found out that there is even a ruby way to read the meta data using *yomu*- https://github.com/yomurb/yomu. You may try that. – tukan Sep 28 '18 at 08:14
1

Here's a suggestion. Might not be pretty, but should work, assuming the differences are the same for any file with the problem.

  1. Manually perform a binary compare of the file before and after it's 're-saved'. Note the differences.
  2. Read the uploaded file from the user. See if it's missing the noted difference from earlier, and modify the file accordingly.
  3. Send the 'modified' file to roo-xls
Rots
  • 5,506
  • 3
  • 43
  • 51