23

I'm trying to parse a CSV file, but I keep getting the error message Unquoted fields do not allow \r or \n (line 2).

I found a similar topic here on SO, which suggested the following:

  CSV.open('file.csv', :row_sep => "\r\n") do |csv|

but this unfortunately doesn't work for me. I can't change the CSV file, so I need to fix it in the code.

EDIT: sample of the CSV file:

A;B;C
1234;...

Is there any way to do it?

Many thanks!

user984621
  • Hi Linuxios, I updated the original post – user984621 Jul 18 '12 at 19:26
  • Did you set the record separator to `;`? – Linuxios Jul 18 '12 at 19:31
  • 1
    That example is NOT a CSV file. It is a delimited text file. The structure is similar, but that's not enough, and the difference is big. CSV = Comma-Separated Values, and besides specifying the delimiter as a comma there are other very specific data formatting rules that a CSV must conform to. A delimited text file does not have to conform to these rules, though it can choose to. – Sam Axe Jul 18 '12 at 19:32
  • 1
    @Boo: Exactly. user, you'd do better using `split` and some ad hoc stuff. – Linuxios Jul 18 '12 at 19:36

9 Answers

18

First of all, you should set your column delimiter to ';', since a semicolon is not the default delimiter the CSV parser expects. This worked for me:

CSV.open('file.csv', :row_sep => :auto, :col_sep => ";") do |csv|
    csv.each { |a,b,c| puts "#{a},#{b},#{c}" } 
end

From the 1.9.2 CSV documentation:

Auto-discovery reads ahead in the data looking for the next \r\n, \n, or \r sequence. A sequence will be selected even if it occurs in a quoted field, assuming that you would have the same line endings there.
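
The same options can be tried on an in-memory string, which makes it easy to check the behaviour without the original file. This is only a minimal sketch: the sample values "foo" and "bar" are made up (the question elides the real row), and the data is assumed to use "\r\n" line endings.

```ruby
require 'csv'

# Hypothetical data mirroring the question's layout: semicolon-separated, CRLF line endings.
data = "A;B;C\r\n1234;foo;bar\r\n"

# :row_sep => :auto lets the parser detect "\r\n"; :col_sep replaces the default comma.
rows = CSV.parse(data, :row_sep => :auto, :col_sep => ";")

rows.each { |a, b, c| puts "#{a},#{b},#{c}" }
```

CSV.parse returns an array of rows here, so the fields can be inspected directly instead of iterating inside a CSV.open block.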

jslivka
  • I am a bit confused now... I am trying to follow your example, and when I run `puts csv`, I get this: `<#CSV io_type:File io_path:"file.csv" encoding:ASCII-8BIT lineno:0 col_sep:";" row_sep:"\n" quote_char:"\"">` - but how can I get the data from this hash? – user984621 Jul 18 '12 at 20:21
  • The open method opens an IO block, so you can do something like this for the hash: CSV.open('file.csv', :row_sep => :auto, :col_sep => ";") do |csv| csv.each { |a,b,c| puts "#{a},#{b},#{c}" } end – jslivka Jul 19 '12 at 01:41
15

Simpler solution if the CSV was touched or saved by any program that may have used weird formatting (such as Excel or Spreadsheet):

  1. Open the file with any plain text editor (I used Sublime Text 3)
  2. Press the enter key to add a new line anywhere
  3. Save the file
  4. Remove the line you just added
  5. Save the file again
  6. Try the import again, error should be gone
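
If the file cannot be edited, as in the question, something similar might be done in code. The sketch below assumes the re-save helps because it normalizes line endings, which this answer does not confirm; the sample content is hypothetical.

```ruby
require 'csv'

# Hypothetical content with mixed line endings: a lone "\r" after the header, "\r\n" afterwards.
raw = "A;B;C\r1;2;3\r\n4;5;6\r\n"

# Convert both "\r\n" and a lone "\r" to "\n" so that every row ends the same way.
normalized = raw.gsub(/\r\n?/, "\n")

rows = CSV.parse(normalized, col_sep: ';', row_sep: "\n")
rows.each { |row| p row }
```

For a real file, `raw` would come from File.read, and the file on disk stays untouched.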
Mike S
  • This worked for me. I was on a mac and had two CSVs I had downloaded that would not work before but worked after saving. Incidientally, they both had a blank row at the top of the file. Not sure if it was deleted that row or saving the file that fixed it for me. Regardless, thank you! – grant Oct 13 '15 at 16:09
  • No idea what this did but worked. Must have trimmed trailing whitespace characters. – Apoorv Parijat Nov 07 '15 at 03:04
  • this helped me, had some highlighting in my csv, while the above fix did not work i cleared formatting and resaved to csv from google sheets – madav Feb 13 '19 at 12:32
  • My lord what kind of CSV parser can't handle a carriage return smh – duhaime May 15 '23 at 22:48
3

I got this error while importing a CSV exported from LinkedIn.

I removed the blank lines like this:

  def import
    csv_text = File.read('filepath', encoding: 'ISO-8859-1')
    # remove blank lines from LinkedIn
    csv_text = csv_text.gsub(/^$\n/, '')
    @csv = CSV.parse(csv_text, headers: true, skip_blanks: true)
  end
David Silva Smith
2

In my case, I had to provide the encoding and a quote character that was guaranteed not to occur in the data:

CSV.read("file.txt", 'rb:bom|UTF-16LE', {:row_sep => "\r\n", :col_sep => "\t", :quote_char => "\x00"})
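
The effect of the quote character can be seen on a small in-memory example. This is a sketch under assumptions: the tab-separated sample data is invented, and the NUL character is assumed never to appear in the real data.

```ruby
require 'csv'

# Hypothetical tab-separated data in which one field contains a stray double quote.
data = "id\tsize\tnote\r\n1\t5\" disk\tok\r\n"

# "\x00" (NUL) is assumed never to occur in the data, so no field is treated as quoted
# and the stray double quote is kept as an ordinary character.
rows = CSV.parse(data, row_sep: "\r\n", col_sep: "\t", quote_char: "\x00")

rows.each { |row| p row }
```

The trade-off is that genuinely quoted fields are no longer unquoted by the parser, so this only suits data that does not rely on quoting.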
Danil Gaponov
1

I realize this is an old post but I recently ran into a similar issue with a badly formatted CSV file that failed to parse with the standard Ruby CSV library.

I tried the SmarterCSV gem which parsed the file in no time. It's an external library so it might not be the best solution for everyone but it beats parsing the file myself.

opts = { col_sep: ';', file_encoding: 'iso-8859-1', skip_lines: 5 }
SmarterCSV.process(file, opts).each do |row|
  p row[:someheader]
end
Cimm
1

Please see this thread: Unquoted fields do not allow \r or \n

Solution:

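data = File.read('file.csv').gsub("\r", '')  # gsub, not gsub!, which returns nil when nothing is replaced
CSV.parse(data, :row_sep => "\n") do |row|   # parse the cleaned string; only "\n" line endings remain
  # process each row here
end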
DirkAVA
0

In my case, the first row of the spreadsheet/CSV was a double-quoted bit of introduction text. The error I got was: /Users/.../.rvm/rubies/ruby-2.3.0/lib/ruby/2.3.0/csv.rb:1880:in `block (2 levels) in shift': Unquoted fields do not allow \r or \n (line 1). (CSV::MalformedCSVError)

I deleted the comment with " characters so the .csv ONLY had the .csv data, saved it, and my program worked with no errors.
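
When the file itself cannot be edited, the introduction line could instead be dropped in code before parsing. The following is a minimal sketch that assumes the preamble occupies exactly the first physical line; the sample content and the semicolon delimiter are hypothetical.

```ruby
require 'csv'

# Hypothetical file content: one quoted introduction line, then the actual data.
raw = "\"Report exported for internal use\"\nA;B;C\n1;2;3\n"

# Drop the first physical line so that only the CSV data is handed to the parser.
body = raw.lines.drop(1).join

rows = CSV.parse(body, col_sep: ';')
rows.each { |row| p row }
```

A preamble that spans several lines would need a different count passed to drop.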

Michael
0

If you have to deal with files coming from Excel that have newlines inside cells, there is also a solution.

The big disadvantage of this approach is that the strings must contain either no semicolons or no double quotes.

I chose to go with no semicolons:

# file_data is either an IO-like object or something that exposes a file path
if file_data.respond_to?(:read)
  csv_contents = file_data.read
elsif file_data.respond_to?(:path)
  csv_contents = File.read(file_data.path)
else
  logger.error "Bad file_data: #{file_data.class.name}: #{file_data.inspect}"
  return false
end

result = "string"
csv_contents = csv_contents.force_encoding("iso-8859-1").encode('utf-8') # In my case the files are latin 1...

# Here is the important part (Remove all newlines between quotes):
while !result.nil?
  result = csv_contents.sub!(/(\"[^\;]*)[\n\r]([^\;]*\")/){$1 + ", " + $2}
end

CSV.parse(csv_contents, headers: false, :row_sep => :auto, col_sep: ";") do |row|
  # do whatever
end

This solution works fine for me, but with large files you could run into problems.

If you want to go with no quotes just replace the semicolons in the regex with quotes.

Markus Andreas
-4

Another simple way to fix the weird formatting caused by Excel is to copy and paste the data into a Google spreadsheet and then download it as a CSV.

Steven Yap