I have a web application that parses CSV files uploaded by users.

Some users upload CSV files that don't follow the proper CSV format.

For example:

abc,hello mahmoud,this is" description, bad

This should be

abc,hello mahmoud,"this is"" description", bad

When I use Ruby's FasterCSV library to parse the malformed CSV, it fails. However, the same file opens successfully in Excel or OpenOffice.

Is there a Ruby library that can reformat the CSV text into a proper format?

Mahmoud Khaled

1 Answer

From the docs:

What you don't want to do is feed FasterCSV invalid CSV. Because of the way the CSV format works, it's common for a parser to need to read until the end of the file to be sure a field is invalid. This eats a lot of time and memory.

Luckily, when working with invalid CSV, Ruby's built-in methods will almost always be superior in every way. For example, parsing non-quoted fields is as easy as:

data.split(",")

This gives you an array. If you really want valid CSV (for example, because you rescued a MalformedCSVError), then there is... FasterCSV! Note that a plain split only works when no field is supposed to contain a comma.

require 'csv'

# Split the malformed line on commas, then let CSV re-quote each field
str = %q{abc,hello mahmoud,this is" description, bad}
puts str.split(',').to_csv
#=> abc,hello mahmoud,"this is"" description", bad
steenslag
  • This fixes the unquoted problem, but what if the CSV file is malformed for another reason that OpenOffice can also fix? For example: http://stackoverflow.com/questions/9098759/how-to-use-ruby-gsub-regexp-with-many-matches/9099705#9099705 Is there a generic solution for all these problems? – Mahmoud Khaled Feb 06 '12 at 09:49
  • Just for clarification: FasterCSV was an external library in Ruby 1.8 and was then included as the csv standard library in Ruby 1.9. – froderik Feb 06 '12 at 21:30