I try to parse a csv with java and have the following issue: The second column is a String (which may also contain comma) enclosed in double-quotes, except if the string itself contains a double quote, then the entire string is enclosed with a single quote. e.g.
Lines may lokk like this:
someStuff,"hello", someStuff
someStuff,"hello, SO", someStuff
someStuff,'say "hello, world"', someStuff
someStuff,'say "hello, world', someStuff
someStuff are placeholders for other elements, which can also include quotes in the same style
I'm looking for a generic way to split the lines at commas UNLESS enclosed in single OR double quotes in order to get the second column as a String. With second column I mean the fields:
- hello
- hello, SO
- say "hello, world"
- say "hello, world
I tried OpenCSV but fail as one can only specifiy one type of quote:
public class CSVDemo {
public static void main(String[] args) throws IOException {
CSVDemo demo = new CSVDemo();
demo.process("input.csv");
}
public void process(String fileName) throws IOException {
String file = this.getClass().getClassLoader().getResource(fileName)
.getFile();
CSVReader reader = new CSVReader(new FileReader(file));
String[] nextLine;
while ((nextLine = reader.readNext()) != null) {
System.out.println(nextLine[0] + " | " + nextLine[1] + " | "
+ nextLine[2]);
}
}
}
The solution with opencsv fails on the last line where there is only one double quote enclosed in single quotes:
someStuff | hello | someStuff
someStuff | hello, SO | someStuff
someStuff | 'say "hello, world"' | someStuff
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1