I am writing a Java application that reads CSV from standard input, but I am having trouble handling double quotes.
For example, if I read in the text:
"He said, ""What?"""
the output is:
field[0] = `He said, What?"""'
The last two quotes are what I don't want.
Here is my code:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;

public class Csv {
    private BufferedReader fin;
    private String fieldsep;
    private ArrayList field;

    public Csv() {
        this(System.in, ",");
    }

    public Csv(InputStream in, String sep) {
        this.fin = new BufferedReader(new InputStreamReader(in));
        this.fieldsep = sep;
    }

    // getline: get one line, grow as needed
    public String getline() throws IOException {
        String line;
        line = fin.readLine();
        if (line == null)
            return null;
        field = split(line, fieldsep);
        return line;
    }

    // split: split line into fields
    private static ArrayList split(String line, String sep) {
        ArrayList list = new ArrayList();
        int i, j;
        if (line.length() == 0)
            return list;
        i = 0;
        do {
            if (i < line.length() && line.charAt(i) == '"') {
                StringBuffer field = new StringBuffer();
                j = advquoted(line, ++i, sep, field);
                list.add(field.toString());
            }
            else {
                j = line.indexOf(sep, i);
                if (j == -1)
                    j = line.length();
                list.add(line.substring(i, j));
            }
            i = j + sep.length();
        } while (j < line.length());
        return list;
    }

    // advquoted: quoted field; return index of next separator
    private static int advquoted(String s, int i, String sep, StringBuffer field) {
        field.setLength(0);
        for ( ; i < s.length(); i++) {
            if (s.charAt(i) == '"' && ++i < s.length() && s.charAt(++i) != '"') {
                int j = s.indexOf(sep, i);
                if (j == -1)
                    j = s.length();
                field.append(s.substring(i, j));
                i = j;
                break;
            }
            field.append(s.charAt(i));
        }
        return i;
    }
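
For comparison, here is a minimal sketch of a scan that collapses doubled quotes. In advquoted above, the condition increments i twice (`++i` and then `s.charAt(++i)`), so on the sample input the test lands on the `W` rather than on the second quote, and the rest of the line is copied verbatim. The class name QuotedCsvSketch and the method splitLine are hypothetical names used only for this illustration, and the sketch assumes a single-character separator, unlike the String separator in the code above. It also returns one empty field for an empty line, where split returns an empty list.

```java
import java.util.ArrayList;
import java.util.List;

public class QuotedCsvSketch {
    // Split one CSV line into fields. Inside a quoted field, a doubled
    // quote ("") produces one literal quote; a single quote closes the field.
    static List<String> splitLine(String line, char sep) {
        List<String> fields = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    // Look ahead one character without advancing i twice.
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        cur.append('"');
                        i++;              // skip the second quote of the pair
                    } else {
                        inQuotes = false; // closing quote
                    }
                } else {
                    cur.append(c);
                }
            } else if (c == '"' && cur.length() == 0) {
                inQuotes = true;          // opening quote at the start of a field
            } else if (c == sep) {
                fields.add(cur.toString());
                cur.setLength(0);
            } else {
                cur.append(c);            // unquoted text, or text after a closing quote
            }
        }
        fields.add(cur.toString());
        return fields;
    }

    public static void main(String[] args) {
        // Prints: He said, "What?"
        // then:   42
        for (String f : splitLine("\"He said, \"\"What?\"\"\",42", ',')) {
            System.out.println(f);
        }
    }
}
```

Tracking the quoted state in a boolean and peeking at `i + 1` keeps the index arithmetic in one place, which avoids the double increment.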