Here is the code snippet I am using:
import java.io.StringReader;
import java.io.StringWriter;
import com.opencsv.CSVReader;
import com.opencsv.CSVWriter;

// Write two values containing a backslash with the default CSVWriter settings.
StringWriter writer = new StringWriter();
CSVWriter csvwriter = new CSVWriter(writer);
String[] originalValues = new String[2];
originalValues[0] = "t\\est";
originalValues[1] = "t\\est";
System.out.println("Original values: " + originalValues[0] + "," + originalValues[1]);
csvwriter.writeNext(originalValues);
csvwriter.close();
// Read the same line back with the default CSVReader settings.
CSVReader csvReader = new CSVReader(new StringReader(writer.toString()));
String[] resultingValues = csvReader.readNext();
System.out.println("Resulting values: " + resultingValues[0] + "," + resultingValues[1]);
The output of the above snippet is:
Original values: t\est,t\est
Resulting values: test,test
The backslash ('\') character is lost in the write/read round trip!
After some basic analysis I figured out that this happens because CSVReader
uses the backslash ('\') as its default escape character, whereas CSVWriter
uses the double quote ('"') as its default escape character.
What is the reason behind this inconsistency in default behavior?
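To illustrate the mismatch described above without depending on the library, here is a minimal stand-alone sketch. The class EscapeMismatchDemo and its methods writeLine and readLine are hypothetical, simplified models that I wrote for illustration; they are not OpenCSV's classes and do not reproduce its exact parsing rules. The writer model escapes only double quotes (by doubling them), while the reader model takes a configurable escape character.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for illustration only; these are NOT OpenCSV's classes.
class EscapeMismatchDemo {

    // Models a writer whose escape character is the double quote:
    // every field is wrapped in quotes and embedded quotes are doubled.
    // Backslashes are written unchanged.
    static String writeLine(String[] fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append('"').append(fields[i].replace("\"", "\"\"")).append('"');
        }
        return sb.toString();
    }

    // Models a reader with a configurable escape character.
    // When the escape character is seen, it is dropped and the next
    // character is kept literally. Pass '\0' to disable escaping.
    static String[] readLine(String line, char escapeChar) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (escapeChar != '\0' && c == escapeChar && i + 1 < line.length()) {
                current.append(line.charAt(++i)); // the escape character itself is discarded
            } else if (c == '"') {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"'); // a doubled quote inside quotes is one literal quote
                    i++;
                } else {
                    inQuotes = !inQuotes; // opening or closing quote
                }
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString()); // separator outside quotes ends the field
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String line = writeLine(new String[] {"t\\est", "t\\est"});
        System.out.println("Written line: " + line);
        System.out.println("Backslash as escape: " + String.join(",", readLine(line, '\\')));
        System.out.println("No escape char: " + String.join(",", readLine(line, '\0')));
    }
}
```

Running this prints the written line as "t\est","t\est", then test,test when the backslash is treated as the escape character, and t\est,t\est when no escape character is set. That matches the behavior I see with the real classes, although the real parser's rules are more involved than this model.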
To fix the above problem I found the following two solutions:
1) Overriding the default escape character of CSVReader with the null character:
CSVParser csvParser = new CSVParserBuilder().withEscapeChar('\0').build();
CSVReader csvReader = new CSVReaderBuilder(new StringReader(writer.toString())).withCSVParser(csvParser).build();
2) Using RFC4180Parser, which strictly follows the RFC 4180 standard:
RFC4180Parser rfc4180Parser = new RFC4180ParserBuilder().build();
CSVReader csvReader = new CSVReaderBuilder(new StringReader(writer.toString())).withCSVParser(rfc4180Parser).build();
Can either of the above approaches cause side effects for any other characters?
Also, why is RFC4180Parser not the default parser? Is it only to maintain
backward compatibility, since RFC4180Parser was introduced in a later version?