2

I am trying to add changes data in a csv file:

This is the sample data:

DATE        status       code                       value     value2
"2016-01-26","Subscription All","119432660","1315529431362550","0.0080099833517888"
"2016-01-26","Subscription All","119432664","5836995058433524","0.033825584764444"
"2016-01-26","Subscription All","119432664","8287300074499777","0.076913377834744"
"2016-01-26","Subscription All","119432664","14870697739968326","0.0074188355187426"

My code used to format the data:

    CSVReader reader = new CSVReader(new FileReader(new File(fileToChange)), CSVParser.DEFAULT_SEPARATOR, CSVParser.NULL_CHARACTER, CSVParser.NULL_CHARACTER, 1)
    info "Read all rows at once"
    List<String[]> allRows = reader.readAll();

    CSVWriter writer = new CSVWriter(new FileWriter(fileToChange), CSVWriter.DEFAULT_SEPARATOR, CSVWriter.NO_QUOTE_CHARACTER)
    writer.writeAll(allRows)
    writer.close()

The output i get is this, with extra quote added instead of removing it.

""2016-01-26"",""Subscription All"",""119432660"",""1315529431362550"",""0.0080099833517888""
""2016-01-26"",""Subscription All"",""119432664"",""5836995058433524"",""0.033825584764444""
""2016-01-26"",""Subscription All"",""119432664"",""8287300074499777"",""0.076913377834744""
""2016-01-26"",""Subscription All"",""119432664"",""14870697739968326"",""0.0074188355187426""

I want to remove the quotes. Please can someone help.

Also, is it possible to change the date format to yyyymmdd instead of yyyy-mm-dd?

   allRows.each { String[] theLine ->
        String newDate = theLine[0].replaceAll('-', '')
        String newline = theLine.eachWithIndex { String s, int i -> return i > 0 ? s : newDate}
        writer.writeLine(newline)
    }

Thanks

user175084
  • 4,550
  • 28
  • 114
  • 169
  • 1
    It seem to be not a bug, csv reader reads qoute character as part of the string, and writer outputs it as part of the value, but also esca[es as double qoute. Looke here for more info: http://stackoverflow.com/questions/13969254/unwanted-double-quotes-in-generated-csv-file – Dzmitry Paulenka Feb 01 '16 at 21:02
  • 1
    I would +1 but seem to be out of votes for the day... sigh! – Jim Garrison Feb 01 '16 at 21:15

3 Answers3

3

When you instantiated your CSVReader you told it to treat no characters as quotes, therefore it read the existing quotes as data and did not remove them.

When you told CSVWriter not to add any quotes it honored your request. However, the input data contained quote characters, and the convention for including quotes inside a string in CSV is to double the quotes. Thus the string value

ABC"DEF

gets coded in CSV as

"ABC""DEF"

So the result you see is the combination of not removing the quotes on input (you told it not to) and then doubling the quotes on output.

To solve this change the input option from NULL_CHARACTER to DEFAULT_QUOTE_CHARACTER. However be aware that if any of your data actually contains embedded quotes or commas the resulting output will not be valid CSV.

Also I think this might be a valid bug report against OpenCSV. I believe that OpenCSV needs to inform you if it is about to generate invalid CSV when you told it to omit quotes, probably via a runtime exception. Although I suppose they might argue that you chose to work without a net and should accept whatever you get. Personally I go for the "principle of least surprise", which IMHO would be not to double quotes when the output is unquoted.

Jim Garrison
  • 85,615
  • 20
  • 155
  • 190
2

Because quotation in your CSVReader is set to CSVParser.NULL_CHARACTER " is treated as normal character which is part of read token. This causes your array to contain data in form:

["2016-01-26", "Subscription All", "119432660", "1315529431362550", "0.0080099833517888"]

rather than:

[2016-01-26, Subscription All, 119432660, 1315529431362550, 0.0080099833517888]

So try changing option from CSVParser.NULL_CHARACTER to either

  • '"'
  • CSVParser.DEFAULT_QUOTE_CHARACTER (it also stores '"').
Pshemo
  • 122,468
  • 25
  • 185
  • 269
  • Thanks for the help. Is it possible to take a look with whats wrong with my date formatting? – user175084 Feb 01 '16 at 21:36
  • 1
    @user175084 It looks like you may be looking for: `allRows.forEach(row -> row[0] = row[0].replace("-", ""));` – Pshemo Feb 01 '16 at 22:07
0
CsvToBean csvToBean = new CsvToBeanBuilder(new StringReader(csv))
                .withMappingStrategy(strategy)
                .withIgnoreLeadingWhiteSpace(true)
                .withSeparator(',')
                .withIgnoreEmptyLine(true)
                .withQuoteChar('\'')
                .withQuoteChar('"')
                .build();