I'm processing CSV files that use a semicolon as the delimiter, and I'm currently splitting each line with `String[] splitLine = line.split(";", -1);`

I recently discovered that some genius thought it was a good idea to send data like this:

id;status;comment
123;OK;
134;KO;"bad read; try again"

Apparently, Excel supports delimiters inside quotes like this, so maybe it's allowed in CSV, but that completely messes up my split. How can I change my code so that each line is split into the correct number of columns? Either during the split itself, or by cleaning the line beforehand, for example by changing the `;` to a `,`.

By the way, there can be quotes in the comment field and they shouldn't be removed. For instance:

145;OK;the line "abcd" was malfunctioning
Teleporting Goat
  • Maybe use a library that is already tested and maintained? For example https://commons.apache.org/proper/commons-csv/ – Alberto Sinigaglia May 31 '21 at 11:08
  • However, this (`134;KO;"bad read; try again"`) is now ambiguous, because it can be either `[134, KO, "bad read, try again"]` (4 elements) or `[134, KO, "bad read; try again"]` (3 elements) – Alberto Sinigaglia May 31 '21 at 11:10
  • So there is definitely a problem on the client side, because at least you have to escape the delimiter, for example like `134;KO;"bad read\; try again"`, so that you know that this `;` is not a delimiter but part of an element – Alberto Sinigaglia May 31 '21 at 11:11
  • @Berto99 Yes, I think using a library is the best choice overall. And yes, I know there's an ambiguity problem, but it's not exactly the client's fault. If a library can parse quoted delimiters, it means it's allowed in CSV. – Teleporting Goat May 31 '21 at 13:41
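
For reference, the quote-aware split discussed in this thread could be sketched in plain Java roughly as follows. This is only a minimal sketch, not a full CSV parser: the class name `QuoteAwareSplitter` and its `split` method are hypothetical, and it assumes that double quotes are balanced within a line and that no field spans several lines. It treats a `;` as a delimiter only when it is outside double quotes, keeps the quotes in the output, and preserves trailing empty fields like `split(";", -1)` does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
final class QuoteAwareSplitter {

    // Splits a line on the delimiter, ignoring delimiters that appear
    // between double quotes. Quotes are kept in the returned fields and
    // trailing empty fields are preserved.
    // Assumption: quotes are balanced within the line.
    static String[] split(String line, char delimiter) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                // Every quote toggles the state; the quote itself is kept.
                inQuotes = !inQuotes;
                current.append(c);
            } else if (c == delimiter && !inQuotes) {
                // Unquoted delimiter: close the current field.
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        // The last field has no trailing delimiter, so add it explicitly.
        fields.add(current.toString());
        return fields.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] samples = {
            "123;OK;",
            "134;KO;\"bad read; try again\"",
            "145;OK;the line \"abcd\" was malfunctioning"
        };
        for (String sample : samples) {
            String[] parts = split(sample, ';');
            System.out.println(parts.length + " fields: " + Arrays.toString(parts));
        }
    }
}
```

With the three sample lines from the question, each call returns three fields, and the quotes around `"bad read; try again"` and `"abcd"` are left in place. An unbalanced quote would make the rest of the line end up in a single field, which is one reason a tested library such as the Commons CSV one linked above is probably the safer choice for real data.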

0 Answers