I am planning to change the file format so that every field must be enclosed in double quotes, for example:
"A","Field1","Field2","Field3","Fi"el,d","Fi""eld"
I want the separator to be the two characters ", (a double quote followed by a comma) taken together. How do I change the split command below so that it splits on that combined separator?
line.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)",15);
-
What language is this? Update: looking at other questions the poster has asked, I'd guess it's Java. – Mark Byers Feb 17 '10 at 22:54
-
Dupes from the same user: http://stackoverflow.com/questions/2277476/regarding-java-split-command-csv-file-parsing and http://stackoverflow.com/questions/2241915/regarding-java-string-manipulation/2241950#2241950 Please stick to **one** user account at all times and **one** topic per question/problem. – BalusC Feb 17 '10 at 23:00
-
Thanks a lot for your info. I posted a response there but it didn't get a reply, so I created this thread. I am creating the questions under one user account, so I'm not sure how it could be more than one account. – Arav Feb 17 '10 at 23:43
-
And an earlier one: http://stackoverflow.com/questions/2241758/regarding-java-split-command-parsing-csv-file – finnw Feb 18 '10 at 01:01
-
@arav: you got an answer to use a real CSV parser instead. – BalusC Feb 18 '10 at 01:45
-
I have 5000 records in the file. I am wondering whether processing will be slow if I parse character by character. If I have no other way of solving my issue, then I will have to use the CSV parser that you posted earlier. – Arav Feb 18 '10 at 04:31
2 Answers
how do i change the below split command to include two separator ", (double quote and comma)
This would do it:
line.split("\",");
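You'd need to trim the extra quotes that aren't removed by the split: the opening quote left at the start of each field, and the closing quote left at the end of the last field. You could also consider splitting on "\",\"" instead.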
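As a rough illustration of the second suggestion, the sketch below strips the outer quotes from the line and then splits on the three-character sequence `","`. The class and method names (`QuotedSplit`, `splitQuoted`) are made up for this example, and it assumes every field is quoted and no field contains the sequence `","` itself. Doubled quotes such as `Fi""eld` are left as they are, not unescaped.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class QuotedSplit {
    // Hypothetical helper: assumes every field on the line is wrapped in double quotes.
    static String[] splitQuoted(String line) {
        String body = line;
        // Drop the opening quote of the first field and the closing quote of the last one.
        if (body.length() >= 2 && body.startsWith("\"") && body.endsWith("\"")) {
            body = body.substring(1, body.length() - 1);
        }
        // Split on the literal sequence ","; the -1 limit keeps trailing empty fields.
        return body.split(Pattern.quote("\",\""), -1);
    }

    public static void main(String[] args) {
        String line = "\"A\",\"Field1\",\"Field2\",\"Field3\",\"Fi\"el,d\",\"Fi\"\"eld\"";
        // Prints: [A, Field1, Field2, Field3, Fi"el,d, Fi""eld]
        System.out.println(Arrays.toString(splitQuoted(line)));
    }
}
```

If the original limit of 15 fields still matters, it can be passed in place of -1. A field whose value contains `","` would still be split in the wrong place, which is one reason a real CSV reader is the safer choice.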
However, instead of reinventing the wheel, I'd suggest that you try to find an existing CSV reader for your platform. It will be better and faster and a lot less work.

-
Thanks a lot. I will try this one. The comma was creating problems when there were double quotes in the data, so I want to use the two separators combined. You wrote "You'd need to trim the extra quotes that aren't removed by the split". I don't follow this. Do you mean the last field in the line? – Arav Feb 17 '10 at 23:47
-
I just wrote an answer suggesting a few CSV libraries, then noticed that Mark had already suggested using an existing CSV library. SuperCSV looks pretty good at a glance, but there are at least 4 others which should also do the job. – rob Feb 18 '10 at 00:08
In our application we also supported comma-separated files for years. All went well until customers started to put double quotes in strings. We solved that problem by also allowing values to be enclosed in single quotes (and not allowing single quotes between double quotes, or double quotes between single quotes). But then customers wanted both single and double quotes in strings, or could no longer generate the file easily because the enclosing characters depended on the values.
Then we started supporting backslashes, but things only became worse.
We finally solved the problem by using TAB as the separator (instead of comma). TABs never appear in our string values. No quotes needed anymore. Problem solved.
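A minimal sketch of what reading such a line could look like in Java, assuming (as described above) that values never contain a TAB character. The names `TabSplit` and `splitTabs` are invented for this example.

```java
import java.util.Arrays;

class TabSplit {
    // Hypothetical helper: splits on TAB only, so quotes and commas stay part of the value.
    static String[] splitTabs(String line) {
        // The -1 limit keeps empty fields at the end of the line.
        return line.split("\t", -1);
    }

    public static void main(String[] args) {
        String line = "A\tField1\tFi\"el,d\tFi\"\"eld";
        // Prints: [A, Field1, Fi"el,d, Fi""eld]
        System.out.println(Arrays.toString(splitTabs(line)));
    }
}
```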

-
Thanks a lot. Some of the systems have already been developed, so I can't change the delimiter now. – Arav Feb 17 '10 at 23:47