If you can be sure the input is well formed (does not have unbalanced quotes), then this works (and if it's not well formed, then what do you want to do?):
"(([^"]*?)((""[^"]*?)*?))"(?!")
It is a quote, followed by zero or more non-quote characters, followed by any number of groups that each consist of two consecutive quotes followed by any number of non-quotes, and ending with a quote that is not followed by another quote.
If you're sure that each field ends with "; then it gets a little easier:
"(([^"]*?)((""[^"]*?)*?))";
But does the last field on the line end with "; or with just a quote? If it ends with just a quote, this pattern will not match that last field.
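To make the difference concrete, here is a small sketch that runs both patterns over a line whose last field has no trailing semicolon. The class name TerminatorComparison, the helper rawFields, and the sample line are made up for illustration; the patterns are the two above, with the quotes backslash-escaped so they can be written as Java string literals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only the two patterns come from the answer.
public class TerminatorComparison {
    // Closing quote must not be followed by another quote.
    static final Pattern LOOKAHEAD =
            Pattern.compile("\"(([^\"]*?)((\"\"[^\"]*?)*?))\"(?!\")");
    // Closing quote must be followed by a semicolon.
    static final Pattern SEMICOLON =
            Pattern.compile("\"(([^\"]*?)((\"\"[^\"]*?)*?))\";");

    // Collects the raw (still doubled-quote) content of capturing group 1.
    static List<String> rawFields(Pattern pattern, String line) {
        List<String> fields = new ArrayList<>();
        Matcher matcher = pattern.matcher(line);
        while (matcher.find()) {
            fields.add(matcher.group(1));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Assumed sample input: "a";"b ""c"""  (no semicolon after the last field)
        String line = "\"a\";\"b \"\"c\"\"\"";
        System.out.println(rawFields(LOOKAHEAD, line)); // [a, b ""c""]
        System.out.println(rawFields(SEMICOLON, line)); // [a]
    }
}
```

On this input the lookahead version finds both fields, while the semicolon version silently skips the last one, so the second pattern is only safe if every field really is terminated that way.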
With inspiration from JoelFan and OldCurmudgeon, this works and is a bit simpler:
"((?:[^"]|"")*)"
With each pattern, the data is in capturing group 1. So your code would be something like:
while (matcher.find()) {
    String data = matcher.group(1);
    // Do whatever you want with the data, such as replacing "" with ".
}
Of course, you have to escape the quotes in the patterns when writing them as Java Strings, so they end up looking like this in your code:
"\"(([^\"]*?)((\"\"[^\"]*?)*?))\"(?!\")"
or
"\"(([^\"]*?)((\"\"[^\"]*?)*?))\";"
or (what I would use in my code)
"\"((?:[^\"]|\"\")*)\""
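Putting the pieces together, a minimal end-to-end sketch with the last pattern might look like the following. The class name QuotedFieldDemo, the method extractFields, and the sample line are assumptions for illustration, not part of the original question; the unescaping step is the "replace doubled quotes with a single quote" idea mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class built around the simplest of the three patterns.
public class QuotedFieldDemo {
    // A quote, then any run of non-quotes or doubled quotes, then a quote.
    private static final Pattern FIELD =
            Pattern.compile("\"((?:[^\"]|\"\")*)\"");

    // Returns the content of every quoted field, with "" collapsed to ".
    static List<String> extractFields(String line) {
        List<String> fields = new ArrayList<>();
        Matcher matcher = FIELD.matcher(line);
        while (matcher.find()) {
            String data = matcher.group(1);
            fields.add(data.replace("\"\"", "\""));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Assumed sample input: "plain";"say ""hi""";""
        String line = "\"plain\";\"say \"\"hi\"\"\";\"\"";
        for (String field : extractFields(line)) {
            System.out.println("[" + field + "]");
        }
    }
}
```

For the sample line this prints [plain], [say "hi"], and [] on separate lines. Like the patterns themselves, it assumes well-formed input; an unbalanced quote would shift where later matches begin.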