I have the following code for a CSV parser:
    string input = wholeFile;
    IList<string> wholeFileArray = new List<string>();
    int start = 0;
    bool inQuotes = false;
    for (int current = 0; current < input.Length; current++)
    {
        // test each character before and after to determine if it is a valid quote, or a quote within a quote.
        int test_backward = (current == 0 ? 1 : current) - 1;
        int test_forward = (current == input.Length - 1 ? input.Length - 2 : current) + 1;
        bool valid_quote = input[test_backward] == ',' || input[test_forward] == ',' || input[test_forward] == '\r';
        if (input[current] == '\"') // toggle state
        {
            inQuotes = !inQuotes;
        }
        bool atLastChar = (current == input.Length - 1);
        if (atLastChar)
        {
            wholeFileArray.Add(input.Substring(start));
        }
        else if (input[current] == ',' && !inQuotes)
        {
            wholeFileArray.Add(input.Substring(start, current - start));
            start = current + 1;
        }
    }
It takes a string and splits it on every comma (,) that is not inside a double-quoted string such as "something,foobar".
My problem is that a rogue " inside a field messes up the whole process.
Example: "bla bla","bla bla2",3,4,"5","bla"bla","End"
Result:
- "bla bla"
- "bla bla2"
- 3
- 4
- "5"
- "bla"bla","End"
How do I change my code to allow for the rogue "?
A 'valid' closing quote is always followed by a comma (,) or a carriage return / line feed (CRLF).
Added: this seems to fix it:
    // test each character before and after to determine if it is a valid quote, or a quote within a quote.
    int test_backward = (current == 0 ? 1 : current) - 1;
    int test_forward = (current == input.Length - 1 ? input.Length - 2 : current) + 1;
    bool valid_quote = input[test_backward] == ',' || input[test_forward] == ',' || input[test_forward] == '\r';
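
A minimal sketch of how a boundary check like this could be wired into the loop so that it actually controls the toggle. The names `CsvSplitter` and `Split` are hypothetical and not part of my original code. It assumes, as stated above, that a real closing quote is always followed by a comma or a line break (or is the last character), and it additionally assumes that a real opening quote is the first character of the input or directly follows a comma or a line break. That second assumption matters because at position 0 `input[test_backward]` is the quote itself, not a comma. Like the original loop, it splits on commas only and does not break the input into rows.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not a full CSV parser.
public static class CsvSplitter
{
    public static List<string> Split(string input)
    {
        var fields = new List<string>();
        int start = 0;
        bool inQuotes = false;

        for (int current = 0; current < input.Length; current++)
        {
            char c = input[current];

            if (c == '"')
            {
                bool atStart = current == 0;
                bool atEnd = current == input.Length - 1;

                // Opening quote (assumption): first character of the input,
                // or directly after a comma or a line break.
                bool opens = !inQuotes &&
                    (atStart
                     || input[current - 1] == ','
                     || input[current - 1] == '\n'
                     || input[current - 1] == '\r');

                // Closing quote: last character of the input,
                // or directly before a comma or a line break.
                bool closes = inQuotes &&
                    (atEnd
                     || input[current + 1] == ','
                     || input[current + 1] == '\r'
                     || input[current + 1] == '\n');

                if (opens || closes)
                {
                    inQuotes = !inQuotes;
                }
                // Any other quote is a rogue quote and stays literal text.
            }
            else if (c == ',' && !inQuotes)
            {
                fields.Add(input.Substring(start, current - start));
                start = current + 1;
            }
        }

        // Whatever follows the last separator is the final field.
        fields.Add(input.Substring(start));
        return fields;
    }
}
```

Calling `CsvSplitter.Split` on the example line should keep `"bla"bla"` together as a single field and return `"End"` as its own field. One case stays ambiguous under these assumptions: a rogue quote that happens to be immediately followed by a comma inside a field (for example `"bla",bla"`) still looks like a closing quote.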