Here is my code, which splits a string on a set of delimiter characters but does not handle the issue in my title:

        char[] delimitedChars = { ',', '\n', '"' };
        words = stringamabob.Split(delimitedChars);

I want to keep all of this behavior, EXCEPT that the comma should not act as a delimiter when it is between quotation marks.

For example, if I had:

stringamabob = one, two, three, "four, five", six

I would get:

words [0] = one

words [1] = two

words [2] = three

words [3] = four

words [4] = five

words [5] = six

Whereas I want to get:

words [0] = one

words [1] = two

words [2] = three

words [3] = four, five

words [4] = six

user2340818
  • You may have to do two separate splits. First split by quotation marks, put those elements in one array, and take them out of the original one. Then split the rest by commas into a second array. Then combine the two arrays into your result. – Kat May 28 '14 at 15:35
  • Not a bad idea, but everything has to stay in order. – user2340818 May 28 '14 at 15:42
  • This is a duplicate of every question about parsing CSV's: http://stackoverflow.com/questions/8112024/splitting-text-based-on-comma – arserbin3 May 28 '14 at 15:49
  • Also the question linked to in the linked question above: http://stackoverflow.com/q/769621/395718 – Dialecticus May 28 '14 at 15:54
  • .NET has a [TextFieldParser](http://msdn.microsoft.com/en-us/library/microsoft.visualbasic.fileio.textfieldparser.aspx) built in. You might want to look into this class. – Icemanind May 28 '14 at 16:17
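
The TextFieldParser suggestion in the last comment could look roughly like the sketch below. It assumes the project references the Microsoft.VisualBasic assembly and that each input is a single comma-separated line; the names `CsvLineSplitter` and `SplitLine` are made up for illustration.

```csharp
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvLineSplitter
{
    // Hypothetical helper: parses one comma-separated line, treating
    // commas inside double quotes as part of the field.
    public static string[] SplitLine(string line)
    {
        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true; // quoted commas stay in the field
            parser.TrimWhiteSpace = true;            // drop the spaces after each comma

            // ReadFields returns null when there is nothing left to read.
            return parser.ReadFields() ?? new string[0];
        }
    }
}
```

Because the fields are read left to right, they come back in their original order, which addresses the ordering concern raised above.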

2 Answers

Try this. It won't work if you have quotes nested inside each other (which is rare), but it should work in all other cases.

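// Requires: using System.Collections.Generic;
string[] quotesplit = stringamabob.Split('"'); // Split by quotes: even-indexed chunks are outside quotes, odd-indexed ones inside.
char[] delimitedChars = { ',', '\n' }; // Quotes are not delimiters here because we've already split by them.
List<string> words = new List<string>();
bool outsideQuotes = true; // The first chunk is always outside quotes (it is empty if the string starts with a quote).
foreach (string chunk in quotesplit)
{
    if (outsideQuotes)
    {
        // Split unquoted text on the delimiters. Trimming and skipping empty pieces
        // removes the leftovers next to each quote; note it also drops genuinely empty fields.
        foreach (string piece in chunk.Split(delimitedChars))
        {
            string word = piece.Trim();
            if (word.Length > 0) words.Add(word);
        }
    }
    else
    {
        words.Add(chunk); // Quoted text is kept whole, commas included.
    }
    outsideQuotes = !outsideQuotes;
}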
Nick Udell

A regular expression like this seems to work:

"(.*)"|(\S*),|(\S*)$

As this Rubular demo shows.

Each match will end up in group 1 (a quoted field), group 2 (a field followed by a comma), or group 3 (the field at the end of the line).
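
One thing to watch with the pattern above: `.*` is greedy, so on a line with two quoted fields the first group can run from the first quote to the last one. Below is a minimal sketch of a stricter variant in C#, assuming fields never contain escaped quotes. The pattern and the `QuotedSplitter` class are illustrative and not part of the original answer.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QuotedSplitter
{
    // Group 1: text between a pair of quotes (no quotes allowed inside).
    // Group 2: a run of characters containing no comma, quote, or newline.
    private static readonly Regex FieldPattern =
        new Regex("\"([^\"]*)\"|([^,\"\\n]+)");

    public static List<string> Split(string input)
    {
        var words = new List<string>();
        foreach (Match m in FieldPattern.Matches(input))
        {
            if (m.Groups[1].Success)
            {
                // Quoted field: keep it whole, commas included.
                words.Add(m.Groups[1].Value);
            }
            else
            {
                // Unquoted field: trim it and skip whitespace-only runs,
                // such as the space between a comma and an opening quote.
                string word = m.Groups[2].Value.Trim();
                if (word.Length > 0) words.Add(word);
            }
        }
        return words;
    }
}
```

Matches are returned in the order they appear in the input, so the resulting list preserves the original field order. Like the split-based answer, this sketch drops empty fields.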

Justin Pihony