
I'm trying to split a string in C#. The string looks like this:

string line = "red,\"\",blue,\"green\",\"blue,orange\",,\"black\",yellow";

The result should be:

string[] result = { "red", "", "blue", "green", "blue,orange", "", "black", "yellow" };

Note that the delimiter is "," but it is ignored inside double quotes. Also note that not every substring between delimiters is surrounded by quotes. If possible, I would like an answer where the delimiter can be a string. I don't mind if the double quotes are included in the elements of the result array, like:

string[] result = { "red", "\"\"", "blue", "\"green\"", "\"blue,orange\"", "", "\"black\"", "yellow" };
Soner Gönül
  • Some kind of regular expression might be able to do this; otherwise, consider using a simple state-machine parser. – Dai Apr 25 '14 at 07:13
  • possible duplicate of [split a comma-separated string with both quoted and unquoted strings](http://stackoverflow.com/questions/3776458/split-a-comma-separated-string-with-both-quoted-and-unquoted-strings) – lc. Apr 25 '14 at 07:18
  • It's the regular expression I'm looking for, because I'd like to avoid creating my own parser. – Kostas Robotis Apr 25 '14 at 07:20

1 Answer


This is a two-state machine that reads the string one character at a time. In the normal state it appends each character to the current value; a comma ends the value and adds it to the list of strings to return. A double quote switches to the quoted state, in which every character, commas included, is treated as part of the value until the next double quote. The quotes themselves are not copied into the result:

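using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// The two parser states (declare at class scope).
enum State {
    InQuotes, // between an opening and a closing double quote
    InValue   // outside quotes; a comma ends the current value
}

// Inside a method, with `line` holding the input string:
List<String> result = new List<String>();

using( TextReader rdr = new StringReader( line ) ) {

    State state = State.InValue;
    StringBuilder sb = new StringBuilder();

    Int32 nc; Char c;
    while( (nc = rdr.Read()) != -1 ) { // Read() returns -1 at end of input
        c = (Char)nc;

        switch( state ) {

            case State.InValue:

                if( c == '"' ) {
                    state = State.InQuotes; // opening quote: not copied to the value
                } else if( c == ',' ) {
                    result.Add( sb.ToString() ); // delimiter: finish the current value
                    sb.Length = 0;
                } else {
                    sb.Append( c );
                }
                break;
            case State.InQuotes:

                if( c == '"' ) {
                    state = State.InValue; // closing quote
                } else {
                    sb.Append( c ); // commas are kept verbatim here
                }
                break;
        } // switch
    } // while
    // Always add the last value, so a trailing empty field (as in "a,") is not dropped.
    result.Add( sb.ToString() );
} // using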
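
The question also asks for a delimiter that is a string rather than a single character. The same two-state idea might be extended along the following lines. This is only a sketch: the class name QuotedSplitter and the method Split are hypothetical names chosen for illustration, and it assumes the delimiter is non-empty and contains no double quote. As above, the quotes are dropped from the output, and doubled quotes inside a quoted value are not treated as an escape.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of the original answer.
public static class QuotedSplitter {

    // Splits `line` on `delimiter`, ignoring delimiters that appear
    // between double quotes. Assumes `delimiter` contains no '"'.
    public static List<String> Split( String line, String delimiter ) {
        if( String.IsNullOrEmpty( delimiter ) ) {
            throw new ArgumentException( "Delimiter must not be empty.", nameof( delimiter ) );
        }

        List<String> result = new List<String>();
        StringBuilder sb = new StringBuilder();
        Boolean inQuotes = false; // the two states, folded into a flag

        Int32 i = 0;
        while( i < line.Length ) {
            Char c = line[i];

            if( c == '"' ) {
                // A quote toggles the state and is not copied to the value.
                inQuotes = !inQuotes;
                i++;
            } else if( !inQuotes &&
                       String.CompareOrdinal( line, i, delimiter, 0, delimiter.Length ) == 0 ) {
                // The whole delimiter starts at position i: finish the value and skip it.
                result.Add( sb.ToString() );
                sb.Length = 0;
                i += delimiter.Length;
            } else {
                sb.Append( c );
                i++;
            }
        }

        // Add the final value, even when it is empty.
        result.Add( sb.ToString() );
        return result;
    }
}
```

Calling QuotedSplitter.Split( line, "," ) on the string from the question should give the eight elements of the first expected array, and a call such as QuotedSplitter.Split( "a::\"b::c\"::", "::" ) should give "a", "b::c" and an empty string.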
Dai