The record I am trying to split is formatted like this:

987,"This is second field, it contains multiple commas, but is enclosed in quotes, 88123, 99123", 1221, lastfield

I am using the code:

char[] delimiters = new char[] {',' };
string[] parts = line.Split(delimiters, StringSplitOptions.None);

The split works, but it does not treat the quote-delimited field as a single field. I need 4 fields, but I get a separate field for every comma. How can I change the code to get the result I need?

JBelter
See http://stackoverflow.com/questions/4829779/splitting-a-csv-and-excluding-commas-within-elements and http://stackoverflow.com/questions/5567691/handling-commas-within-quotes-when-exporting-a-csv-file-c4-any-suggestions?rq=1 (better). – David Sherret Jul 18 '13 at 18:45

2 Answers

string.Split() is not powerful enough for this: it splits on every comma and has no notion of quoted fields. One option is to use regular expressions (in C#, the Regex class).
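As a rough illustration of the regular-expression route, the sketch below splits only on commas that are followed by an even number of double quotes, which means the comma sits outside any quoted field. The class name RegexCsvSplitter and the method SplitLine are made up for this example, and the pattern is one common choice rather than the only one. It assumes quotes are balanced and does not unescape doubled quotes ("") inside a field.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, named for this example only.
public static class RegexCsvSplitter
{
    // Matches a comma only when an even number of double quotes follows it
    // up to the end of the line, i.e. the comma is outside a quoted field.
    private static readonly Regex CommaOutsideQuotes =
        new Regex(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");

    public static string[] SplitLine(string line)
    {
        return CommaOutsideQuotes
            .Split(line)
            // Drop surrounding whitespace, then the enclosing quotes.
            .Select(field => field.Trim().Trim('"'))
            .ToArray();
    }
}
```

Calling RegexCsvSplitter.SplitLine on the sample record from the question should give four fields, with the quoted text kept together as the second one. The lookahead rescans the rest of the line for every comma, so this is best kept to short records.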

Tevya

@Tevya is correct that String.Split is not going to work for you, since it does not handle quoted fields. There are numerous ways to skin this proverbial cat, including regular expressions or one of the many CSV parsers that can be found with a Google search.
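If you would rather avoid both regular expressions and an external library, one more way is to walk the line character by character and track whether you are inside quotes. The following is only a minimal sketch: QuoteAwareSplitter and SplitLine are names invented for this example, it treats a doubled quote ("") inside a quoted field as a literal quote, and it trims whitespace around every field (including whitespace inside the quotes), which may not be what you want.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, named for this example only.
public static class QuoteAwareSplitter
{
    public static List<string> SplitLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (c == '"')
            {
                if (inQuotes && i + 1 < line.Length && line[i + 1] == '"')
                {
                    // Doubled quote inside a quoted field: keep one quote.
                    current.Append('"');
                    i++;
                }
                else
                {
                    // Opening or closing quote: toggle state, do not keep it.
                    inQuotes = !inQuotes;
                }
            }
            else if (c == ',' && !inQuotes)
            {
                // Comma outside quotes ends the current field.
                fields.Add(current.ToString().Trim());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        // The last field has no trailing comma.
        fields.Add(current.ToString().Trim());
        return fields;
    }
}
```

This handles a single line only; quoted fields that contain line breaks would need a real CSV parser.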

Another simple approach is to use Visual Basic's TextFieldParser class. Add a reference to Microsoft.VisualBasic.dll in your project and put 'using Microsoft.VisualBasic.FileIO;' at the top of your file. Then you can do something like this:

    // Requires a reference to Microsoft.VisualBasic.dll and these usings:
    // using System.Collections.Generic;
    // using System.IO;
    // using System.Linq;
    // using Microsoft.VisualBasic.FileIO;
    private List<string> ParseFields(string lineToParse)
    {
        // Fields of the parsed line (stays empty if the line has no data)
        List<string> result = new List<string>();

        // Wrap the line in a StringReader so no byte encoding is involved
        // (Encoding.ASCII would replace any non-ASCII character with '?')
        using (TextFieldParser parser = new TextFieldParser(new StringReader(lineToParse)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.Delimiters = new string[] { "," };
            parser.HasFieldsEnclosedInQuotes = true;

            // A single line yields at most one record, so read it once
            if (!parser.EndOfData)
            {
                result = parser.ReadFields().ToList();
            }
        }

        return result;
    }

The results will be:

987

This is second field, it contains multiple commas, but is enclosed in quotes, 88123, 99123

1221

lastfield
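If the input holds several records rather than a single line, the same parser can collect them all. This is a sketch under the assumption that the whole text is already in memory as a string; CsvRecords and ParseAll are hypothetical names, not part of any library.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

// Hypothetical helper, named for this example only.
public static class CsvRecords
{
    public static List<string[]> ParseAll(string text)
    {
        var records = new List<string[]>();

        using (var parser = new TextFieldParser(new StringReader(text)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;

            // Each ReadFields call consumes one record and returns its fields.
            while (!parser.EndOfData)
            {
                records.Add(parser.ReadFields());
            }
        }

        return records;
    }
}
```

Each element of the returned list is one record's field array, so records[0][1] would be the quoted second field of the first line.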

dtesenair