I need to split text on a single delimiter character, a comma. But I only want a comma to act as a delimiter if it is not enclosed in quotation marks.

An example:

Method,value1,value2

Would contain three values: Method, value1 and value2

But:

Method,"value1,value2"

Would contain two values: Method and "value1,value2"

I'm not really sure how to go about this, because to split a string I would normally use:

String.Split(',');

But that would split on ALL commas. Is this possible without getting overly complicated and having to manually check every character of the string?

Thanks in advance

Zephni
  • Use an available CSV parser like [`VisualBasic.FileIO.TextFieldParser`](https://msdn.microsoft.com/en-us/library/microsoft.visualbasic.fileio.textfieldparser(v=vs.110).aspx) or [this](http://www.codeproject.com/Articles/9258/A-Fast-CSV-Reader) or [this](http://www.filehelpers.com/). – Tim Schmelter Jun 29 '15 at 14:58
  • [Potential dupe](http://stackoverflow.com/questions/4829779/splitting-a-csv-and-excluding-commas-within-elements), but highest upvoted/accepted answer is a link only answer... (to one of the articles @Tim just edited in :)) – James Thorpe Jun 29 '15 at 15:00
  • For completeness, would you mind giving an example of how to use the TextFieldParser in my situation, so I can accept it as the answer? Thanks for the quick responses :) – Zephni Jun 29 '15 at 15:03
  • possible duplicate of [Read csv file c# with comma separator](http://stackoverflow.com/questions/29678507/read-csv-file-c-sharp-with-comma-separator) – David Arno Jun 29 '15 at 15:04
  • possible duplicate of [Parse comma seperated string with a complication in C#](http://stackoverflow.com/questions/30078054/parse-comma-seperated-string-with-a-complication-in-c-sharp) - *The complication is quotes* – Alex K. Jun 29 '15 at 15:04
  • You can mask the character: write a method that replaces every ',' inside a pair of '"' with something else (e.g. `\u9999`), then split, and finally replace the placeholder in each piece with ','. – M.kazem Akhgary Jun 29 '15 at 15:09
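
The placeholder idea from the last comment could look roughly like the sketch below. The class name `PlaceholderSplitter` is made up for illustration, and the code assumes that quotation marks are balanced and that `\u9999` never occurs in the input. Unlike a real CSV parser, it keeps the quotation marks in the output.

```csharp
using System;
using System.Linq;
using System.Text;

public static class PlaceholderSplitter
{
    // Hypothetical placeholder; assumed never to occur in the input.
    private const char Placeholder = '\u9999';

    public static string[] Split(string line)
    {
        var masked = new StringBuilder(line.Length);
        bool inQuotes = false;

        foreach (char c in line)
        {
            // Every quotation mark toggles the "inside quotes" state.
            if (c == '"') inQuotes = !inQuotes;

            // Hide commas inside quotes so that Split(',') ignores them.
            masked.Append(inQuotes && c == ',' ? Placeholder : c);
        }

        // Split on the remaining commas, then restore the hidden ones.
        return masked.ToString()
                     .Split(',')
                     .Select(field => field.Replace(Placeholder, ','))
                     .ToArray();
    }
}
```

Calling `PlaceholderSplitter.Split("Method,\"value1,value2\"")` should give two pieces, `Method` and `"value1,value2"`. Note that this still inspects every character once, which the question hoped to avoid.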

2 Answers

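Copied from my comment: Use an available CSV parser like [`VisualBasic.FileIO.TextFieldParser`](https://msdn.microsoft.com/en-us/library/microsoft.visualbasic.fileio.textfieldparser(v=vs.110).aspx) or [this](http://www.codeproject.com/Articles/9258/A-Fast-CSV-Reader) or [this](http://www.filehelpers.com/).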

As requested, here is an example for the TextFieldParser:

// Requires a reference to Microsoft.VisualBasic.dll.
var allLineFields = new List<string[]>();
string sampleText = "Method,\"value1,value2\"";
var reader = new System.IO.StringReader(sampleText);
using (var parser = new Microsoft.VisualBasic.FileIO.TextFieldParser(reader))
{
    parser.Delimiters = new string[] { "," };
    parser.HasFieldsEnclosedInQuotes = true; // <--- !!! commas inside quotes are not delimiters
    string[] fields;
    // ReadFields returns null once all lines have been read.
    while ((fields = parser.ReadFields()) != null)
    {
        allLineFields.Add(fields);
    }
}

This list now contains a single string[] with two strings: Method and value1,value2 (the parser strips the enclosing quotation marks). I used a StringReader because this sample parses a string; if the source is a file, use a StreamReader (e.g. via File.OpenText).
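
Since the parser accepts any TextReader, the same logic can be wrapped in a method that serves both string and file input. The following is only a sketch: the names `DelimitedText` and `ReadRows` are made up for illustration, and it assumes the project references Microsoft.VisualBasic.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class DelimitedText
{
    // Hypothetical helper: yields one string[] per line read from the reader.
    public static IEnumerable<string[]> ReadRows(TextReader reader)
    {
        // Disposing the parser also closes the underlying reader.
        using (var parser = new TextFieldParser(reader))
        {
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                yield return parser.ReadFields();
            }
        }
    }
}
```

For a string you would pass `new StringReader(sampleText)`; for a file, `File.OpenText(path)`, where `path` stands for whatever file you want to read.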

Tim Schmelter
  • Don't suppose you have any insights into why a useful class like this is buried in the VisualBasic namespace? – James Thorpe Jun 29 '15 at 15:17
  • @JamesThorpe: You are asking the wrong person. That can only be answered by someone who was on the VB.NET/C# compiler team. I guess that such utility classes were important for VB.NET but not for the C# guys. However, since it is part of the .NET Framework, you can also use it from C# by adding a reference to Microsoft.VisualBasic.dll. – Tim Schmelter Jun 29 '15 at 15:19
  • Yeah fair enough - always just seems odd when I see things like this in there and not "part of" the wider framework. I can understand things to bring forward VB6 style functions etc, but not stuff like this. Anyway... offtopic for here. – James Thorpe Jun 29 '15 at 15:22
  • Thank you very much, this does exactly the job I was looking for and is easy to put in its own method. :) I also wondered why it is only available in the VisualBasic namespace – Zephni Jun 29 '15 at 15:25

You can try Regex.Split() to split the data up using the pattern

",|(\"[^\"]*\")" 

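This splits on commas and on quoted runs. Because the quoted run is a capturing group, Regex.Split includes the captured text in its result; the empty strings left between adjacent matches are filtered out afterwards.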

Code Sample:

using System;
using System.Linq;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        string data = "Method,\"value1,value2\",Method2";
        string[] pieces = Regex.Split(data, ",|(\"[^\"]*\")").Where(exp => !String.IsNullOrEmpty(exp)).ToArray();

        foreach (string piece in pieces)
        {
            Console.WriteLine(piece);
        }
    }
}

Results:

Method
"value1,value2"
Method2
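
One side effect of filtering out empty strings is that genuinely empty fields disappear too: `a,,b` yields only two pieces. If that matters, a different pattern may help. The sketch below uses a common lookahead idiom that is not part of the answer above: it splits only on commas followed by an even number of quotation marks. The class name `QuoteAwareSplitter` is made up for illustration, and the pattern assumes the quotation marks in the line are balanced.

```csharp
using System.Text.RegularExpressions;

public static class QuoteAwareSplitter
{
    // Matches a comma only when an even number of quotation marks follows it,
    // i.e. when the comma sits outside any quoted section.
    private static readonly Regex CommaOutsideQuotes =
        new Regex(@",(?=(?:[^""]*""[^""]*"")*[^""]*$)");

    public static string[] Split(string line)
    {
        // No capturing groups are used, so only the fields are returned.
        return CommaOutsideQuotes.Split(line);
    }
}
```

With this pattern no filtering step is needed, so `QuoteAwareSplitter.Split("a,,b")` keeps the empty middle field. The lookahead rescans the rest of the line for every comma, so it is best suited to short lines.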

Shar1er80