0

I have a CSV file as below:

ID,Name,Address,PhoneNumber
101,Jack,"No 13, HillTop, London",012346789
102,Harry,"No 15, Baker Street London",012346789

I need to read all the columns (comma-separated). When I use the Split function, it splits the address as well. I want to split on every comma except those inside the address, which is enclosed in double quotes.

user2739418
  • 7
    I suggest you use a CSV parsing library - I'd expect any decent library to handle this, and there's not a lot of point in you reinventing the wheel. There are plenty of CSV parsers on Nuget. – Jon Skeet Feb 08 '16 at 16:29
  • 9
    That is why String.Split is ill-advised for parsing CSVs. European decimal commas are another reason. Use one of the many CSV tools around. VB's TextFieldParser is built in – Ňɏssa Pøngjǣrdenlarp Feb 08 '16 at 16:30
  • 1
    There is a CSV parser called TextFieldParser in the VisualBasic namespace that you can just import and use. https://msdn.microsoft.com/en-us/library/microsoft.visualbasic.fileio.textfieldparser(v=vs.110).aspx – Jeremy Feb 08 '16 at 16:30
  • 1
    Have a look at some of the suggested libraries here: http://stackoverflow.com/questions/2081418/parsing-csv-files-in-c-sharp – Mark Feb 08 '16 at 16:31
  • 1
    Note that the `TextFieldParser` class can be used for **both** C# and VB.NET, even though it resides in the `Microsoft.VisualBasic.FileIO` namespace. – Tim Feb 08 '16 at 16:51
  • 1
    As others have suggested: don't try to do this yourself, use a CSV parser. I've had good results recently with [CsvHelper](https://www.nuget.org/packages/CsvHelper/). – Richard Ev Feb 08 '16 at 17:08

2 Answers

3

Even though you are using C#, there is a very useful class called TextFieldParser in the Microsoft.VisualBasic.FileIO namespace. You will need to add a reference to the Microsoft.VisualBasic assembly to your project in addition to the using directive:

using Microsoft.VisualBasic.FileIO;

Then you can implement something similar to the following:

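private void Parse()
{
    // TextFieldParser understands quoted fields, so commas inside quotes are kept.
    using (TextFieldParser parser = new TextFieldParser("file.csv")
    {
        HasFieldsEnclosedInQuotes = true,
        Delimiters = new string[] { "," }
    })
    {
        // Each ReadFields() call returns one row; stop once the data is exhausted.
        while (!parser.EndOfData)
        {
            string[] fields = parser.ReadFields();
            PrintResults(fields);
        }
    }
}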

private void PrintResults(string[] fields)
{
    if (fields != null)
    {
        foreach (var field in fields)
        {
            Console.Write(string.Concat("[", field, "] "));
        }
        Console.WriteLine();
    }
}

For your case, the HasFieldsEnclosedInQuotes property of the TextFieldParser must be true to achieve the desired behavior, which is why it is set explicitly above.

I placed your CSV sample data into a local file named "file.csv" and ran the code above as a test. The input data was:

ID,Name,Address,PhoneNumber
101,Jack,"No 13, HillTop, London",012346789
102,Harry,"No 15, Baker Street London",012346789

And the resultant output in the console from calling the above Parse() method is:

[ID] [Name] [Address] [PhoneNumber] 
[101] [Jack] [No 13, HillTop, London] [012346789] 
[102] [Harry] [No 15, Baker Street London] [012346789] 
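
If the columns need to be looked up by name rather than just printed, the same parser can be wrapped along the following lines. This is only a minimal sketch: the class name CsvRows and the method ReadRows are hypothetical, it assumes the first row is a header, and it reads from a TextReader so that it can be tried on an in-memory string as well as on a file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvRows
{
    // Hypothetical helper: parses CSV text whose first row is a header and
    // returns one dictionary per data row, keyed by column name.
    public static List<Dictionary<string, string>> ReadRows(TextReader reader)
    {
        var rows = new List<Dictionary<string, string>>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;

            if (parser.EndOfData)
            {
                return rows; // empty input, nothing to read
            }
            string[] header = parser.ReadFields();

            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();
                var row = new Dictionary<string, string>();
                // Assumption: extra fields beyond the header are ignored,
                // and missing trailing fields are simply left out.
                for (int i = 0; i < header.Length && i < fields.Length; i++)
                {
                    row[header[i]] = fields[i];
                }
                rows.Add(row);
            }
        }
        return rows;
    }
}
```

For the sample data, CsvRows.ReadRows(new StringReader(csvText)) would give rows[0]["Address"] as the full address with its commas intact; for a file on disk, pass a StreamReader instead.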
Lemonseed
  • @user2739418 Did this solution help and/or solve your problem? – Lemonseed Feb 08 '16 at 19:30
  • Thanks Dave. This was exactly what I was looking for, and that too without a third-party (DLL) dependency. My manager does not like to use open-source DLLs unless there is no other option. I was planning to find the double quotes, replace the commas inside them with #, split, and then replace them back. :) Cheers – user2739418 Feb 09 '16 at 09:19
  • Awesome! Glad I could help. – Lemonseed Feb 09 '16 at 09:51
0

An alternative to the suggested parsing libraries could be a regular expression.

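The expression needed to parse this string, allowing for the "xx,xx" case, is (?<=^|,)("[^"]*"|[^,]*). Each field must start at the beginning of the line or right after a comma, and a quoted field runs only to its own closing quote, so commas inside the quotes are kept.

Sample code using it in C#:

//requires: using System.Linq; using System.Text.RegularExpressions;

//preparation
var pattern = @"(?<=^|,)(""[^""]*""|[^,]*)";
var regex = new Regex(pattern);

//for each file line
var text = @"101,Jack,""No 13, HillTop, London"",0123456789";
//one value per field; quoted fields keep their surrounding quotes
var matches = regex.Matches(text).Cast<Match>().Select(m => m.Groups[1].Value);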
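
To make the regex approach reusable, it could be wrapped in a small helper that also strips the surrounding quotes. The sketch below is an illustration only: the names RegexCsv and SplitCsvLine are hypothetical, and the pattern assumes that quoted fields never contain escaped quotes ("") or line breaks, which a real CSV parser would handle.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexCsv
{
    // A field starts at the line start or right after a comma, and is either
    // a quoted run without inner quotes or a run of non-comma characters.
    private static readonly Regex FieldPattern =
        new Regex(@"(?<=^|,)(""[^""]*""|[^,]*)");

    // Hypothetical helper: splits one CSV line and strips surrounding quotes.
    public static string[] SplitCsvLine(string line)
    {
        return FieldPattern.Matches(line)
            .Cast<Match>()
            .Select(m => Unquote(m.Groups[1].Value))
            .ToArray();
    }

    // Removes one pair of enclosing double quotes, if present.
    private static string Unquote(string field)
    {
        bool quoted = field.Length >= 2
            && field[0] == '"'
            && field[field.Length - 1] == '"';
        return quoted ? field.Substring(1, field.Length - 2) : field;
    }
}
```

Calling RegexCsv.SplitCsvLine on the sample line above should yield four fields, with the address kept as a single value.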
tede24
  • [and now you have two problems](http://blog.codinghorror.com/regular-expressions-now-you-have-two-problems/) ;) – Jeremy Feb 08 '16 at 17:38
  • @Jeremy, "Regular expressions are like a particularly spicy hot sauce – to be used in moderation and with restraint only when appropriate". I would prefer to use an ultra-simple regex here rather than a VB library – tede24 Feb 08 '16 at 21:23
  • @tede24 - If you're referring to `TextFieldParser` as a VB library, I think that's a little misleading, as it's part of the .NET framework and not exclusive to VB, despite the namespace. – Tim Feb 08 '16 at 22:03
  • @Tim It may be misleading, I don't know, but it may also be misleading to cite an article about avoiding regex for processing complex structures such as HTML in this very simple case – tede24 Feb 08 '16 at 22:50