I need to split text on a comma separator while treating quoted strings (the "string identifier") as single fields.

input "dtl.txt"

AWD_CODE,AWD_NAME,AWD_TYPE,ADF_REF,FLG_SUM,FLG
DMM,PETCH,01,REF 2/2015,,
TRR,TUCTH,01,REF 2/2015,WD_TRK,F
TGC,DHYTH,02,REF 3/2015,"WD_TRK,WD_TRI",F

operation

static void Main(string[] args)
{
    string[] lines = System.IO.File.ReadAllLines(@"D://dtl.txt");

    List<string[]> param = new List<string[]>();

    foreach (string line in lines)
    {
        param.Add(line.Split(','));
    }

    var x = param; // for debug
}

output (actual)

array : 
[0] : "AWD_CODE","AWD_NAME","AWD_TYPE","ADF_REF","FLG_SUM","FLG"
[1] : "DMM","PETCH","01","REF 2/2015","",""
[2] : "TRR","TUCTH","01","REF 2/2015","WD_TRK","F"
[3] : "TGC","DHYTH","02","REF 3/2015","\"WD_TRK","WD_TRI\"","F"

output (expected)

array : 
[0] : "AWD_CODE","AWD_NAME","AWD_TYPE","ADF_REF","FLG_SUM","FLG"
[1] : "DMM","PETCH","01","REF 2/2015","",""
[2] : "TRR","TUCTH","01","REF 2/2015","WD_TRK","F"
[3] : "TGC","DHYTH","02","REF 3/2015","WD_TRK,WD_TRI","F"

The code also splits the quoted field "WD_TRK,WD_TRI" at its comma.

I don't want that. Can anyone help me solve this problem?

Ian

2 Answers

This is a situation where TextFieldParser, from the Microsoft.VisualBasic.FileIO namespace, is the best fit.

using System.Collections.Generic; // List<T>
using System.IO; // StringReader
using Microsoft.VisualBasic.FileIO; //add this

static void Main(string[] args)
{
    string text = System.IO.File.ReadAllText(@"D://dtl.txt"); //note this

    List<string[]> param = new List<string[]>();
    string[] words; //add intermediary reference

    using (TextFieldParser parser = new TextFieldParser(new StringReader(text))) {
        parser.Delimiters = new string[] { "," }; //the parameter must be comma
        parser.HasFieldsEnclosedInQuotes = true; //keep quoted fields such as "WD_TRK,WD_TRI" together
        while ((words = parser.ReadFields()) != null)
            param.Add(words);
    }

    var x = param; // for debug
}

You should then get what you need. Read this.

Output:

array : 
[0] : "AWD_CODE","AWD_NAME","AWD_TYPE","ADF_REF","FLG_SUM","FLG"
[1] : "DMM","PETCH","01","REF 2/2015","",""
[2] : "TRR","TUCTH","01","REF 2/2015","WD_TRK","F"
[3] : "TGC","DHYTH","02","REF 3/2015","WD_TRK,WD_TRI","F"

To use it, you need to add a reference to the Microsoft.VisualBasic assembly in your project.
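
For reference, here is a minimal self-contained sketch of the same idea. The class name CsvSketch and the helper ParseCsv are hypothetical names made up for this illustration, and the sample rows are inlined so the snippet does not depend on D://dtl.txt. It assumes the Microsoft.VisualBasic assembly is referenced as described above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvSketch
{
    // Hypothetical helper: reads comma-delimited rows from any TextReader.
    public static List<string[]> ParseCsv(TextReader reader)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            // Treat "WD_TRK,WD_TRI" as one field and drop the quotes.
            parser.HasFieldsEnclosedInQuotes = true;
            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }

    public static void Main()
    {
        // Sample rows copied from dtl.txt in the question.
        string sample =
            "AWD_CODE,AWD_NAME,AWD_TYPE,ADF_REF,FLG_SUM,FLG\n" +
            "DMM,PETCH,01,REF 2/2015,,\n" +
            "TGC,DHYTH,02,REF 3/2015,\"WD_TRK,WD_TRI\",F\n";

        foreach (string[] row in ParseCsv(new StringReader(sample)))
        {
            Console.WriteLine(string.Join(" | ", row));
        }
    }
}
```

TextFieldParser also has a constructor that takes a file path, so passing the path directly instead of a StringReader should work for the original file as well.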

Ian
  • No problem. Glad that I can be of any help. ;) I got stuck here recently too, though the problem is slightly different, and so the answer to my problem was `Regex`. But I noticed that the `TextFieldParser` would suit you here. This was my problem: http://stackoverflow.com/questions/34607051/parse-string-with-whitespace-and-quotation-mark-with-quotation-mark-retained – Ian Jan 07 '16 at 04:13
  • I'm not sure about regex (I'm a noob at it), and I find it difficult to read. In my opinion, I'd rather trust a Microsoft library. :) –  Jan 07 '16 at 04:19
  • Oh yes. All I am saying is `Regex` fits best for *my* problem. For *yours*, I believe this tool suits best. ;) – Ian Jan 07 '16 at 04:21

Unless you use a specialized CSV library (highly recommended in this case), you will need to write a regular expression. For a similar question, see "C#, regular expressions: how to parse comma-separated values, where some values might be quoted strings themselves containing commas". The regular expression given there was

"[^"\r\n]*"|'[^'\r\n]*'|[^,\r\n]*

with this code to execute it:

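using System.Text.RegularExpressions; // Regex, Match

// input is the text to parse, e.g. one line of the file
Regex regexObj = new Regex(@"""[^""\r\n]*""|'[^'\r\n]*'|[^,\r\n]*");
Match matchResults = regexObj.Match(input);
while (matchResults.Success) 
{
    // Value keeps the quotes around a quoted field; the last
    // alternative can also produce empty matches between fields
    Console.WriteLine(matchResults.Value);
    matchResults = matchResults.NextMatch();
}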
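
If the goal is to get one string[] per line, as in the question, a variant of the pattern that uses a lookbehind and capture groups may be easier to work with. The following is only a sketch: the class RegexCsvSketch, the method SplitLine and the pattern itself are assumptions made for this illustration, not the pattern from the linked question. It does not handle escaped quotes ("") or fields that span several lines.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexCsvSketch
{
    // A field starts at the beginning of the line or right after a comma.
    // Group 1 captures a quoted field without its quotes,
    // group 2 captures an unquoted (possibly empty) field.
    private static readonly Regex FieldPattern =
        new Regex(@"(?<=^|,)(?:""([^""]*)""|([^,]*))");

    // Hypothetical helper: splits one CSV line into its fields.
    public static string[] SplitLine(string line)
    {
        var fields = new List<string>();
        foreach (Match m in FieldPattern.Matches(line))
        {
            fields.Add(m.Groups[1].Success ? m.Groups[1].Value : m.Groups[2].Value);
        }
        return fields.ToArray();
    }

    public static void Main()
    {
        // Last data row of dtl.txt from the question.
        string line = "TGC,DHYTH,02,REF 3/2015,\"WD_TRK,WD_TRI\",F";
        foreach (string field in SplitLine(line))
        {
            Console.WriteLine(field);
        }
    }
}
```

Calling SplitLine once per line returned by File.ReadAllLines would fill the List<string[]> from the question. For anything beyond simple files, a dedicated CSV parser remains the safer choice.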
jnwood