I have a question about splitting a string and putting the pieces into a DataTable. I have a text file like the one below:

blabla | blabla2 | element1 | AB:CD| blabla3
blabla | blabla2 | element2 | ABC | blabla3
blabla | blabla2 | element3 | 123 | blabla3

The desired result looks like this:

[image: desired result table]

Here is my code:

using (StreamReader file = new StreamReader(files))
{
    string line;
    while (((line = file.ReadLine()) != null))
    {
        char[] delimiter = {'|'};
        string pattern = @"\w*|'?[0-9a-zA-Z-:._]+'?";
        Regex r = new Regex(pattern, RegexOptions.Multiline);
        MatchCollection m = r.Matches(line);

        foreach (Match match in m)
        {
            DataRow row = dt.NewRow();
            string[] split = match.Groups[0].ToString().Split(delimiter);

            row[split[0]] = split[1]; // gives me an index-out-of-range exception
            row[split[0]] = match.NextMatch().Value; // gives me an empty value
            dt.Rows.Add(row);
        }
    }
}

dgVResult.DataSource = dt;

But the result from my code is not as expected; it looks like this:

[image: actual result table]

Abbas
bryan tetris
  • Please just [Split](https://msdn.microsoft.com/fr-fr/library/b873y76a(v=vs.110).aspx). – Drag and Drop Apr 20 '18 at 11:44
  • That's a CSV with `|` as the delimiter. `line.Split('|')` would work, although it would create a lot of temporary strings. You could delete *all* of this code if you used a library like CsvHelper to read the file with `|` as the separator. – Panagiotis Kanavos Apr 20 '18 at 11:48
  • Related: https://stackoverflow.com/questions/1050112/how-to-read-a-csv-file-into-a-net-datatable (does anyone have a better dupe target?) – Drag and Drop Apr 20 '18 at 11:51
  • The CSV file contains one line for each column in the datatable? Is this correct? – Peter Abolins Apr 20 '18 at 11:57
  • Only one line as result? – Drag and Drop Apr 20 '18 at 12:04
  • @PeterAbolins yes, I add the columns manually, like `dt.Columns.Add("element1")`, etc. – bryan tetris Apr 20 '18 at 12:04
  • Is 'blabla2' an index of the line? I mean, is 'blabla2' the first line of the result set, 'blabla3' the second line, etc.? – Drag and Drop Apr 20 '18 at 12:05
  • @DragandDrop yes – bryan tetris Apr 20 '18 at 12:05
  • @bryantetris That wasn't my question. You have a CSV file with rows in it, where each row corresponds to one column in your result? If so, then the resulting datatable will have exactly one row. – Peter Abolins Apr 20 '18 at 12:06
  • No, each row does not correspond to one column. The header never changes (element1, element2, element3), and I want to add the next value into the cell under the corresponding column. – bryan tetris Apr 20 '18 at 12:25
  • @DragandDrop yes only one line – bryan tetris Apr 20 '18 at 12:27
  • Well, here is where I was when you answered my last comment: [code](https://dotnetfiddle.net/eD2FwJ). Can you tell me whether the input is like that or not? Because the more you answer, the more unclear it becomes. – Drag and Drop Apr 20 '18 at 12:32
  • Do not forget to [edit] that information into your question. Comments are volatile by nature. – Drag and Drop Apr 20 '18 at 12:33
  • The last step is: for each x in lines, for each col in columns , add to data set. – Drag and Drop Apr 20 '18 at 12:35
  • How do you know when to start a new Row? Will the file always contain three lines of `element1`, `element2`,`element3` followed by the next 3 lines? How many lines are in the file? How many `element`s are there? Show at least 10 lines of sample data and result. – NetMage Apr 20 '18 at 18:00
  • @NetMage yes, the file will always contain three lines of element1, element2, element3, followed by the next three lines. I think I have to stay with the regex? I do not know; shouldn't the values be stored in a list? – bryan tetris Apr 20 '18 at 18:47
  • Your example has spaces around the separators - is that true? Your code doesn't reference `element1`, etc. at all - why is that? – NetMage Apr 20 '18 at 18:48
  • @NetMage Yes, there are spaces; I have edited the code. – bryan tetris Apr 20 '18 at 19:03
  • Do you expect the spaces to be included in the `DataTable` column names and/or the values? They aren't consistently around the delimiter. – NetMage Apr 20 '18 at 19:48
  • @NetMage no, spaces do not matter. I only want "element1", "element2", "element3" in the header, followed by the first values. I wrote code to remove the spaces before processing. – bryan tetris Apr 20 '18 at 19:54

2 Answers

string[] names = Enum.GetNames(typeof(allenum)); // I have an enum with "element1", "element2" and "element3"
DataTable dt = new DataTable();
string[] lines = File.ReadAllLines(files); // read the file once instead of once per name
foreach (string name in names) // "enum" is a reserved word, so it cannot be used as a variable name
{
    foreach (var line in lines)
    {
        if (!line.Contains(name)) continue; // skip lines that do not mention this element
        string[] items = line.Split(new char[] {'|'}, StringSplitOptions.RemoveEmptyEntries);
        DataRow row = dt.NewRow(); // a new row per matching line
        row[items[2]] = items[3];
        dt.Rows.Add(row);
    }
}

finally,

dgVResult.DataSource = dt;

I abandoned the regex approach, but now the result is:

[image: result table]

It works but I want only one row.

NetMage
bryan tetris

Since you have 3 elements, group the lines in threes using a custom extension method:

public static IEnumerable<IEnumerable<T>> GroupBy<T>(this IEnumerable<T> src, int chunkSize) {
    IEnumerable<T> NextChunk(IEnumerator<T> e, int chunkLeft) {
        do {
            yield return e.Current;
            --chunkLeft;
        } while (chunkLeft > 0 && e.MoveNext());
    }

    using (var srce = src.GetEnumerator()) {
        while (srce.MoveNext())
            yield return NextChunk(srce, chunkSize);
    }
}

To avoid the overhead of Split when you only need a few values, I wrote an extension method that extracts `count` fields from the string, starting at the zero-based field `pos`:

public static string[] SplitExtract(this string src, string delim, int pos, int count = 1) {
    var ans = new List<string>();

    var startCharPos = 0;
    for (; pos > 0; --pos) {
        var tmpPos = src.IndexOf(delim, startCharPos); // search from the field start so empty fields are not skipped
        if (tmpPos >= 0)
            startCharPos = tmpPos + delim.Length;
        else {
            startCharPos = src.Length;
            break;
        }
    }

    for (; count > 0 && startCharPos < src.Length; --count) {
        var nextDelimPos = src.IndexOf(delim, startCharPos);
        if (nextDelimPos < 0)
            nextDelimPos = src.Length;

        ans.Add(src.Substring(startCharPos, nextDelimPos - startCharPos));
        startCharPos = nextDelimPos + delim.Length;
    }
    return ans.ToArray();
}
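
For comparison, here is a minimal sketch of what the call `SplitExtract("|", 2, 2)` is meant to return, written with plain `Split`, `Skip` and `Take`. The class `SplitDemo` and the helper `FieldsViaSplit` are hypothetical names used only for this illustration. It allocates every field rather than just the two that are needed, which is the overhead the extension method above tries to avoid.

```csharp
using System;
using System.Linq;

public static class SplitDemo
{
    // Hypothetical helper: returns `count` fields starting at zero-based field `pos`.
    // Unlike SplitExtract, this splits the whole line first.
    public static string[] FieldsViaSplit(string line, char delim, int pos, int count) =>
        line.Split(delim).Skip(pos).Take(count).ToArray();

    public static void Main()
    {
        // First sample line from the question.
        var line = "blabla | blabla2 | element1 | AB:CD| blabla3";
        var pair = FieldsViaSplit(line, '|', 2, 2);

        // The fields keep their surrounding spaces, so trim before use.
        Console.WriteLine($"[{pair[0].Trim()}] [{pair[1].Trim()}]");
        // prints: [element1] [AB:CD]
    }
}
```

The first element is the column name and the second is the value, which is how the LINQ query that follows uses `p[0]` and `p[1]`.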

Now you can use this with LINQ to process your file:

var Pairs = File.ReadLines(files)
                .Select(l => l.SplitExtract("|", 2, 2))
                .GroupBy(3)
                .Select(g => g.ToList());
var ColumnNames = Pairs.First().Select(p => p[0].Trim());
var dt = new DataTable();
foreach (var cn in ColumnNames)
    dt.Columns.Add(cn, typeof(string));

foreach (var r in Pairs)
    dt.Rows.Add(r.Select(p => p[1].Trim()).ToArray());
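
If the extension methods feel like too much machinery, the same one-row-per-three-lines idea can be sketched with plain `Split` and an index loop. This is an alternative, not the code above: `PivotDemo` and `BuildTable` are hypothetical names, and the sketch assumes (as stated in the comments on the question) that the file always repeats the same elements in the same order and that the name and value sit in fields 2 and 3.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class PivotDemo
{
    // Hypothetical helper: builds one DataRow per group of `groupSize` lines.
    public static DataTable BuildTable(IReadOnlyList<string> lines, int groupSize)
    {
        var dt = new DataTable();

        // Column names come from field 2 of the first group (assumed to repeat in every group).
        foreach (var line in lines.Take(groupSize))
            dt.Columns.Add(line.Split('|')[2].Trim(), typeof(string));

        // A trailing incomplete group, if any, is ignored.
        for (var i = 0; i + groupSize <= lines.Count; i += groupSize)
        {
            var row = dt.NewRow();
            for (var j = 0; j < groupSize; j++)
            {
                var fields = lines[i + j].Split('|');
                row[fields[2].Trim()] = fields[3].Trim(); // name -> value
            }
            dt.Rows.Add(row);
        }
        return dt;
    }

    public static void Main()
    {
        // Sample data copied from the question.
        var lines = new[]
        {
            "blabla | blabla2 | element1 | AB:CD| blabla3",
            "blabla | blabla2 | element2 | ABC | blabla3",
            "blabla | blabla2 | element3 | 123 | blabla3",
        };

        var dt = BuildTable(lines, 3);
        Console.WriteLine(string.Join(", ", dt.Columns.Cast<DataColumn>().Select(c => c.ColumnName)));
        foreach (DataRow r in dt.Rows)
            Console.WriteLine(string.Join(", ", r.ItemArray));
        // prints:
        // element1, element2, element3
        // AB:CD, ABC, 123
    }
}
```

With a real file, `File.ReadAllLines(files)` could be passed in place of the sample array.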
NetMage