1

I have groups of data stored in a database table that I want to transform into a list of group-struct, each containing arrays of data.

I am able to do this using a rather long method. I was wondering if there is a more compact way of achieving this? I suspect Linq is supposed to be perfect for this kind of operation but I really don't know how to start.

The following example illustrates what I am currently doing. My real data is more complex.

The data is to be stored in a struct like this

public struct GroupData
{
    private string aString;
    private int anInt;

    public GroupData(string aString, int anInt)
    {
        this.aString = aString;
        this.anInt = anInt;
    }
}

Which again is to be stored in a Group-struct

public struct Group
{
    private string groupId;
    private GroupData[] groupData;

    public Group(string groupId, GroupData[] groupData)
    {
       this.groupId = groupId;
       this.groupData = groupData;
    }
}

I am currently doing this

//Create some dummy data
DataTable table = new DataTable();
table.Columns.Add("GROUP_ID", typeof(string));
table.Columns.Add("A_STRING", typeof(string));
table.Columns.Add("AN_INT", typeof(int));

table.Rows.Add("A", "this_is_A2", 7);
table.Rows.Add("A", "this_is_A2", 4);
table.Rows.Add("B", "this_is_B1", 3);
table.Rows.Add("C", "this_is_C1", 1);
table.Rows.Add("D", "this_is_D1", 3);
table.Rows.Add("D", "this_is_D2", 2);

//Create list of groups with arrays of groupdata
string theString, theGroupId;
int theInt;
List<Group> theGroups = new List<Group>();
List<GroupData> groupDataList;
Dictionary<string, List<GroupData>> groupDataDict = new Dictionary<string, List<GroupData>>();

//Read all rows and convert to structs
for (int i = 0; i < table.Rows.Count; i++)
{
    theGroupId = (string)table.Rows[i]["GROUP_ID"];
    theString = (string)table.Rows[i]["A_STRING"];
    theInt = (int)table.Rows[i]["AN_INT"];

    //Collect all GroupData into their respective Groups
    if (!groupDataDict.TryGetValue(theGroupId, out groupDataList))
    {
        groupDataList = new List<GroupData>();
        groupDataDict.Add(theGroupId, groupDataList);
    }
    groupDataList.Add(new GroupData(theString, theInt));
}

//Insert each group into the list
foreach (KeyValuePair<string, List<GroupData>> kv in groupDataDict)
    theGroups.Add(new Group(kv.Key, kv.Value.ToArray()));

I see that my question is closely related to this post Group by in LINQ and I guess I could first transform my datatable to a list, and then use the prescribed method. But Ideally I would like to bypass the step of first transforming to a list and operate directly on the DataTable.

mortysporty
  • 2,749
  • 6
  • 28
  • 51

1 Answers1

1

While I would highly recommend switching from DataTable to using models as most modern applications do (see references below), Microsoft has provided an AsEnumerable() extension method in System.Data.DataTableExtensions that is all you need to use LINQ on a DataTable.

//Create some dummy data
DataTable table = new DataTable();
table.Columns.Add("GROUP_ID", typeof(string));
table.Columns.Add("A_STRING", typeof(string));
table.Columns.Add("AN_INT", typeof(int));

table.Rows.Add("A", "this_is_A2", 7);
table.Rows.Add("A", "this_is_A2", 4);
table.Rows.Add("B", "this_is_B1", 3);
table.Rows.Add("C", "this_is_C1", 1);
table.Rows.Add("D", "this_is_D1", 3);
table.Rows.Add("D", "this_is_D2", 2);

var groups = (from dt in table.AsEnumerable()
              group dt by dt.Field<string>("GROUP_ID") into g
              select new { 
                  GroupID = g.Key, 
                  GroupData = g.Select(i => i.Field<int>("AN_INT")) }
              ).ToList();

Reference: LINQ query on a DataTable

A few articles you should read before you decide to use DataTable in the 2020's (keep in mind it is nearly two decades old and is not type safe):

NightOwl888
  • 55,572
  • 24
  • 139
  • 212
  • Hi. Ok thanks. Im not updated on "models" but I'll google it :) – mortysporty Jan 19 '18 at 11:46
  • why do you say "highly recommend switching from DataTable to using models as most modern applications do" could you expand on this. – Seabizkit Dec 10 '19 at 06:44
  • @Seabizkit - `DataTable` is not very efficient as far as memory and performance are concerned. While it is not deprecated, it has been around since .NET 1.1. It is still useful for one-off tasks where performance isn't as much of a concern as ease of use, but using [POCO](https://en.wikipedia.org/wiki/Plain_old_CLR_object) object models are the preferred way to deal with data in front line applications. – NightOwl888 Dec 10 '19 at 17:52
  • @NightOwl888 HI could you should me something which backs up your statement, about memory and performance. – Seabizkit Dec 11 '19 at 07:35
  • @NightOwl888 i appreciate the update, but i had already read those, and **a lot of others**, i think its more important how you use it, or what you using it for, rather than saying its better at either memory or performance, this is strictly not true, or well from my reading and my understanding... well it depends on what you are attempting to do. I just thought i would point this out as I've been seeing a lot of "hate" on data-table, but without understanding pro's and con's it really depend on what you trying to do. example...write a class which can wrap around an unknown column type. – Seabizkit Dec 11 '19 at 18:24
  • @Seabizkit - DataTable is useful in the right situations. Only if the performance impact & lack of type safety are not deal breakers for the usage. It is often abused, which is why there are so many "hate" posts about it. It is a lot of work to refactor an application that used a DataTable into one with object models, and it is often necessary because DataTable is a bottleneck. *Generics* can be used to wrap around unknown column types and make them type safe. – NightOwl888 Dec 11 '19 at 19:24