0

I'm working on a presentation of two views of the same data, which is stored in a datatable with these fields (irrelevant fields not shown):

First view is the datatable itself:

Table

Second view is a Pivot-like table where lines are defined by same set_id and user_id and columns are defined by sets of three with same date but storing different values: rows.Count, rows.Sum(value) and rows.Sum(volumes). So I wrote this code:

Sub Test()
    ' Simulated Data
    Dim dtDA = New DataTable
    dtDA.Columns.Add("id", GetType(Integer))
    dtDA.Columns.Add("set_id", GetType(Integer))
    dtDA.Columns.Add("user_id", GetType(Integer))
    dtDA.Columns.Add("date", GetType(Integer))
    dtDA.Columns.Add("value", GetType(Integer))
    dtDA.Columns.Add("volumes", GetType(Integer))
    dtDA.Rows.Add(1001, 1, 11, 20160505, 1, 1)
    dtDA.Rows.Add(1002, 1, 12, 20160505, 1, 2)
    dtDA.Rows.Add(1003, 1, 13, 20160505, 1, 1)
    dtDA.Rows.Add(1004, 1, 11, 20160505, 1, 1)
    dtDA.Rows.Add(1005, 2, 14, 20160505, 1, 1)
    dtDA.Rows.Add(1006, 2, 15, 20160505, 1, 2)
    dtDA.Rows.Add(1007, 2, 16, 20160505, 1, 1)
    dtDA.Rows.Add(1008, 2, 14, 20160505, 1, 1)
    dtDA.Rows.Add(1009, 1, 12, 20160512, 1, 1)
    dtDA.Rows.Add(1010, 1, 13, 20160512, 1, 2)
    dtDA.Rows.Add(1011, 1, 11, 20160512, 1, 1)
    dtDA.Rows.Add(1012, 1, 12, 20160512, 1, 1)
    dtDA.Rows.Add(1013, 2, 15, 20160512, 1, 1)
    dtDA.Rows.Add(1014, 2, 16, 20160512, 1, 2)
    dtDA.Rows.Add(1015, 2, 14, 20160512, 1, 1)
    dtDA.Rows.Add(1016, 2, 15, 20160512, 1, 1)

    'Analysis
    Dim DS = dtDA.Select.GroupBy(
        Function(dr) New With {
            .set_id = dr.Field(Of Integer?)("set_id").GetValueOrDefault,
            .user_id = dr.Field(Of Integer?)("user_id").GetValueOrDefault,
            .date = dr.Field(Of Integer?)("date").GetValueOrDefault}
        ).GroupBy(
        Function(gSRD) New With {
            .set_id = gSRD.Key.set_id,
            .user_id = gSRD.Key.user_id})

    Dim dtDS As New DataTable, drDS As DataRow
    dtDS.Columns.Add("set_id", GetType(Integer))
    dtDS.Columns.Add("user_id", GetType(String))
    dtDS.Columns.Add("date", GetType(Integer))
    If DS.Any Then
        For Each d In DS.SelectMany(Function(g) g.Select(Function(gg) gg.Key.date)).Distinct.OrderBy(Function(i) i)
            dtDS.Columns.Add("c" & d, GetType(Integer))
            dtDS.Columns.Add("p" & d, GetType(Integer))
            dtDS.Columns.Add("v" & d, GetType(Integer))
        Next
        For Each gSR In DS
            drDS = dtDS.NewRow
            drDS.SetField("set_id", gSR.Key.set_id)
            drDS.SetField("user_id", gSR.Key.user_id)
            For Each gSRD In gSR
                drDS.SetField("c" & gSRD.Key.date, gSRD.Count)
                drDS.SetField("p" & gSRD.Key.date, gSRD.Sum(Function(dr) dr.Field(Of Integer?)("value").GetValueOrDefault))
                drDS.SetField("v" & gSRD.Key.date, gSRD.Sum(Function(dr) dr.Field(Of Integer?)("volumes").GetValueOrDefault))
            Next
            dtDS.Rows.Add(drDS)
        Next
    End If
End Sub

I was expecting to get this:

set_id|user_id|c20160505|p20160505|v20160505|c20160512|p20160512|v20160512
------+-------+---------+---------+---------+---------+---------+---------
     1|     11|        2|        2|        2|        1|        1|        1
     1|     12|        1|        1|        2|        2|        2|        2
     1|     13|        1|        1|        1|        1|        1|        2
     2|     14|        2|        2|        2|        1|        1|        1
     2|     15|        1|        1|        2|        2|        2|        2
     2|     16|        1|        1|        1|        1|        1|        2

Instead, I'm getting this:

Result

Can anyone help me and tell what I'm doing wrong here? Thank you very much!

EDIT

The problem seems to be I based my code on the premise that Linq grouping by key as Anonymous Type would behave as described in https://stackoverflow.com/a/12124777/3718031 but unfortunately it seems it is not quite so. I noticed that if I replace anonymous types by concatenating strings, my code does what I first expected:

    Dim DS = dtDA.Select.GroupBy(
        Function(dr) dr.Field(Of Integer?)("set_id").GetValueOrDefault & "," & dr.Field(Of Integer?)("user_id").GetValueOrDefault & "," & dr.Field(Of Integer?)("date").GetValueOrDefault
        ).GroupBy(
        Function(gSRD) gSRD.First.Field(Of Integer?)("set_id").GetValueOrDefault & "," & gSRD.First.Field(Of Integer?)("user_id").GetValueOrDefault)

    Dim dtDS As New DataTable, drDS As DataRow
    dtDS.Columns.Add("set_id", GetType(Integer))
    dtDS.Columns.Add("user_id", GetType(String))
    If DS.Any Then
        For Each d In DS.SelectMany(Function(g) g.Select(Function(gg) gg.First.Field(Of Integer?)("date").GetValueOrDefault)).Distinct.OrderBy(Function(i) i)
            dtDS.Columns.Add("c" & d, GetType(Integer))
            dtDS.Columns.Add("p" & d, GetType(Integer))
            dtDS.Columns.Add("v" & d, GetType(Integer))
        Next
        For Each gSR In DS
            drDS = dtDS.NewRow
            drDS.SetField("set_id", gSR.First.First.Field(Of Integer?)("set_id").GetValueOrDefault)
            drDS.SetField("user_id", gSR.First.First.Field(Of Integer?)("user_id").GetValueOrDefault)
            For Each gSRD In gSR
                drDS.SetField("c" & gSRD.First.Field(Of Integer?)("date").GetValueOrDefault, gSRD.Count)
                drDS.SetField("p" & gSRD.First.Field(Of Integer?)("date").GetValueOrDefault, gSRD.Sum(Function(dr) dr.Field(Of Integer?)("value").GetValueOrDefault))
                drDS.SetField("v" & gSRD.First.Field(Of Integer?)("date").GetValueOrDefault, gSRD.Sum(Function(dr) dr.Field(Of Integer?)("volumes").GetValueOrDefault))
            Next
            dtDS.Rows.Add(drDS)
        Next
    End If

Pivot-Table Correct Result

So, the remaining question is: why are not anonymous types working as grouping keys based on equality of each one of their properties?

Community
  • 1
  • 1
VBobCat
  • 2,527
  • 4
  • 29
  • 56

0 Answers0