0

this is my query:

rows.GroupBy(row => new TaxGroupObject
            {
                EnvelopeID = row.Field<int>("EnvelopeID"),
                PolicyNumber = row.Field<string>("PolicyNumber"),
                TZ = row.Field<string>("TZ")
            })
            .Select(row =>

                        {
                            int i;
                            if (row.Key.EnvelopeID == 5713 && row.Key.PolicyNumber == "50002617" && row.Key.TZ == "50002617") 
                                i=1+1;
                            var newRow = structure.NewRow();
                            newRow["PolicyNumber"]=row.Key.PolicyNumber;
                            newRow["TZ"]=row.Key.TZ;
                            newRow["CreditPremiaTaxParagraph45"] = row.Sum(x => decimal.Parse(x["CreditPremiaTaxParagraph45"].ToString()));
                            newRow["WorklossTax"] = row.Sum(x => decimal.Parse(x["WorklossTax"].ToString()));
                            newRow["MiscTax"] = row.Sum(x => decimal.Parse(x["MiscTax"].ToString()));
                            newRow["EnvelopeID"] = row.Key.EnvelopeID;
                            return newRow;
                        }
            );
    internal class TaxGroupObject
{
    public long? EnvelopeID{ get; set; }
    public string PolicyNumber { get; set; }
    public string TZ { get; set; }

}

i put a breakpoint on the line with "i=1+1", after an if condition comparing all the keys i've used for the group by with some hard coded values. that break point is being hit twice, although the group by suppose to group all rows with same keys together. the thing is that for most of the values in the table the grouping works just fine and i cant understand how its possible. if you can help in any way it would be highly appreciated.

Yuval Perelman
  • 4,499
  • 1
  • 22
  • 32

2 Answers2

5

The problem is that TaxGroupObject does not implement GetHashCode and Equals. These methods are used by GroupBy to determine what makes one TaxGroupObject object equal to another. By default, it's by reference equality, not property equality.

This should work, using the GetHashCode algorithm from What is the best algorithm for an overridden System.Object.GetHashCode?:

internal class TaxGroupObject
{
    public long? EnvelopeID { get; set; }
    public string PolicyNumber { get; set; }
    public string TZ { get; set; }

    public override int GetHashCode()
    {
        unchecked // Overflow is fine, just wrap
        {
            int hash = 17;
            hash = hash * 23 + EnvelopeID.GetHashCode();
            hash = hash * 23 + (PolicyNumber != null ? PolicyNumber.GetHashCode() : -2);
            hash = hash * 23 + (TZ != null ? TZ.GetHashCode() : -1);
            return hash;
        }
    }

    public override bool Equals(object obj)
    {
        if (obj.GetType() != typeof(TaxGroupObject))
            return false;
        var other = (TaxGroupObject)obj;
        return this.EnvelopeID == other.EnvelopeID &&
                this.PolicyNumber == other.PolicyNumber &&
                this.TZ == other.TZ;
    }
}

Also, you should only use immutable objects in something like a grouping or dictionary. At a minimum, you must be sure that the objects here do not change during your grouping.

Community
  • 1
  • 1
Tim S.
  • 55,448
  • 7
  • 96
  • 122
  • It is extremely important that the three properties that the hash code is computed on do not change. They must be changed to getter only properties. – Enigmativity Feb 14 '15 at 13:21
  • @Enigmativity I agree that they must be immutable for the life of the grouping, and might be best as getter-only depending on how this class is used elsewhere, but to say they *must* be getter-only is a little extreme, IMO. – Tim S. Feb 14 '15 at 13:24
  • @TimS. - Perhaps, but it is worth making the point because some people wouldn't understand the implication. It's also one of those things that we should get into a habit of doing right. – Enigmativity Feb 14 '15 at 22:57
0

eventually i found it simpler to give up inheritance and use a struct instead of class, it also works since struct is a value type therefore doesn't need equals method override. I am intrested in which of these approaches bring better performance, if anybody knows. Intuitively it seems like structs are more efficient, but I am not sure, and I currently don't have the time to emulate the two options or make the proper re(google)search.
Thanks

Yuval Perelman
  • 4,499
  • 1
  • 22
  • 32