1

Given a collection of Rules, I am trying to create another collection that contains only the unique Rules, ignoring the Site property.

public class Rule
{
    public int TestId { get; set; }
    public string File { get; set; }
    public string Site { get; set; }
    public string[] Columns { get; set; }
}

So if my collection had the values below:

var rules = new List<Rule>
{
    new Rule { TestId = 1, File = "Foo", Site = "SiteA", Columns = new string[] { "ColA", "ColB" }},
    new Rule { TestId = 1, File = "Foo", Site = "SiteB", Columns = new string[] { "ColA", "ColB" }}
};

I want this end result:

var uniqueRules = new List<Rule>
{
    new Rule { TestId = 1, File = "Foo", Site = null, Columns = new string[] { "ColA", "ColB" }}
};

I have tried various combinations like the one below, but I'm still getting 2 results back. How do I achieve the expected result?

var uniqueRules = rules
    .GroupBy(r => new { r.TestId, r.File, r.Columns })
    .Select(g => g.Key)
    .Distinct()
    .ToList();
mheptinstall

4 Answers

6

The problem is that string[] does not override Equals and GetHashCode, so r.Columns is compared by reference only. You need to provide a custom IEqualityComparer<T>:

using System.Collections.Generic;
using System.Linq;

public class RuleComparer : IEqualityComparer<Rule>
{
    // Two rules are equal when TestId, File and the Columns sequence match; Site is ignored.
    public bool Equals(Rule x, Rule y)
    {
        if (object.ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        if (!(x.TestId == y.TestId && x.File == y.File)) return false;
        // Guard against a null Columns array before comparing element by element.
        if (x.Columns == null || y.Columns == null) return x.Columns == y.Columns;
        return x.Columns.SequenceEqual(y.Columns);
    }

    // from: https://stackoverflow.com/questions/263400/what-is-the-best-algorithm-for-an-overridden-system-object-gethashcode
    public int GetHashCode(Rule obj)
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 23 + obj.TestId.GetHashCode();
            hash = hash * 23 + (obj.File?.GetHashCode() ?? 0);
            if (obj.Columns != null)
                foreach (string s in obj.Columns)
                    hash = hash * 23 + (s?.GetHashCode() ?? 0);
            return hash;
        }
    }
}

Now the LINQ query becomes trivial:

List<Rule> uniqueRules = rules.Distinct(new RuleComparer()).ToList();
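
Note that Distinct keeps the first Rule of each equal set, so Site would still be "SiteA" rather than null as in the question's expected output. Below is a minimal sketch of one way to blank Site afterwards. The RuleDeduplicator class and its UniqueIgnoringSite method are hypothetical names introduced for illustration, and the comparer is a condensed copy of the one above that assumes Columns is never null.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Rule
{
    public int TestId { get; set; }
    public string File { get; set; }
    public string Site { get; set; }
    public string[] Columns { get; set; }
}

// Condensed copy of RuleComparer; assumes Columns is non-null. Site is ignored.
public class RuleComparer : IEqualityComparer<Rule>
{
    public bool Equals(Rule x, Rule y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.TestId == y.TestId && x.File == y.File
            && x.Columns.SequenceEqual(y.Columns);
    }

    public int GetHashCode(Rule obj)
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 23 + obj.TestId;
            hash = hash * 23 + (obj.File?.GetHashCode() ?? 0);
            foreach (string s in obj.Columns)
                hash = hash * 23 + (s?.GetHashCode() ?? 0);
            return hash;
        }
    }
}

// Hypothetical helper, not part of the original answer.
public static class RuleDeduplicator
{
    public static List<Rule> UniqueIgnoringSite(IEnumerable<Rule> rules)
    {
        return rules
            .Distinct(new RuleComparer())
            // Copy each survivor so the input objects keep their own Site value.
            .Select(r => new Rule
            {
                TestId = r.TestId,
                File = r.File,
                Site = null,
                Columns = r.Columns
            })
            .ToList();
    }
}
```

Copying into new Rule objects, rather than setting Site = null on the originals, is a design choice here so that the input list is left unchanged.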
Tim Schmelter
1

There are several observations to be made here:

  1. GroupBy() already yields one group per distinct key, so it has the same effect as Distinct(). Either create an EqualityComparer that performs the comparison for you, or just use GroupBy(); there is no need to do both.

  2. You're selecting the Key after the grouping, which is an anonymous object rather than a Rule. If you want an actual Rule back and don't care which member of a group you get, use .First().

  3. The rules are distinct because their Columns properties refer to different arrays, and arrays are compared by reference, not by value.

To combine all these observations, you could use the following code if you prefer grouping over writing a custom EqualityComparer:

var uniqueRules = rules
        .GroupBy(r => new { r.TestId, r.File, Columns = string.Join(",", r.Columns) })
        .Select(r => r.First())
        .ToList();

This turns the columns into a single string, which is compared by value just like the other key properties.

Note that this is only possible due to the fact that Columns is a simple array of strings. For more complex types this can't be done as conveniently.
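
One caveat with the joined key: if a column name can itself contain the separator, two different arrays can produce the same string. The sketch below shows the collision and a possible workaround. The JoinKeyDemo class and its Key method are hypothetical, and using the control character U+001F as a separator rests on the assumption that it never occurs in a column name.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class JoinKeyDemo
{
    // Builds the grouping key for a Columns array with a configurable separator.
    public static string Key(string[] columns, string separator)
    {
        return string.Join(separator, columns);
    }
}
```

With "," as the separator, Key(new[] { "A,B" }, ",") and Key(new[] { "A", "B" }, ",") both yield "A,B", so the two arrays would land in the same group. With "\u001F" the keys differ as long as the assumption about column names holds.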

ThoNohT
0

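I would recommend extending your Rule class to implement IEquatable<Rule>, as below:

using System;
using System.Linq;

public class Rule : IEquatable<Rule>
{
    public int TestId { get; set; }
    public string File { get; set; }
    public string Site { get; set; }
    public string[] Columns { get; set; }

    // Site is deliberately left out; Columns is compared element by element.
    public bool Equals(Rule other)
    {
        if (other == null) return false;
        if (TestId != other.TestId || !string.Equals(File, other.File)) return false;
        if (Columns == null || other.Columns == null) return Columns == other.Columns;
        return Columns.SequenceEqual(other.Columns);
    }

    public override bool Equals(object obj) => Equals(obj as Rule);

    // Distinct() checks hash codes first, so equal rules must hash alike.
    public override int GetHashCode()
    {
        unchecked { return (17 * 23 + TestId) * 23 + (File?.GetHashCode() ?? 0); }
    }
}

As you can see, the Site property is ignored when comparing two rules. This also gives you the flexibility to alter the comparison in future. Because Distinct() compares hash codes before it calls Equals, GetHashCode is overridden to agree with Equals, and Columns is compared with SequenceEqual since arrays otherwise compare by reference. Then use: rules.Distinct();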

PiJei
0

The problem is that although both Columns values look alike (new string[] { "ColA", "ColB" }), they do not reference the same object; they only hold equal data. Try this:

string[] cols = new string[] { "ColA", "ColB" };
var rules = new List<Rule>
{
    new Rule { TestId = 1, File = "Foo", Site = "SiteA", Columns = cols},
    new Rule { TestId = 1, File = "Foo", Site = "SiteB", Columns = cols}
};

Now your own query should work correctly:

var uniqueRules = rules
    .GroupBy(r => new { r.TestId, r.File, r.Columns })
    .Select(g => g.Key)
    .Distinct()
    .ToList();
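
The effect can be seen in isolation. Anonymous types compare property by property, and an array property is compared by reference, so two keys are equal only when they hold the same array instance. This is a minimal sketch; the KeyEqualityDemo class and its KeysEqual method are hypothetical names used only for illustration.

```csharp
// Hypothetical helper for illustration only.
public static class KeyEqualityDemo
{
    // Builds two anonymous keys shaped like the GroupBy key and compares them.
    public static bool KeysEqual(string[] first, string[] second)
    {
        var k1 = new { TestId = 1, File = "Foo", Columns = first };
        var k2 = new { TestId = 1, File = "Foo", Columns = second };
        return k1.Equals(k2);
    }
}
```

KeysEqual(a, b) is false for two separately allocated arrays with the same contents, while KeysEqual(a, a) is true. Keep in mind that rules sharing one array also share any later change to its elements.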
Ashkan Mobayen Khiabani