2

I need your help. I am trying to get distinct values from List of objects. My class looks like this:

class Chromosome
{
    public bool[][] body { get; set; }
    public double fitness { get; set; }
}

Now I have List<Chromosome> population. And now what I need is a way, how I can get new list: List<Chromosome> newGeneration. This new list will contain only unique chromosomes from original list - population.

Chromosome is unique, when his whole body (which in this case is 2D bool array) is unique in comparison to the other chromosomes. I know, that there is something like MoreLINQ, but I am not sure, whether I should use 3rd party code and I know that I should overwrite some methods, but I am kind of lost. So I would really appreciate some nice step by step description, that even idiot could accomplish :) THX
svick
  • 236,525
  • 50
  • 385
  • 514
Ludo
  • 35
  • 1
  • 5
  • Could you please give us some example input / output? – It'sNotALie. Jun 09 '13 at 15:51
  • Well I am doing genetic algorithm and body of Chromosome represents binary representations of association rule. Body represents the whole rule, which I later decode and than evaluate its quality based on fitness function and save it to the fitness. Code behind it is much larger, I have just abstracted from it. Because I just don't know any elegant way, how I can get a list of distinct objects (which are compared based on the similarity of 2D array of bools) of other List... – Ludo Jun 09 '13 at 15:55

2 Answers2

5

First, implement the equality operator (this goes into class Chromosome):

public class Chromosome : IEquatable<Chromosome>
{

    public bool[][] body { get; set; }
    public double fitness { get; set; }

    bool IEquatable<Chromosome>.Equals(Chromosome other)
    {
        // Compare fitness
        if(fitness != other.fitness) return false;

        // Make sure we don't get IndexOutOfBounds on one of them
        if(body.Length != other.body.Length) return false;

        for(var x = 0; x < body.Length; x++)
        {
            // IndexOutOfBounds on inner arrays
            if(body[x].Length != other.body[x].Length) return false;

            for(var y = 0; y < body[x].Length; y++)
                // Compare bodies
                if(body[x][y] != other.body[x][y]) return false;
        }

        // No difference found
        return true;
    }

    // ReSharper's suggestion for equality members

    public override bool Equals(object obj)
    {
        if (ReferenceEquals(null, obj))
        {
            return false;
        }
        if (ReferenceEquals(this, obj))
        {
            return true;
        }
        if (obj.GetType() != this.GetType())
        {
            return false;
        }
        return this.Equals((Chromosome)obj);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            return ((this.body != null ? this.body.GetHashCode() : 0) * 397) ^ this.fitness.GetHashCode();
        }
    }
}

Then, use Distinct:

var newGeneration = population.Distinct().ToList();
pascalhein
  • 5,700
  • 4
  • 31
  • 44
  • 5
    You really should override `Equals` and `GetHashCode` together. Also, if you don't feel like adding this directly to your class, you can use `Distinct` with `IEqualityComparer` overload. – Patryk Ćwiek Jun 09 '13 at 15:57
  • 1
    @Trustme-I'maDoctor Your username fits this question so well. – It'sNotALie. Jun 09 '13 at 15:58
  • @Trustme-I'maDoctor: what would you propose for `GetHashCode`? I can't think of anythink that makes sense and is easy to implement – pascalhein Jun 09 '13 at 16:01
  • @csharpler I either use the default that ReSharper suggests me (I don't remember it now) or the one from effective Java, that's based on primes, [Jon Skeet's answer shows it best](http://stackoverflow.com/a/263416/1180426) – Patryk Ćwiek Jun 09 '13 at 16:02
  • I have maybe stupid question :) but how can I have in function void and than use return? – Ludo Jun 09 '13 at 16:06
  • @Trustme-I'maDoctor: I added ReSharper's suggestion for the equality members, thanks for the hint – pascalhein Jun 09 '13 at 16:09
  • Resharper doesn't seem to be aware that you want value-based equality on body in the GetHashCode implementation it's recommending. Hmm, fitness shouldn't be in there. – Amy B Jun 09 '13 at 16:38
  • well I have been thinking about it too. But than I have realized, that if there are 2 chromosomes with the same body, they have to bee equal also in their fitness. So as soon as the fitness is different I can assume, that whole body is different too. – Ludo Jun 09 '13 at 17:01
0
public class ChromosomeBodyComparer : IEqualityComparer<Chromosome>
{
  private bool EqualValues(bool[][] left, bool[][] right)
  {
    if (left.Length != right.Length)
    {
      return false;
    }
    return left.Zip(right, (x, y) => x.SequenceEquals(y)).All();
  }

  public bool Equals(Chromosome left, Chromosome right)
  {
    return EqualValues(left.body, right.body)
  }

     //implementing GetHashCode is hard.
     // here is a rubbish implementation.
  public int GetHashCode(Chromosome c)
  {
    int numberOfBools = c.body.SelectMany(x => x).Count();
    int numberOfTrues = c.body.SelectMany(x => x).Where(b => b).Count();
    return (17 * numberOfBools) + (23 * numberOfTrues);

  }
}

Called by:

List<Chromosome> nextGeneration = population
  .Distinct(new ChromosomeBodyComparer())
  .ToList();
Amy B
  • 108,202
  • 21
  • 135
  • 185