
I have a list (to be precise, an ImmutableHashSet<ListItem> from System.Collections.Immutable) of base items, and I call the following code

_baseList.Contains(derivedItem)

but this returns false.

Even though the following code lines all return true

object.ReferenceEquals(_baseList.First(), derivedItem)
object.Equals(_baseList.First(), derivedItem)
_baseList.First().GetHashCode() == derivedItem.GetHashCode()

I can even write the following and it returns true:

_baseList.OfType<DerivedClass>().Contains(derivedItem)

What am I doing wrong? I would like to avoid writing the .OfType call.

Edit:

private ImmutableHashSet<BaseClass> _baseList;

public class BaseClass
{

}

public class DerivedClass : BaseClass
{

}

public void DoStuff()
{
    var items = _baseList.OfType<DerivedClass>().ToList();
    foreach (var derivedItem in items)
    {
        RemoveItem(derivedItem);
    }
}

public void RemoveItem(BaseClass derivedItem)
{
    if (_baseList.Contains(derivedItem))
    {
        //doesn't reach this place, since _baseList.Contains(derivedItem) returns false...
        _baseList = _baseList.Remove(derivedItem);
    }

    //object.ReferenceEquals(_baseList.First(), derivedItem) == true
    //object.Equals(_baseList.First(), derivedItem) == true
    //_baseList.First().GetHashCode() == derivedItem.GetHashCode() == true
    //_baseList.OfType<DerivedClass>().Contains(derivedItem) == true
}

Edit2:

Here is reproducible code for my problem. It seems that ImmutableHashSet<> caches the GetHashCode value and doesn't compare an item's current GetHashCode with the entries inside the set. Is there a way to tell ImmutableHashSet<> that the GetHashCode of its items could have changed, at least for the item I am currently checking, since it is the very same reference?

using System;
using System.Collections.Immutable;
using System.Linq;

namespace ConsoleApplication1
{
    class Program
    {
        private static ImmutableHashSet<BaseClass> _baseList;

        static void Main(string[] args)
        {
            _baseList = ImmutableHashSet.Create<BaseClass>();
            _baseList = _baseList.Add(new DerivedClass("B1"));
            _baseList = _baseList.Add(new DerivedClass("B2"));
            _baseList = _baseList.Add(new DerivedClass("B3"));
            _baseList = _baseList.Add(new DerivedClass("B4"));
            _baseList = _baseList.Add(new DerivedClass("B5"));

            DoStuff();
            Console.WriteLine(_baseList.Count); //output is 5 - but it should be 0...
            Console.ReadLine();
        }

        private static void DoStuff()
        {
            var items = _baseList.OfType<DerivedClass>().ToList();
            foreach (var derivedItem in items)
            {
                derivedItem.BaseString += "Change...";
                RemoveItem(derivedItem);
            }
        }

        private static void RemoveItem(BaseClass derivedItem)
        {
            if (_baseList.Contains(derivedItem))
            {
                _baseList = _baseList.Remove(derivedItem);
            }
        }
    }

    public abstract class BaseClass
    {
        private string _baseString;
        public string BaseString
        {
            get { return _baseString; }
            set { _baseString = value; }
        }

        public BaseClass(string baseString)
        {
            _baseString = baseString;
        }

        public override int GetHashCode()
        {
            unchecked
            {
                int hashCode = (_baseString != null ? _baseString.GetHashCode() : 0);
                return hashCode;
            }
        }
    }
    public class DerivedClass : BaseClass
    {
        public DerivedClass(string baseString)
            : base(baseString)
        {

        }
    }
}

If I change the ImmutableHashSet<> to an ImmutableList<>, the code works fine, so unless someone comes up with a better idea I will switch to the list.

Rand Random
  • But doesn't it just fail because the baseList is filled with `ListItem` and the object you're searching for is a `DerivedClass`? – EaterOfCode Jun 04 '15 at 13:22
  • @EaterOfCode the DerivedClass derives from the baseclass. – Rand Random Jun 04 '15 at 13:22
  • @EaterOfCode and if the list wouldn't contain any DerivedClass the OfType<>().Contains would also return false, which isn't the case. – Rand Random Jun 04 '15 at 13:23
  • Does ImmutableHashSet really have an indexer? I couldn't find that in the documentation. I don't understand why `_baseList[0]` compiles. – usr Jun 04 '15 at 13:32
  • @usr I actually did it with .First() in my code, sorry for the mistake in here. – Rand Random Jun 04 '15 at 13:33
  • Did you override any equality methods? – usr Jun 04 '15 at 13:35
  • What equality methods are defined in `BaseClass` and `DerivedClass`? And what is the KeyComparer used in the HashSet? – Rob Jun 04 '15 at 13:35
  • @Weston also the reversed arguments return true. – Rand Random Jun 04 '15 at 13:36
  • I think your problem is ultimately the one identified in this question: [How to find if two objects are equal](http://stackoverflow.com/questions/2920399/c-sharp-how-to-find-if-two-objects-are-equal). You're presuming that `GetHashCode == equality`, which is not necessarily true. *It's entirely reasonable (and expected) that you will occasionally have values which are unequal but give the same hash.* I'm guessing that if you compare your objects using `==` you get `false`? Also, *generally speaking, equality becomes tricky when inheritance gets involved.* – Liam Jun 04 '15 at 13:36
  • @usr Yes, I did override the equality methods, but since all of them return true there shouldn't be a problem. And since I am using an ImmutableHashSet, it only checks GetHashCode, which is the same, doesn't it? – Rand Random Jun 04 '15 at 13:38
  • @Liam thanks for the link, but nothing new was there - please have a look at my edit. – Rand Random Jun 04 '15 at 14:28
  • For a hash table to work the hash code cannot change. The hash code determines the bucket that lookups target. – usr Jun 04 '15 at 14:38
  • What is the point of an immutable collection if you are going to mutate the content? That makes no sense, there is nothing preventing me from changing the values on each and every item in the collection! – Zache Jun 04 '15 at 14:43

2 Answers


Objects that are used in dictionaries and other hashing-related data structures should have immutable identity - all hashing-related data structures assume that once you add the object to the dictionary, its hashcode is not going to change.

This code is not going to work:

    private static void DoStuff()
    {
        var items = _baseList.OfType<DerivedClass>().ToList();
        foreach (var derivedItem in items)
        {
            derivedItem.BaseString += "Change...";
            RemoveItem(derivedItem);
        }
    }

    private static void RemoveItem(BaseClass derivedItem)
    {
        if (_baseList.Contains(derivedItem))
        {
            _baseList = _baseList.Remove(derivedItem);
        }
    }

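_baseList.Contains() in RemoveItem(), as called by DoStuff(), is going to return false for every single item, because you changed the identity of the stored item: its BaseString property, from which GetHashCode is computed. The set filed each item under its old hash code, but the lookup uses the new one.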
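
As a rough illustration of the ordering issue, here is a minimal sketch. The names `Item` and `MutationOrderDemo` are made up for this example and are not from the question; the class simply mimics a type whose hash code depends on a mutable string.

```csharp
using System;
using System.Collections.Immutable;

// Hypothetical stand-in for the question's classes: hash and equality
// both depend on the mutable Name property.
public class Item
{
    public string Name { get; set; }
    public Item(string name) { Name = name; }

    public override int GetHashCode() => Name != null ? Name.GetHashCode() : 0;
    public override bool Equals(object obj) => obj is Item other && other.Name == Name;
}

public static class MutationOrderDemo
{
    // Mutates first, then asks the set. The lookup uses the new hash code,
    // so the item is not found (barring an unlikely hash collision).
    public static bool ContainsAfterMutation()
    {
        var item = new Item("B1");
        var set = ImmutableHashSet.Create(item);
        item.Name += "Change...";
        return set.Contains(item);
    }

    // Removes first, then mutates. The removal still sees the original
    // hash code, so the item is found and the set ends up empty.
    public static int CountAfterRemoveThenMutate()
    {
        var item = new Item("B1");
        var set = ImmutableHashSet.Create(item);
        set = set.Remove(item);
        item.Name += "Change...";
        return set.Count;
    }
}
```

Applied to the question's DoStuff(), this would mean calling RemoveItem(derivedItem) before changing BaseString rather than after.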

antiduh
  • I understood that much once I saw what was going on, so my question now is: is there a way around it other than using a List? Your answer implies it, but maybe you can give a more direct yes or no. Thanks in advance. :) – Rand Random Jun 04 '15 at 14:38
  • @RandRandom Why are you mutating the items in the collection (in ways that changes their identity) in the first place? – Servy Jun 04 '15 at 14:55
  • @RandRandom - If you want to modify the hashed object, you have to remove it from the hashing data structure, modify it, and re-add it. It's really bad practice to do so however. And if you have a large program and these objects are passed around all over the place, and you allow modification to the object's identity, then you're setting yourself up for failure because some unrelated piece of code could modify the hashed object without notifying the owner of the hashing data structure. – antiduh Jun 04 '15 at 14:56

I think you answered your own question in your edit. You can't have the hashCode change once you've added the item to the HashSet. That breaks the contract of how a HashSet works.
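
If the set only needs to track object identity, one possible way to keep the hash code stable is to give the set a comparer that ignores the GetHashCode override altogether. The sketch below is an assumption of mine, not something from the question: `ReferenceComparer<T>`, `Node`, and `ReferenceComparerDemo` are hypothetical names. It relies on `RuntimeHelpers.GetHashCode`, which returns the default object hash regardless of overrides.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Runtime.CompilerServices;

// Hypothetical comparer: two references are equal only if they point
// to the same object.
public sealed class ReferenceComparer<T> : IEqualityComparer<T> where T : class
{
    public bool Equals(T x, T y) => object.ReferenceEquals(x, y);

    // Ignores any GetHashCode override, so the value does not change
    // when the object's fields change.
    public int GetHashCode(T obj) => RuntimeHelpers.GetHashCode(obj);
}

// Hypothetical type whose own hash code depends on a mutable field.
public class Node
{
    public string Text { get; set; }
    public override int GetHashCode() => Text != null ? Text.GetHashCode() : 0;
    public override bool Equals(object obj) => obj is Node other && other.Text == Text;
}

public static class ReferenceComparerDemo
{
    public static bool ContainsAfterMutation()
    {
        var node = new Node { Text = "B1" };
        var set = ImmutableHashSet.Create<Node>(new ReferenceComparer<Node>(), node);
        node.Text += "Change...";   // the comparer's hash is unaffected
        return set.Contains(node);  // true: same reference, same hash
    }
}
```

The trade-off is that two distinct objects with equal contents are then treated as different elements, which may or may not be what the set is meant to express.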

See this excellent article by Eric Lippert for more information on the topic.

In particular, it says the following:

Guideline: the integer returned by GetHashCode should never change

Ideally, the hash code of a mutable object should be computed from only fields which cannot mutate, and therefore the hash value of an object is the same for its entire lifetime.

However, this is only an ideal-situation guideline; the actual rule is:

Rule: the integer returned by GetHashCode must never change while the object is contained in a data structure that depends on the hash code remaining stable

It is permissible, though dangerous, to make an object whose hash code value can mutate as the fields of the object mutate. If you have such an object and you put it in a hash table then the code which mutates the object and the code which maintains the hash table are required to have some agreed-upon protocol that ensures that the object is not mutated while it is in the hash table. What that protocol looks like is up to you.

If an object's hash code can mutate while it is in the hash table then clearly the Contains method stops working. You put the object in bucket #5, you mutate it, and when you ask the set whether it contains the mutated object, it looks in bucket #74 and doesn't find it.

Remember, objects can be put into hash tables in ways that you didn't expect. A lot of the LINQ sequence operators use hash tables internally. Don't go dangerously mutating objects while enumerating a LINQ query that returns them!
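
The guideline quoted above (compute the hash only from fields that cannot mutate) could look roughly like the following. This is only a sketch under my own assumptions: `Entry`, its `Id` key, and `StableHashDemo` are hypothetical and do not appear in the question.

```csharp
using System;
using System.Collections.Immutable;

public class Entry
{
    // Hypothetical immutable key: set once in the constructor.
    public int Id { get; }

    // Mutable data that takes no part in hashing or equality.
    public string Label { get; set; }

    public Entry(int id, string label) { Id = id; Label = label; }

    public override int GetHashCode() => Id;
    public override bool Equals(object obj) => obj is Entry other && other.Id == Id;
}

public static class StableHashDemo
{
    public static int CountAfterMutateThenRemove()
    {
        var entry = new Entry(1, "B1");
        var set = ImmutableHashSet.Create(entry);
        entry.Label += "Change...";  // hash code is unchanged
        set = set.Remove(entry);     // the entry is still found
        return set.Count;            // 0
    }
}
```

With this design, mutating Label while the object sits in the set is harmless, because neither GetHashCode nor Equals looks at it.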

EDIT: BTW, your post and your subsequent edit are a perfect example of why you should always post a complete, reproducible code sample of your problem from the beginning, instead of trying to filter out what you feel is irrelevant information. Pretty much anyone looking at your post an hour ago could have given you the correct answer in a split second had they had all the relevant information to begin with.

Community
sstan