Loop over data object of strings construction of a unique distinct collection

Question

I don't know how, but for weeks now I have been using a HashSet<myObject> collection whose members are mainly strings, because I really thought it internally used a built-in approach, like a dictionary, to avoid duplicate items in non-KVP (single-column) data.

My scenario is:

HashSet<myHddFolderPaths> UniqColleciton = new HashSet<myHddFolderPaths>();
int countRounds = 0;
void addToCollection()
{
    for (int i = 0; i < UniqColleciton.Count; i++)
    {
        // add some items to UniqColleciton via Directory.GetDirectories();
    }
    if (countRounds++ < Limit)
        addToCollection();
}

This is a pattern for a directory walker I am building, and it is only an example of a scenario where recursing over the same data cannot be avoided. I don't recall where I read about it, but I thought that simply using a HashSet<T> would "just take care of business".

It hadn't occurred to me that it would allow duplicates, but in this project I put it to the test and, to my surprise, it did allow adding items that already exist. So my workaround is:

Dictionary<string, int> fiterAccesDeniedPaths = new Dictionary<string, int>();
Dictionary<string, int> fiterAccesiblePaths = new Dictionary<string, int>();

if (this.fiterAccesDeniedPaths.ContainsKey(e.Message)) continue;
if (this.fiterAccesiblePaths.ContainsKey(object.stringPathMember)) continue;
// otherwise: add to the filters, then UniqColleciton.Add(myHddFolderPaths);

Is there a better, more efficient approach for accomplishing this task?

public class FolderPath
{
    public string DriveL { get; set; }
    public string FolderLevel { get; set; }
    public string Path { get; set; }
    public int Fsize { get; set; }
}


    public class GenericUniqCollectionM<T> : HashSet<T>
    {
        public GenericUniqCollectionM():base()
        {

        }
    }

What is `myHddFolderPaths`? Your own class I guess, could you post the code. — Ivan Stoev, Nov 27 '15 at 11:01
@IvanStoev do i need to implement it on my object or on the collection, i will now add another code for the `MyUniqCollection` — Jbob Johan, Nov 27 '15 at 11:10
Never mind, see @Arie's answer. But if you need multiple unique constraints/indexes, the multi-dictionary approach is just fine, though then there is no need for a `HashSet`; you can use just a list. Of course all this should be encapsulated in a custom collection. — Ivan Stoev, Nov 27 '15 at 11:16

score 1 · Answer 1 · answered Nov 27 '15 at 11:05

You wanted your HashSet to "take care of business". HashSet does exactly that. But you first have to let it know what you consider a "duplicate" (or rather, when your objects should be considered equal). To do so, you should override both the Equals() and GetHashCode() methods on your myHddFolderPaths class; overriding GetHashCode() alone is not enough, because the set checks Equals() once two hashes match.

How does HashSet compare elements for equality?

Implementing GetHashCode correctly

Default implementation for Object.GetHashCode()
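
For comparison, a set of plain strings already behaves the way the question expected, because string overrides Equals and GetHashCode with value semantics. A minimal sketch (the class and method names are made up for illustration):

```csharp
using System;
using System.Collections.Generic;

public static class StringSetDemo
{
    // Adds the same path text twice, as two distinct string objects,
    // and returns how many entries the set ends up holding.
    public static int CountAfterDuplicateAdd()
    {
        var paths = new HashSet<string>();
        string first = @"C:\Temp";
        // Copy the characters into a new string so the two are different instances.
        string second = new string(first.ToCharArray());

        paths.Add(first);
        paths.Add(second); // rejected: equal by value to an existing element
        return paths.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountAfterDuplicateAdd()); // prints 1
    }
}
```

A custom class gets no such behavior for free, which is why the overrides are needed.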

Jakub Lortz · Accepted Answer · 2015-11-27T12:30:50.203

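A HashSet created with the parameterless constructor uses the default equality comparer, EqualityComparer<T>.Default. For a class like FolderPath that does not implement IEquatable<T>, that comparer calls FolderPath.Equals() to check equality and FolderPath.GetHashCode() to compute the hash.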

internal class ObjectEqualityComparer<T> : EqualityComparer<T>
{
    public override bool Equals(T x, T y)
    {
        if (x != null)
        {
            if (y != null) return x.Equals(y);
            return false;
        }
        if (y != null) return false;
        return true;
    }

    public override int GetHashCode(T obj)
    {
        if (obj == null) return 0;
        return obj.GetHashCode();
    }

    ...
}

You didn't override Equals and GetHashCode, so it will use the default implementation provided by object, checking reference equality.
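
The effect of that default can be reproduced in a few lines. This is only a sketch: FolderPath is trimmed to its Path property, and the demo class name is hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Trimmed copy of the question's class, with no Equals/GetHashCode overrides.
public class FolderPath
{
    public string Path { get; set; }
}

public static class ReferenceEqualityDemo
{
    // Adds two separate instances that carry the same Path.
    public static int CountAfterDuplicateAdd()
    {
        var set = new HashSet<FolderPath>();
        set.Add(new FolderPath { Path = @"C:\Temp" });
        set.Add(new FolderPath { Path = @"C:\Temp" }); // different reference, so it is accepted
        return set.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountAfterDuplicateAdd()); // prints 2
    }
}
```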

You have two options now. One is to override Equals and GetHashCode in FolderPath.

public class FolderPath
{
    ...

    public override bool Equals(object obj)
    {
        if (obj == null) return false;

        FolderPath other = obj as FolderPath;
        if (other == null) return false;

        //simple implementation, only compares Path
        return Path == other.Path;
    }

    public override int GetHashCode()
    {
        if (Path == null) return 0;
        return Path.GetHashCode();
    }
}
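
With those overrides in place, the set rejects a second object that has the same Path. A small usage sketch, again with FolderPath trimmed to Path and a hypothetical demo class:

```csharp
using System;
using System.Collections.Generic;

public class FolderPath
{
    public string Path { get; set; }

    // Two FolderPath objects are equal when their Path strings are equal.
    public override bool Equals(object obj)
    {
        FolderPath other = obj as FolderPath;
        return other != null && Path == other.Path;
    }

    // Must agree with Equals: equal paths give equal hash codes.
    public override int GetHashCode()
    {
        return Path == null ? 0 : Path.GetHashCode();
    }
}

public static class OverrideDemo
{
    // Returns true if the second Add of an equal object was rejected.
    public static bool SecondAddRejected()
    {
        var set = new HashSet<FolderPath>();
        bool first = set.Add(new FolderPath { Path = @"C:\Temp" });
        bool second = set.Add(new FolderPath { Path = @"C:\Temp" });
        return first && !second && set.Count == 1;
    }

    public static void Main()
    {
        Console.WriteLine(SecondAddRejected()); // prints True
    }
}
```

One caveat: Path has a public setter, and changing it after the object has been added can leave the set unable to find that element, since its stored hash no longer matches.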

The other one is to implement a custom IEqualityComparer<FolderPath>

public class FolderPathComparer : IEqualityComparer<FolderPath>
{
    public bool Equals(FolderPath x, FolderPath y)
    {
        if (x != null)
        {
            if (y != null) return x.Path == y.Path;
            return false;
        }
        if (y != null) return false;
        return true;
    }

    public int GetHashCode(FolderPath obj)
    {
        if (obj == null || obj.Path == null) return 0;
        return obj.Path.GetHashCode();
    }
}

and pass it to the HashSet constructor.

var set = new HashSet<FolderPath>(new FolderPathComparer());
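
The same comparer can also be forwarded through the question's GenericUniqCollectionM wrapper. The extra constructor below is an assumption added for illustration, not part of the original class, and the demo class name is hypothetical:

```csharp
using System;
using System.Collections.Generic;

public class FolderPath
{
    public string Path { get; set; }
}

public class FolderPathComparer : IEqualityComparer<FolderPath>
{
    public bool Equals(FolderPath x, FolderPath y)
    {
        if (ReferenceEquals(x, y)) return true;   // same instance, or both null
        if (x == null || y == null) return false; // exactly one is null
        return x.Path == y.Path;
    }

    public int GetHashCode(FolderPath obj)
    {
        if (obj == null || obj.Path == null) return 0;
        return obj.Path.GetHashCode();
    }
}

public class GenericUniqCollectionM<T> : HashSet<T>
{
    public GenericUniqCollectionM() : base() { }

    // Added for this sketch: hands the comparer to the HashSet<T> base class.
    public GenericUniqCollectionM(IEqualityComparer<T> comparer) : base(comparer) { }
}

public static class ComparerDemo
{
    public static int CountAfterDuplicateAdd()
    {
        var set = new GenericUniqCollectionM<FolderPath>(new FolderPathComparer());
        set.Add(new FolderPath { Path = @"C:\Temp" });
        set.Add(new FolderPath { Path = @"C:\Temp" }); // rejected by the comparer
        return set.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountAfterDuplicateAdd()); // prints 1
    }
}
```

The comparer route leaves FolderPath itself untouched, which may be preferable when different collections need different notions of equality.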

I have made an edit to my question. Could you please implement the important part of your code on `GenericUniqCollectionM` as a sample? — Jbob Johan, Nov 27 '15 at 11:15
so now on this implementation if i add to any of my classes > `MycustomObject : ObjectEqualityComparer` this should do the job ? — Jbob Johan, Nov 27 '15 at 11:55
I was so busy I forgot to thank you. It really solved my problem and replaced my awkward solution with dictionary filters. Thank you very much, Mr Lortz. — Jbob Johan, Nov 27 '15 at 23:18
@JbobJohan You're welcome. You accepted the answer - that's already considered a 'thank you'. — Jakub Lortz, Nov 27 '15 at 23:22
