143

Is there a collection in C# that will not let you add duplicate items to it? For example, with the silly class of

public class Customer {
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Address { get; set; }

    public override int GetHashCode() {
        return (FirstName + LastName + Address).GetHashCode();
    }

    public override bool Equals(object obj) {
        Customer C = obj as Customer;
        return C != null && String.Equals(this.FirstName, C.FirstName) && String.Equals(this.LastName, C.LastName) && String.Equals(this.Address, C.Address);
    }
}

The following code will (obviously) throw an exception:

Customer Adam = new Customer { Address = "A", FirstName = "Adam", LastName = "" };
Customer AdamDup = new Customer { Address = "A", FirstName = "Adam", LastName = "" };

Dictionary<Customer, bool> CustomerHash = new Dictionary<Customer, bool>();
CustomerHash.Add(Adam, true);
CustomerHash.Add(AdamDup, true);

But is there a class that will similarly guarantee uniqueness, but without KeyValuePairs? I thought HashSet<T> would do that, but having read the docs it seems that class is just a set implementation (go figure).

nawfal
  • 70,104
  • 56
  • 326
  • 368
Adam Rackis
  • 82,527
  • 56
  • 270
  • 393
  • 4
    I don't understand your problem with `HashSet`. MSDN says "The HashSet class provides high-performance set operations. A set is a collection that contains no duplicate elements, and whose elements are in no particular order." – Daniel Hilgarth Mar 01 '11 at 17:11
  • 5
    Can you explain more why `HashSet` is insufficient? – JaredPar Mar 01 '11 at 17:11
  • @mootinator: The `Dictionary` class *does not* guarantee any kind of order. – LukeH Mar 01 '11 at 17:23
  • 3
    I guess he just wants to throw an exception when you try to add an existing value... To do this, just check the bool value returned from `HashSet.Add` method and throw when `false`... – digEmAll Mar 01 '11 at 17:24
  • Customer is a seriously flawed example. – H H Mar 01 '11 at 17:53
  • @Henk Holterman - can you elaborate what's flawed about this example? As I understand it, the main requirements for overriding GetHashCode and Equals is that 2 object *must* have the same hash if they're equal, and that the hash function should be evenly distributed. Assuming a random distribution of names, what's wrong with this silly implementation? – Adam Rackis Mar 01 '11 at 18:55
  • 2
    It's also strongly recommende to only overload those for _immutable_ types only. A mutable Customer would usually fair better with the default Reference-equality. – H H Mar 01 '11 at 18:59
  • Yes, you're right (obviously). If you add the customer to the HashSet, and then change it's properties, bad things will happen. Thanks again. – Adam Rackis Mar 01 '11 at 19:02

8 Answers8

263

HashSet<T> is what you're looking for. From MSDN (emphasis added):

The HashSet<T> class provides high-performance set operations. A set is a collection that contains no duplicate elements, and whose elements are in no particular order.

Note that the HashSet<T>.Add(T item) method returns a bool -- true if the item was added to the collection; false if the item was already present.

Donut
  • 110,061
  • 20
  • 134
  • 146
  • 16
    T item in this case should implement IEquatable interface. If class does not inherit this interface, HashSet adds duplicate elements. – Rudolf Dvoracek Oct 15 '18 at 11:04
  • 6
    Or instead of the item implementing `IEquatable`, you can pass a (custom) implementation of `EqualityComparer` instance to the `HashSet` constructor. – Sipke Schoorstra Jun 20 '19 at 19:52
18

How about just an extension method on HashSet?

public static void AddOrThrow<T>(this HashSet<T> hash, T item)
{
    if (!hash.Add(item))
        throw new ValueExistingException();
}
Jonathon Reinhart
  • 132,704
  • 33
  • 254
  • 328
15

From the HashSet<T> page on MSDN:

The HashSet(Of T) class provides high-performance set operations. A set is a collection that contains no duplicate elements, and whose elements are in no particular order.

(emphasis mine)

Oded
  • 489,969
  • 99
  • 883
  • 1,009
5

If all you need is to ensure uniqueness of elements, then HashSet is what you need.

What do you mean when you say "just a set implementation"? A set is (by definition) a collection of unique elements that doesn't save element order.

Lloyd
  • 1,324
  • 7
  • 10
  • You're completely right; the question was kind of stupid. Basically, I was looking for something that would throw an exception when a duplicate was added (like Dictionary), but as already mentioned, HashSet returns false on a duplicate add. +1, thank you. – Adam Rackis Mar 01 '11 at 18:59
3

You can try HashSet<T>

Cheng Chen
  • 42,509
  • 16
  • 113
  • 174
3

Just to add my 2 cents...

if you need a ValueExistingException-throwing HashSet<T> you can also create your collection easily:

public class ThrowingHashSet<T> : ICollection<T>
{
    private HashSet<T> innerHash = new HashSet<T>();

    public void Add(T item)
    {
        if (!innerHash.Add(item))
            throw new ValueExistingException();
    }

    public void Clear()
    {
        innerHash.Clear();
    }

    public bool Contains(T item)
    {
        return innerHash.Contains(item);
    }

    public void CopyTo(T[] array, int arrayIndex)
    {
        innerHash.CopyTo(array, arrayIndex);
    }

    public int Count
    {
        get { return innerHash.Count; }
    }

    public bool IsReadOnly
    {
        get { return false; }
    }

    public bool Remove(T item)
    {
        return innerHash.Remove(item);
    }

    public IEnumerator<T> GetEnumerator()
    {
        return innerHash.GetEnumerator();
    }

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return this.GetEnumerator();
    }
}

this can be useful for example if you need it in many places...

digEmAll
  • 56,430
  • 9
  • 115
  • 140
0

You may look into something kind of Unique List as follows

public class UniqueList<T>
{
    public List<T> List
    {
        get;
        private set;
    }
    List<T> _internalList;

    public static UniqueList<T> NewList
    {
        get
        {
            return new UniqueList<T>();
        }
    }

    private UniqueList()
    {            
        _internalList = new List<T>();
        List = new List<T>();
    }

    public void Add(T value)
    {
        List.Clear();
        _internalList.Add(value);
        List.AddRange(_internalList.Distinct());
        //return List;
    }

    public void Add(params T[] values)
    {
        List.Clear();
        _internalList.AddRange(values);
        List.AddRange(_internalList.Distinct());
       // return List;
    }

    public bool Has(T value)
    {
        return List.Contains(value);
    }
}

and you can use it like follows

var uniquelist = UniqueList<string>.NewList;
uniquelist.Add("abc","def","ghi","jkl","mno");
uniquelist.Add("abc","jkl");
var _myList = uniquelist.List;

will only return "abc","def","ghi","jkl","mno" always even when duplicates are added to it

Vinod Srivastav
  • 3,644
  • 1
  • 27
  • 40
0

As an overall check different methods here are 4 ways to check if the collection has not any duplicates:

public static bool LinqAny<T>(IEnumerable<T> enumerable)
{
    HashSet<T> set = new();

    return enumerable.Any(element => !set.Add(element));
}

public static bool LinqAll<T>(IEnumerable<T> enumerable)
{
    HashSet<T> set = new();

    return !enumerable.All(set.Add);
}

public static bool LinqDistinct<T>(IEnumerable<T> enumerable)
{
    return enumerable.Distinct().Count() != enumerable.Count();
}

public static bool ToHashSet<T>(IEnumerable<T> enumerable)
{
    return enumerable.ToHashSet().Count != enumerable.Count();
}
Majid Shahabfar
  • 4,010
  • 2
  • 28
  • 36