114

I have method with HashSet parameter. And I need to do case-insensitive Contains within it:

public void DoSomething(HashSet<string> set, string item)
{
    var x = set.Contains(item);
    ... 
}

Is it any way to make existing HashSet case-insensitive (do not create new one)?

I'm looking for solution with best perfomance.

Edit

Contains can be called multiple times. So IEnumerable extensions are not acceptable for me due to lower perfomance than native HashSet Contains method.

Solution

Since, answer to my question is NO, it is impossible, I've created and used following method:

public HashSet<string> EnsureCaseInsensitive(HashSet<string> set)
{
    return set.Comparer == StringComparer.OrdinalIgnoreCase
           ? set
           : new HashSet<string>(set, StringComparer.OrdinalIgnoreCase);
}
Uwe Keim
  • 39,551
  • 56
  • 175
  • 291
wishmaster
  • 1,469
  • 3
  • 13
  • 20
  • 7
    You'll probably have to create a new one... – It'sNotALie. Jun 04 '13 at 16:02
  • Possible duplicate: http://stackoverflow.com/questions/2667635/how-to-use-hashsetstring-contains-method-in-case-insensitive-mode (see user414076's answer) – Esoteric Screen Name Jun 04 '13 at 16:04
  • 1
    You need to decide up front whether or not the `HashSet` considers case by supplying a comparer. However, it's worth considering that the set {"A","a"} will only contain one item with a case-insensitive comparer. – spender Jun 04 '13 at 16:08

7 Answers7

184

The HashSet<T> constructor has an overload that lets you pass in a custom IEqualityComparer<string>. There are a few of these defined for you already in the static StringComparer class, a few of which ignore case. For example:

var set = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
set.Add("john");
Debug.Assert(set.Contains("JohN"));

You'll have to make this change at the time of constructing the HashSet<T>. Once one exists, you can't change the IEqualityComparer<T> it's using.


Just so you know, by default (if you don't pass in any IEqualityComparer<T> to the HashSet<T> constructor), it uses EqualityComparer<T>.Default instead.


Edit

The question appears to have changed after I posted my answer. If you have to do a case insensitive search in an existing case sensitive HashSet<string>, you will have to do a linear search:

set.Any(s => string.Equals(s, item, StringComparison.OrdinalIgnoreCase));

There's no way around this.

Community
  • 1
  • 1
Timothy Shields
  • 75,459
  • 18
  • 120
  • 173
  • If you are doing a single lookup - this is worse than just looping across the hashset – Dave Bish Jun 04 '13 at 16:10
  • @DaveBish I believe the OP changed his question to say "do not create new one" after I had answered... (edits very soon after posting don't actually count as edits). - If the OP has to do this with *an existing* `HashSet`, then of course he will have to do a linear time search. – Timothy Shields Jun 04 '13 at 16:14
  • 1
    That's not what I'm saying. If he is only doing one lookup against the hashset - creating a new one is more costly than a linear scan. (Op didn't specify) – Dave Bish Jun 04 '13 at 16:20
  • 3
    @DaveBish That's exactly why I edited my answer to include the LINQ linear scan. :) – Timothy Shields Jun 04 '13 at 16:23
  • 2
    Here is an alternative, but I would prefer **a clearer LINQ solution above**. You can use `Enumerable.Contains(this IEnumerable source, TSource value, IEqualityComparer comparer)` like this: `set.Contains(item, StringComparison.OrdinalIgnoreCase)`. It will perform generally the same linear search, though Resharper will generate a "Possibly unintended linear search in set" warning. – Corio Feb 14 '19 at 15:19
  • Out of curiosity, wouldn't using `new HashSet(StringComparer.OrdinalIgnoreCase)` result in the possibility that you could have two items with different case be in the set? Comparison is only done after jumping to the bucket where the hash points to. Therefore, wouldn't 'a' and 'A' potentially point to different buckets and therefore not discover that there is already another item that is the same and then add it to the set? – Adrian Jun 02 '22 at 18:44
11

You can not magically make case-sensetive HashSet (or Dictionary) to behave in case-insensitive way.

You have to recreate one inside your function if you can not rely on incoming HashSet to be case-insensitive.

Most compact code - use constructor from existing set:

var insensitive = new HashSet<string>(
   set, StringComparer.InvariantCultureIgnoreCase);

Note that copying HashSet is as expensive as walking through all items, so if your function does just on search it would be cheaper (O(n)) to iterate through all items. If your function called multiple times to make single case-insensitive search you should try to pass proper HashSet to it instead.

Quppa
  • 1,841
  • 20
  • 18
Alexei Levenkov
  • 98,904
  • 14
  • 127
  • 179
4

Assuming you've got this extension method:

public static HashSet<T> ToHashSet<T>(this IEnumerable<T> source)
{
    return new HashSet<T>(source);
}

You can just use this:

set = set.Select(n => n.ToLowerInvariant()).ToHashSet();

Or, you could just do this:

set = new HashSet(set, StringComparer.OrdinalIgnoreCase); 
//or InvariantCultureIgnoreCase or CurrentCultureIgnoreCase
It'sNotALie.
  • 22,289
  • 12
  • 68
  • 103
  • 1
    If you are doing a single lookup - this is worse than just looping across the hashset – Dave Bish Jun 04 '13 at 16:05
  • It would grab a lot of memory and do a lot of hash calculations, then throw all that work away after one lookup. Looping over the whole hash set and doing case-insensitive comparisons runs in constant memory and doesn't have to calculate hashes. Both need to touch the entirety of `set` in any case. –  Jun 04 '13 at 16:09
  • Because making a new hashset will at-least have to loop over the whole thing! – Dave Bish Jun 04 '13 at 16:10
  • @DaveBish The most upvoted answer also happens to do that... it needs to reconstruct it as well... – It'sNotALie. Jun 04 '13 at 16:10
  • I posted on that one, too :) – Dave Bish Jun 04 '13 at 16:11
4

The HashSet is designed to quickly find elements as per its hashing function and equality comparator. What you are asking for is really to find an element matching "some other" condition. Imagine that you have a Set<Person> objects that uses only Person.Name for comparison and you need to find an element with some given value of Person.Age.

The point is you need to iterate over the contents of the set to find the matching elements. If you are going to be doing this often you might create a different Set, in you case using a case-insensitive comparator but then you would have to make sure that this shadow set is in sync with the original.

The answers so far are essentially variations of the above, I thought to add this to clarify the fundamental issue.

Miserable Variable
  • 28,432
  • 15
  • 72
  • 133
2

The constructor of HashSet can take an alternative IEqualityComparer that can override how equality is determined. See the list of constructors here.

The class StringComparer contains a bunch of static instances of IEqualityComparers for strings. Particularly, you're probably interested in StringComparer.OrdinalIgnoreCase. Here is the documentation of StringComparer.

Note that another constructor takes in an IEnumerable, so you can construct a new HashSet from your old one, but with the IEqualityComparer.

So, all together, you want to convert your HashSet as follows:

var myNewHashSet = new HashSet(myOldHashSet, StringComparer.OrdinalIgnoreCase);
Ben Reich
  • 16,222
  • 2
  • 38
  • 59
-1

If you want to leave the original, case-sensitive version in place, you could just query it with linq with case insensitivity:

var contains = set.Any(a => a.Equals(item, StringComparison.InvariantCultureIgnoreCase));
eouw0o83hf
  • 9,438
  • 5
  • 53
  • 75
-2

You can now use

set.Contains(item, StringComparer.OrdinalIgnoreCase);

without needing to re-create you HashSet

Elanis
  • 184
  • 3
  • 12