2

I am trying to create an elegant and extensible way of querying a dictionary which maps an enum to a set of strings.

So I have this class SearchFragments that has the dictionary in it. I then want a method wherein consumers of this class can simply ask "HasAny" and, this is the bit where I am struggling, simply pass in some query like expression and get the boolean answer back.

public class SearchFragments
{
    private readonly IDictionary<SearchFragmentEnum, IEnumerable<string>> _fragments;

    public SearchFragments()
    {
        _fragments = new Dictionary<SearchFragmentEnum, IEnumerable<string>>();
    }

    public bool HasAny(IEnumerable<SearchFragmentEnum> of)
    {
        int has = 0;
        _fragments.ForEach(x => of.ForEach(y => has += x.Key == y ? 1 : 0));
        return has >= 1;
    }
}

The problem with the way this currently is, is that consumers of this class now have to construct an IEnumerable<SearchFragmentEnum> which can be quite messy.

What I am looking for is that the consuming code will be able to write something along the lines of:

searchFragments.HasAny(SearchFragmentEnum.Name, SearchFragmentEnum.PhoneNumber)

But where that argument list can vary in size (without me having to write method overloads in the SearchFragments class for every possible combination (such that if new values are added to the SearchFragmentEnum at a future date I won't have to update the class.

Thomas Cook
  • 4,371
  • 2
  • 25
  • 42

3 Answers3

5

You can use params[]

public bool HasAny(params SearchFragmentEnum[] of)
{ ...

Sidenote: you know that LIN(Q) queries should just query a source and never cause any side-effects? But your query does unnecessarily increment the integer:

_fragments.ForEach(x => of.ForEach(y => has += x.Key == y ? 1 : 0));

Instead use this (which is also more efficient and more readable):

return _fragments.Keys.Intersect(of).Any();

An even more efficient alternative to this is Sergey's idea:

return of?.Any(_fragments.ContainsKey) == true; 
Tim Schmelter
  • 450,073
  • 74
  • 686
  • 939
  • Thankyou, I was not aware of that, makes sense I guess it's the same kind of logical principle as the single responsibility from SOLID. I mean what I'd writtten compiles, but you raise a valid point. I will apply what you've said here :-) – Thomas Cook Jun 23 '17 at 13:48
  • Am I right in thinking that intersect is analogous (at least conceptually) to SQL inner join? By this I mean it does a boolean intersection of both collections? – Thomas Cook Jun 23 '17 at 13:53
  • 1
    @ThomasCook: `Intersect` is a set method that builds the intersection of two sequences(by ignoring duplicates). But since it is deferred executed it doesn't need to build the whole intersection. The `Any` at the end will stop as soon as one was found. If you wanted to know how many intersect you could append `Count` instead of `Any`. But then all must be evaluated. – Tim Schmelter Jun 23 '17 at 13:54
  • @ThomasCook: if you want to compare with sql methods, it's similar to the T-SQL method [`INTERSECT`](https://learn.microsoft.com/en-us/sql/t-sql/language-elements/set-operators-except-and-intersect-transact-sql) – Tim Schmelter Jun 23 '17 at 13:58
  • I would recommend against using `_fragments.Keys` - that can be quite slow for ConcurrentDictionary due to locking. https://github.com/dotnet/corefx/blob/master/src/System.Collections.Concurrent/src/System/Collections/Concurrent/ConcurrentDictionary.cs – mjwills Jun 23 '17 at 14:02
  • 1
    Intersect is uneccessary. it is in O(N+M). This is a dictionary. You can just simply iterate the input array and check for containment. – adjan Jun 23 '17 at 14:05
  • @adjan: you haven't understood how `Intersect` and LINQ's deferred execution works – Tim Schmelter Jun 23 '17 at 14:07
  • I would presume that `Intersect` works by generating a Hashset behind the scenes (https://stackoverflow.com/a/14527633/34092). I'd be surprised if that was faster than `public bool HasAny(params SearchFragmentEnum[] of) { return of.Any(_fragments.ContainsKey); } ` since that just enumerates `of` once and then does a small number of (cheap) hash lookups. I could be wrong though. – mjwills Jun 23 '17 at 14:13
  • @mjwills: you really care about 1ms more or less? That's micro optimization at it's "best". I have edited my answer to adapt Sergeys approach. It's really nice, but not because it might be more efficient but because it's still clear and readable. – Tim Schmelter Jun 23 '17 at 14:18
  • 1
    Have a read of what the `Intersect` method does. https://github.com/Microsoft/referencesource/blob/master/System.Core/System/Linq/Enumerable.cs - in short it takes the second parameter and converts it to a set and then iterates over the first parameter and does hash lookups. This is an odd solution since your first parameter is **already** a `Dictionary` - it can already do hash lookups! @adjan 's earlier comment was correct. You are right that since it is deferred it doesn't need to build the whole intersection - but it does need to take the hit to (unnecessarily) build the set upfront. – mjwills Jun 23 '17 at 14:29
  • @mjwills: it takes the second(which is the `params[]`, so probably one or two parameters not much more, otherwise the sequence overload would have been used). Then it enumerates the `KeyCollection`(not the dictionary as you've said) and looks if a key was in this little set. If so it is yielded. Due to the `Any` it will stop after the first found key. – Tim Schmelter Jun 23 '17 at 14:54
  • Correct @TimSchelter - that is what it does. Apologies for not mentioning that explicitly. – mjwills Jun 23 '17 at 22:03
3

For variable sized arguments in c# you use the params keyword: public int HasAny(params SearchFragmentEnum[] of)

The .Net API usually offers a couple of overloads of this for performance reasons; the parameters passed are copied into a new array. Explicitely providing overloads for the most common cases avoids this.

public int HasAny(SearchfragmentEnum of1)
public int HasAny(SearchFragmentEnum of1, SearchFragmentEnum of2)
etc.

Instead of using params you could also consider marking your enum with the [Flags] attribute. Parameters could than be passed like HasAny(SearchFragmentEnum.Name | SearchFragmentEnum.PhoneNumber. Examples abundant on StackOverflow (e.g. Using a bitmask in C#)

brijber
  • 711
  • 5
  • 17
3

Use the params keyword to allow a varying number of arguments. Further, you can simplify your code by looping over the smaller of array. Also, you are using a dictionary that has O(1) key check, so it is uneccessary to have an inner loop:

public bool HasAny(params SearchFragmentEnum[] of)
{
    foreach(var o in of) {
        if (this._fragments.ContainsKey(o))
            return true;
    }   
    return false;
}

or shorter with LINQ

public bool HasAny(params SearchFragmentEnum[] of) {
    return of?.Any(_fragments.ContainsKey) ?? false;
} 
adjan
  • 13,371
  • 2
  • 31
  • 48