0

Many times over the years, I have needed code that does:

Find a value in a dictionary; if it is not there, add it to the dictionary (and return that new value).

For example:

    // Only one per account, so loading can be efficiently managed.
    // <AccountID, LCProfilePicture>
    public readonly static Dictionary<int, LCProfilePicture> All = new Dictionary<int, LCProfilePicture>();

    public static LCProfilePicture GetOrCreate( int accountID )
    {
        LCProfilePicture pic;
        if (!All.TryGetValue( accountID, out pic )) {
            pic = new LCProfilePicture( accountID );
            All[ accountID ] = pic;
        }

        return pic;
    }

Instead of having to write that boilerplate each time, I'd like to have a generic method that will do the work. How to do so in c#?

So far, I have thought of three ways to proceed:

  1. Wrap the construction that will be needed if the dictionary does not already contain an object for the key, into an Action (or Func?). Then call that if necessary.

  2. Require TValue to have a constructor of that form, and then somehow describe that requirement as a constraint on the generic method.

  3. Define some interface that TValue has to satisfy, and somehow use that interface to write the generic method.


I think I know how to do #1, so will submit an answer doing so, as soon as I work out the details. UPDATE: I have now worked that out, and posted that as an answer.

But maybe #2 is possible? Then I could just add that constraint, and be done.

  • Pro: easier to use (don't have to wrap the construction into an Action or Func).
  • Con: Not as flexible (if have a TValue that does not have such a constructor, can't use this generic method).

(#3 seems less promising; I mention it for completeness.)

ToolmakerSteve
  • 18,547
  • 14
  • 94
  • 196
  • 1
    If you already have a working solution, then you shouldn't be asking a question here. You could consider asking on Code Review if you want your working code reviewed. – Servy Feb 03 '17 at 20:07
  • 3
    @Servy it says in question the solution won't compile – aw04 Feb 03 '17 at 20:09
  • @aw04 It says right in the question that they have a working version of one of the solutions that they thought of in response to the problem that they describe above it. – Servy Feb 03 '17 at 20:10
  • @ToolmakerSteve Adding emphasis to the fact that you once had code that didn't compile before you got to your working solution doesn't change the fact that you have a working solution. – Servy Feb 03 '17 at 20:12
  • @ToolmakerSteve You didn't read what I said correctly. – Servy Feb 03 '17 at 20:13
  • @Servy oh i see what you're referring to, fair point – aw04 Feb 03 '17 at 20:13
  • @ToolmakerSteve So you don't in fact have a working solution to your problem, even though you said you have a working solution to your problem, and have posted a working solution to your problem? – Servy Feb 03 '17 at 20:14
  • @ToolmakerSteve you do say you likely know how to do the first option – aw04 Feb 03 '17 at 20:14
  • @ToolmakerSteve Why are you being so obtuse? If you hadn't even attempted to solve a simple problem that you know how to solve then *you weren't ready to post an SO question yet*. SO questions are expected to be well researched, not questions that you could answer if you bothered to spend a few minutes trying to solve, but where you aren't interested in taking the time to do so, making it a very low quality question for the site, whether your answer it yourself or not. – Servy Feb 03 '17 at 20:25
  • 2
    https://stackoverflow.blog/2011/07/its-ok-to-ask-and-answer-your-own-questions/ - according to SO, "To be crystal clear, it is not merely OK to ask and answer your own question, it is explicitly encouraged." – Dave Smash Feb 03 '17 at 21:00
  • @ElementalPete And yet those questions need to be quality quesitons. It's not just magically okay to ask low quality questions without doing your research just because you post an answer to them. *Good quality* self-answered quesitons are encouraged. Low quality self-answered questions are not. – Servy Feb 03 '17 at 21:15
  • @ToolmakerSteve Regardless of how simple it is or isn't (it took you all of 5 minutes to write the answer from scratch, so it can't be *that* complicated) it's readily accessible with a simple web search. Simply putting your question into google gives you the answer. You're not adding to the sum of programming knowledge by re-posting it, you're just repeating already readily available information. That's not useful. – Servy Feb 03 '17 at 21:17
  • @Servy - If he hadn't asked the question, he wouldn't have figured out that reflection is a way better solution than the one he had in mind, because it requires no interface and no complex creator method. :) Based on question upvotes/downvotes, it seems like there are way worse questions on this board that would be much easier for you to pick apart. Either way, I'll bow out of the discussion and leave you guys to your flame war! – Dave Smash Feb 03 '17 at 21:22
  • @ElementalPete Reflection is an *awful* solution to this problem. It loses static type checking, is *dramatically* slower, and provides no indication to the caller of the method on how to use it appropriately. It has lots of disadvantages and no advantages. – Servy Feb 03 '17 at 21:27
  • Static type checking is lost when you pass the strongly typed objects to a generic method, and is regained to a certain extent by using reflection (which is why you can determine if the proper constructor exists with reflection, but not with a generic type.) It provides the same amount of indication on how to use it properly as any other method call. I listed a couple of advantages, and while taking 50 ms instead of 10 ms is indeed a dramatic slowdown (the main disadvantage), it's warranted and unnoticeable in many cases. – Dave Smash Feb 03 '17 at 21:33
  • 1
    @ElementalPete: Educated guess based on experience and not actually measuring this particular code, is that you aren't looking at 50ms vs 10ms, but 15ms vs .015ms That is, if the number of calls is small it doesn't matter, but for repeated calls, the reflection causes a HUGE penalty. Caching can and should be used to mitigate that. – Ben Voigt Feb 03 '17 at 21:35
  • @Servy - This is not an easy problem that can be trivially solved with an internet search - for me. I've been using generics, lambdas, all the pieces of this problem for years. Yet I've encountered this particular situation multiple times over the years, and could not quite grasp how to solve it. And yes, I did an internet search to try to find a similar question. Yesterday, I almost had the solution in hand, but did not have confidence I would grasp the final step... – ToolmakerSteve Feb 04 '17 at 18:24
  • @Servy - ... The reason I feel (very strongly) this is a "stackoverflow" question rather than a "code review" question, is that the essence of the question is "given generics, there ought to be a way to write this method once, rather than copy/paste/modify every time". By your definition of what is appropriate for code review, almost any question involving generics would have to be moved to code review - because *of course* I already could do it in a non-generic way. I stand by my view that this is a perfect fit to stackoverflow: a concise bit of knowledge, often needed, not easily found. – ToolmakerSteve Feb 04 '17 at 18:29
  • @ElementalPete Static type checking is *not* lost when you use generics. It's quite trivial to provide a solution that is entirely statically typed. Your solution has no static typing at all. There is no way for the compiler to know if the code will work in your case, there is only a runtime exception if you do it wrong, rather than *the code not compiling* if you do it wrong. In your case the caller just need to "know" that they need to have a constructor of a given type; if you, say, pass a delegate, the delegate tells you exactly what you need to provide, and won't compile without it. – Servy Feb 05 '17 at 00:57
  • @ToolmakerSteve But the problem *is* trivially solved with an internet search. It took you all of 5 minutes to solve the problem from the time you formed the question. You literally only had to try and you virtually instantly were able to solve the problem. Had you tried and failed, that may have been different. You simply *didn't bother to try* and then *trivially* solved the problem almost *instantly* the moment you *did* try. – Servy Feb 05 '17 at 01:00
  • @ToolmakerSteve You stated, in your question, that you knew *how to solve your problem generically* when you asked it. Your problem is not that you had a non-generic solution and didn't know how to solve it generically. Since you *already had your generic solution* you shouldn't have been asking a question. – Servy Feb 05 '17 at 01:01
  • Comments are not for extended discussion; this conversation has been [moved to chat](http://chat.stackoverflow.com/rooms/134908/discussion-on-question-by-toolmakersteve-convenience-method-to-find-or-add-value). – Bhargav Rao Feb 05 '17 at 16:22

4 Answers4

3

You can combine constraints of new() and an interface for setting the key, like this:

interface IWithKey<T> {
    public T Key { get; set; }
}
static class DictExtensions {
    public static TVal GetorCreate<TKey,TVal>(this IDictionary<TKey,TVal> d, TKey key) where TVal : new(), IWithKey<TKey> {
        TVal res;
        if (!d.TryGetValue(key, out res)) {
            res = new TVal();
            res.Key = key;
            d.Add(key, res);
        }
        return res;
    }
}

Since GetorCreate is an extension, you can use it as follows:

static LCProfilePicture GetOrCreatePic( int accountID ) {
    return All.GetOrCreateEntry(accountID);
}
Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523
  • This is a useful solution, in cases where a) TVal *does* need to have the key as a field. b) The programmer is willing and able to add the interface to the class definition. I really appreciate seeing how to do this, in case I ever need it. Thank you. However, my preference is to have a more general solution, that does not place those requirements on TVal. Upvoted, but not accepted as the (most general) answer. – ToolmakerSteve Feb 03 '17 at 21:25
1

I noticed in your example you have a static dictionary

// Only one per account, so loading can be efficiently managed.
// <AccountID, LCProfilePicture>
public readonly static Dictionary<int, LCProfilePicture> All = 
  new Dictionary<int, LCProfilePicture>();

My first reaction to that is, since it is static, are you going to need it to be thread safe. If the answer is yes, maybe, or even no, then the answer might be, don't write it yourself, let Microsoft do it.

System.Collections.Concurrent.ConcurrentDictionary<TKey, TValue>

Which so happens to have 2 built in functions

TValue GetOrAdd(TKey key, TValue value)
TValue GetOrAdd(TKey key, Func<TKey, TValue> func)

And all of that done in a thread-safe manner. The second one where the parameter is a Func is the maybe the answer you are looking for.

If you are set on simplifying the usage, I would argue against having the loading of the data be part of the TValue. That is mostly based on my own person preference to store POCO (Plain Old CLR Objects) as values is Dictionaries and not objects with State and Behavior.

I would instead, move the "loading/constructing/deserializing" behavior to another service and/or the Dictionary itself.

This example creates a base class that you inherit from

public abstract class SmartConcurrentDictionaryBase<TKey, TValue> : 
  System.Collections.Concurrent.ConcurrentDictionary<TKey, TValue>
{
  public TValue GetOrAdd(TKey key) { return GetOrAdd(key, LoadNewValue); }
  protected abstract TValue LoadNewValue(TKey key);
}

public class LCProfilePictureDictionary : SmartConcurrentDictionaryBase<int, LCProfilePicture>
{
  protected override LCProfilePicture(int accountID)
  { 
    return new LCProfilePicture(accountID);
  }
}

// use is
// var pic = All.GetOrAdd(accountID);

This example is more of a reusable Dictionary object and takes in a Func as a constructor parameter, but could easily be changed to include an Interface where one of the functions on the interface match the pattern.

public class SimpleConcurrentDictionary<TKey, TValue> :
  System.Collections.Concurrent.ConcurrentDictionary<TKey, TValue>
{
  private readonly Func<TKey, TValue> _loadFunc;
  public SimpleConcurrentDictionary(Func<TKey, TValue> loadFunc)
  {
    _loadFunc = loadFunc;
  }
  public TValue GetOrAdd(TKey key) { return GetOrAdd(key, _loadFunc); }
} 
Andre
  • 754
  • 1
  • 10
  • 16
  • `ConcurrentDictionary` is perfect! I did not realize there was a dictionary implementation that already had a method that solved this, `TValue GetOrAdd(TKey key, Func func)` – ToolmakerSteve Feb 05 '17 at 19:02
0

Solution #1 (most general):

public static TValue GetOrCreateEntry<TKey, TValue>( Dictionary<TKey, TValue> dict, TKey key, Func<TValue> creator )
{
    TValue value;
    if (!dict.TryGetValue( key, out value )) {
        value = creator();
        dict[ key ] = value;
    }

    return value;
}

Example usage:

static LCProfilePicture GetOrCreatePic( int accountID )
{
    return GetOrCreateEntry<int, LCProfilePicture>( All, accountID, () => new LCProfilePicture( accountID ) );
}

Solution #2 (for TValues that remember their key):

public static TValue GetOrCreateEntry<TKey, TValue>( Dictionary<TKey, TValue> dict, TKey key, Func<TKey, TValue> creator )
{
    TValue value;
    if (!dict.TryGetValue( key, out value )) {
        value = creator(key);
        dict[ key ] = value;
    }

    return value;
}

Example usage:

static LCProfilePicture GetOrCreatePic( int accountID )
{
    return GetOrCreateEntry<int, LCProfilePicture>( All, accountID, key => new LCProfilePicture( key ) );
}

Comparison of Solution #1 and Solution 2:

Solution #1 is more general - it can be used even for TValues that don't need to know about the key.

Solution #2 is cleaner style, for TValues that do retain a reference to the key.

Two reasons #2 is preferable, where appropriate:

  • Reason #1: Solution #1 has the possibility of abuse: consider the case where TValue has two constructors, a parameterless one, and one that takes key as a parameter. An inexperienced programmer might use Solution #1 like this:

static LCProfilePicture GetOrCreatePic( int accountID )
{
    // OOPS, programmer has not set the key field to "accountID".
    return GetOrCreateEntry<int, LCProfilePicture>( All, accountID, () => new LCProfilePicture() );
}

If the lead programmer / architect wants to avoid that possibility, omit Solution #1, and only offer #2. In that case, the attempted usage won't compile, because there is no matching constructor.

  • Reason #2: Using Solution #1 requires including a second copy of the key in the usage, if TValue needs to capture it. This unnecessarily encapsulates the key in the Func instance, and could lead to accidentally referring to a different key, e.g.:

//...
int keyA;
int keyB;
// OOPS, programmer referred to the wrong key the second time.
// Maybe copy/pasted code, and only changed it in the first location, not realizing it is used in two places.
var valA = GetOrCreateEntry<int, LCProfilePicture>( All, keyA, () => new LCProfilePicture( keyB) );  
enter code here
ToolmakerSteve
  • 18,547
  • 14
  • 94
  • 196
  • 1
    I added the part that lets you supply the key to the code that constructs the result. Passing it "on the side" results in capturing `accountId` unnecessarily. – Sergey Kalinichenko Feb 03 '17 at 20:14
  • @dasblinkenlight - interestingly, I had thought of doing it that way at first, but then changed to what I did, because I thought that might be unnecessarily restrictive. Suppose someone has a "creator" that doesn't want/need "key" as a parameter? Perhaps only the dictionary "knows" the association between the key and the value, and the value itself does not. Reverting your edit. – ToolmakerSteve Feb 03 '17 at 20:19
  • The flip side here is, of course, the need to repeat `accountId` twice in the code. You can always provide two overrides - one with `Func` and one with `Func`, and have C# compiler distinguish between them based on the kind of closure the programmer passes. – Sergey Kalinichenko Feb 03 '17 at 20:23
  • @dasblinkenlight - excellent point. I will edit the answer to include both versions. – ToolmakerSteve Feb 03 '17 at 21:26
  • 1
    I suggest `public static class ConstructFromKey { public static Func UsesKey; public static Func IgnoresKey; }`, and using that as a "default" when a one-argument overload is called. The first time through when the static fields are both null, reflection can be used to fill it in. – Ben Voigt Feb 03 '17 at 21:33
  • @BenVoigt - I'm not seeing when that would be useful. I see two types of TValue classes: those that don't use the key, so would only want Solution #1, and those that do use the key, so would only want Solution #2. Don't see when one would want a single definition that supported both forms. Unless there is such a situation, I consider it preferable to use the "least abstract" solution, which is to have two helper methods. Otherwise, it becomes rather indirect for the goal - more effort to comprehend. – ToolmakerSteve Feb 03 '17 at 21:53
  • @ToolmakerSteve: I'm just thinking that a single program might have multiple dictionaries, with multiple `TValue` types. So you may need to support both. On the other hand, it shouldn't be much extra work to have the cached delegate always accept the key, and if reflection found there's no constructor that uses it, generate a delegate that discards its parameter before calling the constructor. So yeah, the `IgnoresKey` field can be left out entirely. – Ben Voigt Feb 03 '17 at 22:07
  • So far I prefer this solution (with dasblinkenlight's suggestion added, and BenVoigt's optionally added if performance becomes an issue - but not my concern yet, so have not done so here) to the other answers. But I cannot accept my own answer for 48 hours. My apologies if you are reading this after 48 hours, and I have forgotten to go back and accept an answer. – ToolmakerSteve Feb 03 '17 at 22:59
0

System.Reflection has a ConstructorInfo object and a GetConstructor method that can be used for this purpose. ConstructorInfo.Invoke returns an object of the type that you used to create your ConstructorInfo. If you went the reflection route, it would look something like this (not tested, but should be close):

//using System.Reflection;
public static TValue GetOrCreateEntry<TKey, TValue>(Dictionary<TKey, TValue> dict, TKey key)
{
    TValue value;
    if (!dict.TryGetValue(key, out value))
    {
        // not in dictionary
        ConstructorInfo ctor = typeof(TValue).GetConstructor(new Type[] { typeof(TKey) });
        if (ctor != null)
        {
            // we have a constructor that matches the type you need
            value = (TValue)ctor.Invoke(new object[] { key });
            dict[key] = value;
            return value;
        }
        else
            throw new NotImplementedException(); // because the TValue type does not implement the constructor you anticipate
    }

    // we got it from dictionary, so just return it
    return value;
}
Dave Smash
  • 2,941
  • 1
  • 18
  • 38
  • It'd be much MUCH faster to store the reflection lookup result as a `Converter` delegate and just call the delegate on subsequent calls, instead of using reflection every single time. – Ben Voigt Feb 03 '17 at 21:27
  • @BenVoigt - I'm not familiar with that technique, but it sounds promising! I'll check it out next time I'm working with something like this. My experience with reflection is that the slowdown makes a big difference in a big loop, but is not noticeable if you're running a method like this once on a button click, for example. – Dave Smash Feb 03 '17 at 21:38
  • If you decide to look into that, the portion of [this answer by Marc Gravell](http://stackoverflow.com/a/1601674/103167) that deals with expression trees is the way to go. – Ben Voigt Feb 03 '17 at 21:43
  • @BenVoigt - This is an interesting approach that Pete proposes. Do you see any reason to consider this further, or does your suggested refinement to my post, accomplish the same end result? Or perhaps your suggestion on my answer essentially means to incorporate this reflection logic? – ToolmakerSteve Feb 03 '17 at 22:09
  • 1
    @Toolmaker: If caching is added, the reflection-based approach has the advantage of requiring no effort at the call site (and your "Reason #1" doesn't apply). Of course, it can be combined with your two methods via overloading. – Ben Voigt Feb 03 '17 at 22:11