6

Let's suppose I have the following piece of code:

IEnumerable<string> allKeys = _cache.Select(o => o.Key);
Parallel.ForEach(allKeys, key => _cache.Remove(key));

As you can see, I'm retrieving all the keys in _cache, storing them in my local variable allKeys, and then concurrently removing all the keys from _cache.

I would like however, to do it one single line. So what comes to mind is:

Parallel.ForEach(_cache.Select(o => o.Key), key => _cache.Remove(key));

But the statement _cache.Select(o => o.Key) would be called on each loop iteration, therefore retrieving different amount of elements each time (because I'm deleting them at the same time).

Is the latter line of code safe?

Does _cache.Select(o => o.Key) in loop statements, only get called once, and then each iteration uses the original result, or is it processed in every iteration step?

Matias Cicero
  • 25,439
  • 13
  • 82
  • 154
  • 1
    What type is _cache? – Paddy Jun 04 '15 at 13:32
  • 3
    What would be the advantage of doing this in a single line? – L-Four Jun 04 '15 at 13:32
  • 2
    _"But the statement _cache.Select(o => o.Key) would be called on each loop iteration"_ - are you sure? – CodeCaster Jun 04 '15 at 13:33
  • @Paddy `_cache` is of type `ObjectCache` [Documentation](https://msdn.microsoft.com/en-us/library/system.runtime.caching.objectcache%28v=vs.110%29.aspx) – Matias Cicero Jun 04 '15 at 13:33
  • 3
    Irrespective of parallelism, you can't mutate the enumerable while you are iterating it. You'll need to materialize `allKeys` before removing. – StuartLC Jun 04 '15 at 13:34
  • Basically no data structure you can come up with is going to support removing items in parallel. This code doesn't even *work*, so trying to get it into one line isn't going to be helpful. – Servy Jun 04 '15 at 13:35
  • @L-Three No relevant purpose really, just because I'd like to have it in one line, and I wanted to know if it could be done – Matias Cicero Jun 04 '15 at 13:35
  • @StuartLC > agree with you, and it will look cleaner and clearer. Assuming this, even the first example on two lines (which returns IEnumerable) is not fine – AFract Jun 04 '15 at 13:35
  • @Servy But `ObjectCache` is a *key-value* kind of HashMap. Each element is deleted accesing it by an unique key, not by an index. So paralelism is actually possible, right? – Matias Cicero Jun 04 '15 at 13:37
  • @MatiCicero No, that is not true at all. At best, it'll simply serialize all of the calls, preventing them from ever actually doing anything in parallel. A hash map is not inherently thread safe as a data structure though. – Servy Jun 04 '15 at 13:38

4 Answers4

4

First off, both codes are the same. There is no difference if you have a temporary variable or not.

Second: This code is flawed.

  1. LINQ uses deferred execution. In other words, while iterating allKeys, the underlying data - in your case _cache - is being iterated. In combination with the remove, this won't work.
  2. _cache most likely is a normal dictionary or something similar. In other words, it is not thread safe. UPDATE: According to a comment, it is of type ObjectCache and that type is indeed thread-safe. So this issue doesn't occur in your specific case.
Daniel Hilgarth
  • 171,043
  • 40
  • 335
  • 443
  • The first point will depend on the data structure. It's certainly *possible* (but unlikely) to have a data structure that would support that. The latter is the actual deal breaker. At best, the data structure could serialize the removals, keeping it safe; there's just about no way for it to actually work in parallel. – Servy Jun 04 '15 at 13:36
4

As you can see, I'm retrieving all the keys in _cache, storing them in my local variable allKeys

No, you don't. Due to something called deferred execution, all you store is the command to get all your keys. You need to materialize this command to actually do what you think you do:

var allKeys = _cache.Select(o => o.Key).ToList();

That said: Is your cache thread safe? Why does it not feature a Clear method? Getting all keys and removing it by using multi threading seems like a not-so-great-idea.

If you insist to have it all in one line, you could use PLINQ:

_cache.Select(o => o.Key).AsParallel().ForAll(key => _cache.Remove(key));

But again: this seems to be a bad idea.

nvoigt
  • 75,013
  • 26
  • 93
  • 142
1

Would it not be more efficient to Dispose of your existing _cache object and just recreate it, rather than removing every item individually?

Saves querying and looping...

Paddy
  • 33,309
  • 15
  • 79
  • 114
  • I thought of disposing it, but my cache is of type `MemoryCache`, so it's not recommended according to this post: [How do I clear a System.Runtime.Caching.MemoryCache?](http://stackoverflow.com/questions/8043381/how-do-i-clear-a-system-runtime-caching-memorycache) – Matias Cicero Jun 04 '15 at 13:40
  • OK, interesting, but possibly (as also referred in answers there), just setting _cache to be a new instance would suffice? – Paddy Jun 04 '15 at 14:02
0

First off, you can't modify a collection while iterating over it, which is what .Select<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, TResult> selector) does behind the scenes. So this line

Parallel.ForEach(_cache.Select(o => o.Key), key => _cache.Remove(key));

will simply not work.

You could try this

Parallel.ForEach(_cache.Select(o => o.Key).ToList(), key => _cache.Remove(key));

which will work on a copy of the keys. The ObjectCache type is thread safe, as is MemoryCache so you should be fine.

The only potential issue here is if this code is in a multithreaded application (like a Web application). Having multiple threads able to read/write/delete to/from the cache opens up a huge can of worms and makes the usage of locks to manage cache access mandatory.

Radu Porumb
  • 785
  • 5
  • 7