6

I have wrote the following code:

IEnumerable<string> blackListCountriesCodes = 
pair.Criterion.CountriesExceptions.Select(countryItem => countryItem.CountryCode);

IEnumerable<string> whiteListCountriesCodes = 
pair.Criterion.Countries.Select(countryItem => countryItem.CountryCode);

return (!blackListCountriesCodes.Contains(Consts.ALL.ToString()) &&
        !blackListCountriesCodes.Contains(country) &&
        (whiteListCountriesCodes.Contains(Consts.ALL.ToString()) ||
        whiteListCountriesCodes.Contains(country)));

resharper shows me a warning: Possible duplicate enumeration of IEnumerable

What does this mean? Why is this a warning?

AakashM
  • 62,551
  • 17
  • 151
  • 186
Elad Benda
  • 35,076
  • 87
  • 265
  • 471
  • http://confluence.jetbrains.net/display/ReSharper/Possible+multiple+enumeration+of+IEnumerable (which is where you will go if you choose "Why is ReSharper suggesting this?" from the ReSharper inspection menu) – AakashM Dec 20 '12 at 15:38
  • Also this SO: http://stackoverflow.com/questions/8240844/handling-warning-for-possible-multiple-enumeration-of-ienumerable – Davin Tryon Dec 20 '12 at 15:38
  • Here you have a trade off - unless the amount of data is huge (which I doubt - AFAIK countries in world < 1000) you can materialize `blackListCountriesCodes` and `whiteListCountriesCodes` before checking the predicates – StuartLC Dec 20 '12 at 15:41
  • @AakashM I want to get it clear: The code "remembers" it did this query already and will not re-do it? doesn't this cause not up-to-date data? – Elad Benda Dec 21 '12 at 17:45

3 Answers3

9

LINQ Queries defer their execution until you do something with the results. In this case, calling Contains() twice on the same collection could cause the results to be enumerated twice which, depending on the query, could cause performance issues.

You can solve this by simply tacking a ToList() call on the end of your query which will force execution of the query and store the results a single time.

Justin Niessner
  • 242,243
  • 40
  • 408
  • 536
  • I still don't get it. i thought I should prefer lazy execution, no? If I apply `ToList()` and execute the first query, I lose the benefits of lazy execution. How can eager eaxecution solve performance issue? It sounds to be the other way arround. no? – Elad Benda Dec 21 '12 at 17:37
  • @EladBenda - If you wait for lazy execution, and you're querying a database, the query will be executed needlessly at the database twice. If you call ToList, the query may be executed immediately, but the results will be stored and the query executed only once. – Justin Niessner Dec 21 '12 at 17:38
  • The code "remembers" it did this query already and will not re-do it? doesn't this cause not up-to-date data? – Elad Benda Dec 21 '12 at 17:45
1

This means that your code may enumerate the blackListCountriesCodes and whiteListCountriesCodes several times. Since LINQ uses deferred evaluation, this may cause slowness, especially when the pair has lots of data, and Where clauses are complex (it does not look like any of that applies in your situation, though).

You can eliminate the warning (and the alleged slowness) by "materializing" the enumerations into lists, like this:

var blackListCountriesCodes = 
    pair.Criterion.CountriesExceptions.Select(countryItem => countryItem.CountryCode).ToList();

var whiteListCountriesCodes = 
    pair.Criterion.Countries.Select(countryItem => countryItem.CountryCode).ToList();
Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523
  • how come defer execution is slower? It sounds to be the other way arround. no? If I force execution- the code "remembers" it did this query already and will not re-do it? doesn't this cause not up-to-date data? – Elad Benda Dec 22 '12 at 18:08
  • 1
    @EladBenda That depends a lot on where the data for the deferred execution comes from: if it comes from a list of 1000000 items of which 999900 are filtered out, then eagerly selecting 100 items into a separate list would be a significant performance improvement. Same goes for deferred read from RDBMS. – Sergey Kalinichenko Dec 22 '12 at 19:26
1

It means that you might calculate the content of the IEnumerable twice (or more). If this is expensive, like calling a database this is bad for performance. If the underlying source is a List<T> and there is a simple projection or something else non-expensive this is not a problem.

You could of course rewrite the expression to use the enumerables only once.

!blackListCountriesCodes.Contains(Consts.ALL.ToString())
    && !blackListCountriesCodes.Contains(country)

could be rewritten as

!blackListCountriesCodes
    .Where(blcc => blcc == Consts.ALL.ToString() || blcc == country).Any()
Albin Sunnanbo
  • 46,430
  • 8
  • 69
  • 108
  • I still don't get it. i thought I should prefer lazy execution, no? If I apply `ToList()` and execute the first query, I lose the benefits of lazy execution. How can eager eaxecution solve performance issue? It sounds to be the other way arround. no? – Elad Benda Dec 21 '12 at 17:38
  • If you only want to use the result 0-1 times and maybe only use parts of the result you should use Lazy evaluation, but if want to use the result many times you should not use lazy, at least not lazy where you recalculate the content. You could still use `Lazy` to instantiate on first use. – Albin Sunnanbo Dec 22 '12 at 15:23
  • you mean the The code "remembers" it did this query already and will not re-do it? doesn't this cause not up-to-date data? – Elad Benda Dec 22 '12 at 18:07