8

I've been trying for a long time to find a "clean" pattern to handle a .SelectMany with anonymous types when you don't always want to return a result. My most common use case looks like this:

  1. We have a list of customers that I want to do reporting on.
  2. Each customer's data resides in a separate database, so I do a parallel .SelectMany
  3. In each lambda expression, I gather results for the customer toward the final report.
  4. If a particular customer should be skipped, I need to return an empty list.
  5. I whip these up often for quick reporting, so I'd prefer an anonymous type.

For example, the logic may look something like this:

//c is a customer
var context = GetContextForCustomer(c);
// look up some data, myData using the context connection
if (someCondition)
  return myData.Select(x => new { CustomerID = c, X1 = x.x1, X2 = x.x2 });
else
  return null;

This could be implemented as a foreach statement:

var results = new List<WhatType?>();
foreach (var c in customers) {
  var context = GetContextForCustomer(c);
  if (someCondition)
    results.AddRange(myData.Select(x => new { CustomerID = c, X1 = x.x1, X2 = x.x2 }));
}

Or it could be implemented with a .SelectMany that is pre-filtered with a .Where:

customers
  .Where(c => someCondition)
  .AsParallel()
  .SelectMany(c => {
     var context = GetContextForCustomer(c);
     return myData.Select(x => new { CustomerID = c, X1 = x.x1, X2 = x.x2 });
  })
  .ToList();

There are problems with both of these approaches. The foreach solution requires initializing a List to store the results, and you have to define the type. The .SelectMany with .Where is often impractical because the logic for someCondition is fairly complex and depends on some data lookups. So my ideal solution would look something like this:

customers
  .AsParallel()
  .SelectMany(c => {
     var context = GetContextForCustomer(c);
     if (someCondition)
       return myData.Select(x => new { CustomerID = c, X1 = x.x1, X2 = x.x2 });
     else
       continue?   return null?   return empty list?
  })
  .ToList();

What do I put in the else line to skip a return value? None of the solutions I can come up with work or are ideal:

  1. continue doesn't compile because it's not an active foreach loop
  2. return null causes an NRE
  3. return empty list requires me to initialize a list of anonymous type again.

Is there a way to accomplish the above that is clean, simple, and neat, and satisfies all my (picky) requirements?

mellamokb

5 Answers

2

You could return an empty IEnumerable<dynamic> by calling Enumerable.Empty<dynamic>(). Here's an example of the same general form as yours (without your customers and someCondition, because I don't know what they are):

new int[] { 1, 2, 3, 4 }
    .AsParallel()
    .SelectMany(i => {
        if (i % 2 == 0)
            return Enumerable.Repeat(new { i, squared = i * i }, i);
        else
            return Enumerable.Empty<dynamic>();
    })
    .ToList();

So, with your objects and someCondition, it would look like

customers
    .AsParallel()
    .SelectMany(c => {
        var context = GetContextForCustomer(c);
        if (someCondition)
            return myData.Select(x => new { CustomerID = c, X1 = x.x1, X2 = x.x2 });
        else
            return Enumerable.Empty<dynamic>();
    })
    .ToList();
Ben Allred
  • This will cause a lot of dynamic calls, and in turn, reflection. Performance-wise, this is not the best option. – Athari Nov 23 '13 at 05:45
  • 1
    Good point. But he also says he just needs this to generate some quick reports. – Ben Allred Nov 23 '13 at 05:47
  • I like this solution the best, because it's short and doesn't require any custom extension methods. I had tried `Enumerable.Empty()` but it couldn't determine the type without `<dynamic>`. Thanks!! – mellamokb Nov 23 '13 at 05:57
  • 3
    If you do this, realize that the eventual return type will be `IEnumerable<dynamic>` as well. This might work fine depending on your use case. – Eren Ersönmez Nov 23 '13 at 06:04
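
To illustrate the point raised in the comments, here is a minimal sketch built on the int array from the answer above. The class and method names (DynamicResultDemo, Build) are made up for the illustration. Declaring the return type as List<dynamic> shows what the compiler infers for the whole query, and the loop shows that every property access on the rows is then bound at run time rather than checked at compile time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, not part of the original answer.
public static class DynamicResultDemo
{
    public static List<dynamic> Build()
    {
        // One branch returns IEnumerable<anonymous type>, the other
        // IEnumerable<dynamic>, so the lambda's return type (and the
        // element type of the final list) is dynamic.
        return new int[] { 1, 2, 3, 4 }
            .AsParallel()
            .SelectMany(i => {
                if (i % 2 == 0)
                    return Enumerable.Repeat(new { i, squared = i * i }, i);
                else
                    return Enumerable.Empty<dynamic>();
            })
            .ToList();
    }

    public static void Main()
    {
        var results = Build();
        int total = 0;
        foreach (var row in results)
            total += row.squared;          // bound at run time; a typo here would compile

        Console.WriteLine(results.Count);  // 6 (two rows for i = 2, four rows for i = 4)
        Console.WriteLine(total);          // 72 (2 * 4 + 4 * 16)
    }
}
```

For a quick report this is usually acceptable, but you lose IntelliSense and compile-time checking on the rows.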
2

Without knowing what someCondition and myData look like...

Why don't you just Select and Where the contexts as well:

customers
.Select(c => GetContextForCustomer(c))
.Where(ctx => someCondition)
.SelectMany(ctx => 
    myData.Select(x => new { CustomerID = c, X1 = x.x1, X2 = x.x2 }));

EDIT: I just realized you need to carry both the customer and context further, so you can do this:

customers
.Select(c => new { Customer = c, Context = GetContextForCustomer(c) })
.Where(x => someCondition(x.Context))
.SelectMany(x => 
    myData.Select(d => new { CustomerID = x.Customer, X1 = d.x1, X2 = d.x2 }));
Eren Ersönmez
  • I realize my example doesn't do it justice. In reality, the logic behind `someCondition` and `myData` is 20-30 lines of code and rapidly evolving, so I like it best all together in the `.SelectMany`. – mellamokb Nov 23 '13 at 06:03
1

You can try the following:

customers
  .AsParallel()
  .SelectMany(c => {
     var context = GetContextForCustomer(c);
     if (someCondition)
       return myData.Select(x => new { CustomerID = c, X1 = x.x1, X2 = x.x2 });
     else
       return Enumerable.Empty<int>().Select(x => new { CustomerID = 0, X1 = "defValue", X2 = "defValue" });
  })
  .ToList();

Within an assembly, the compiler combines all anonymous types that have the same set of properties (the same names and types, in the same order) into one anonymous class. That's why both your Select and the one on Enumerable.Empty will return the same T.
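
As a quick check of that claim, here is a minimal sketch (AnonymousTypeDemo and its method names are made up for the illustration). It compares the runtime types of two anonymous objects with identical property lists, and confirms that the Enumerable.Empty<int>().Select(...) trick yields an empty sequence, because the selector is never invoked.

```csharp
using System;
using System.Linq;

// Hypothetical demo class, not part of the original answer.
public static class AnonymousTypeDemo
{
    public static bool SameType()
    {
        // Same property names, types, and order: one compiler-generated class.
        var real = new { CustomerID = 42, X1 = "a", X2 = "b" };
        var placeholder = new { CustomerID = 0, X1 = "defValue", X2 = "defValue" };
        return real.GetType() == placeholder.GetType();
    }

    public static int EmptyCount()
    {
        // The source is empty, so the selector never runs; the lambda exists
        // only to tell the compiler the element type.
        return Enumerable.Empty<int>()
            .Select(x => new { CustomerID = 0, X1 = "defValue", X2 = "defValue" })
            .Count();
    }

    public static void Main()
    {
        Console.WriteLine(SameType());   // True
        Console.WriteLine(EmptyCount()); // 0
    }
}
```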

MarcinJuraszek
  • This solution is basically in the same boat as the `foreach` and the `return` empty list - you have to dynamically construct a list of an anonymous type. I am aware of solutions like [this one](http://stackoverflow.com/questions/612689), for instance. It works, but now I have two property lists I have to update if I decide to add/remove a property in my on-the-fly report. – mellamokb Nov 23 '13 at 06:12
1

You can create your own variation of the SelectMany LINQ method that supports nulls:

public static class EnumerableExtensions
{
    public static IEnumerable<TResult> NullableSelectMany<TSource, TResult> (
        this IEnumerable<TSource> source,
        Func<TSource, IEnumerable<TResult>> selector)
    {
        if (source == null) 
            throw new ArgumentNullException("source");
        if (selector == null) 
            throw new ArgumentNullException("selector");
        foreach (TSource item in source) {
            IEnumerable<TResult> results = selector(item);
            if (results != null) {
                foreach (TResult result in results)
                    yield return result;
            }
        }
    }
}

Now you can return null in the selector lambda.

Athari
  • This is a nice approach. The other thing I had considered was `.Select(...).Where(x => x != null).SelectMany(x => x)`. Not as pretty, but doesn't require me to specify the anonymous type anywhere. – mellamokb Nov 23 '13 at 05:59
  • 1
    @mellamokb ReSharper suggested that to me actually. :) I also have many extension methods for `IEnumerable`, so I'd write something like `.Select(...).WhereNotNull().Flatten()` myself, which looks pretty enough for me. – Athari Nov 23 '13 at 06:05
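
For reference, the `.Select(...).Where(x => x != null).SelectMany(x => x)` variant mentioned in the comments could look roughly like the sketch below. The customers are stand-in ints and "even ID" stands in for someCondition; both are assumptions for the demo, as are the names SkipByNullDemo and CountRows. The lambda may return null because the compiler infers its return type from the other return statement.

```csharp
using System;
using System.Linq;

// Hypothetical demo class, not part of the original answer.
public static class SkipByNullDemo
{
    public static int CountRows(int[] customers)
    {
        var rows = customers
            .AsParallel()
            .Select(c => {
                // Stand-in for someCondition: keep customers with an even ID.
                if (c % 2 == 0)
                    return Enumerable.Range(1, 2)
                        .Select(x => new { CustomerID = c, X1 = x });
                return null;                 // skip this customer
            })
            .Where(seq => seq != null)       // drop the skipped customers
            .SelectMany(seq => seq)          // flatten what is left
            .ToList();

        return rows.Count;
    }

    public static void Main()
    {
        // Customers 2 and 4 pass the condition and contribute two rows each.
        Console.WriteLine(CountRows(new[] { 1, 2, 3, 4 })); // 4
    }
}
```

The anonymous type is written only once, and the result keeps its static type instead of becoming dynamic.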
0

The accepted answer returns dynamic. The cleanest approach would be to move the filtering logic into a Where, which makes the whole thing look better in a LINQ context. Since you specifically rule that out in the question, and I'm not a fan of delegates written over multiple lines in a LINQ call, I would try the following, though one can argue it's more hacky.

var results = new 
{ 
    customerID = default(int), //notice the casing of property names
    x1 = default(U), //whatever types they are
    x2 = default(V) 
}.GetEmptyListOfThisType();

foreach (var customerID in customers) {
  var context = GetContextForCustomer(customerID);
  if (someCondition)
    results.AddRange(myData.Select(x => new { customerID, x.x1, x.x2 }));
}

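// Extension methods must be declared in a non-nested static class.
public static class ListExtensions
{
    public static List<T> GetEmptyListOfThisType<T>(this T item)
    {
        return new List<T>();
    }
}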

Notice that the property names match the names of the other variables, so you don't have to write the property names a second time in the Select call.

nawfal