(Full code available at: https://dotnetfiddle.net/tdKNgH)

I have two lists that are associated by ParentName and I would like to join them in a specific way.

class Parent
{
    public string ParentName { get; set; }
    public IEnumerable<string> ChildNames { get; set; }
}

class Child
{
    public string ParentName { get; set; }
    public string ChildName { get; set; }
}

var parents = new List<Parent>()
{
    new Parent() {ParentName = "Lee"},
    new Parent() {ParentName = "Bob"},
    new Parent() {ParentName = "Tom"}
};

var children = new List<Child>()
{
    new Child() {ParentName = "Lee", ChildName = "A"},
    new Child() {ParentName = "Tom", ChildName = "B"},
    new Child() {ParentName = "Tom", ChildName = "C"}
};

I'm using a foreach loop to join, and it works, but is there a more succinct way to do it?

foreach (var parent in parents)
{
    var p = parent; // to avoid foreach closure side-effects
    p.ChildNames = children.Where(c => c.ParentName == p.ParentName)
                           .Select(c => c.ChildName);
}

Here's what the resulting parents list would look like:

Parent Children
------ --------
Lee    A 
Bob    (empty) 
Tom    B,C
Lee Grissom
  • You may want to consider using a Dictionary as in http://stackoverflow.com/questions/2101069/c-sharp-dictionary-one-key-many-values – NoChance Sep 27 '14 at 00:01
  • +1 for the dictionary approach (yup, that's you, Emmad); it makes the code more self-evident. But the code that you have is actually quite OK. I'd even argue that it's a heck of a lot safer than other "more elegant" solutions out there. – code4life Sep 28 '14 at 21:31
  • You can change the `foreach` to a `parents.Select...`: `parents.Select (p => new Parent { ParentName = p.ParentName, ChildNames = children.Where (c => c.ParentName == p.ParentName).Select (c => c.ChildName) });`. – keenthinker Sep 28 '14 at 21:51
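
The dictionary approach suggested in the comments could look roughly like the sketch below. It is only an illustration, not code from the question: it assumes C# 9 top-level statements, uses tuples as a stand-in for the Child class, and the name childrenByParent is made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in data: tuples replace the Child class from the question (assumption).
var children = new List<(string ParentName, string ChildName)>
{
    ("Lee", "A"),
    ("Tom", "B"),
    ("Tom", "C")
};
var parentNames = new[] { "Lee", "Bob", "Tom" };

// One key, many values: collect child names under their parent's name.
var childrenByParent = new Dictionary<string, List<string>>();
foreach (var child in children)
{
    if (!childrenByParent.TryGetValue(child.ParentName, out var names))
    {
        names = new List<string>();
        childrenByParent[child.ParentName] = names;
    }
    names.Add(child.ChildName);
}

foreach (var parentName in parentNames)
{
    // Parents without children have no entry, so fall back to an empty list.
    var names = childrenByParent.TryGetValue(parentName, out var found)
        ? found
        : new List<string>();
    Console.WriteLine("{0}: {1}", parentName,
        names.Count == 0 ? "(empty)" : string.Join(",", names));
}
```

Unlike a lookup, a plain dictionary has no entry for a parent without children, so the missing-key case ("Bob") has to be handled explicitly.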

5 Answers

You can add an extension method for enumerables:

public static void Each<T>(this IEnumerable<T> source, Action<T> action)
{
    if (action == null)
        return;
    foreach (T obj in source)
        action(obj);
}

And then do:

parents.Each(p => p.ChildNames = children.Where(c => c.ParentName == p.ParentName)
                                         .Select(c => c.ChildName));
Candide

You could do a group join. LINQ isn't meant to update, though. So I'm not sure whether this would actually get you anywhere useful.

IEnumerable<Parent> parents = ...;

var parentsWithChildren = parents.GroupJoin(children,
                                            p => p.ParentName,
                                            c => c.ParentName,
                                            (a, b) => new
                                                      {
                                                          Parent = a,
                                                          ChildNames = b.Select(x => x.ChildName)
                                                      });

foreach (var v in parentsWithChildren)
{
    v.Parent.ChildNames = v.ChildNames;
}

This would certainly help if all you were given were parent names and children, rather than full Parent objects: you could group join the names to the children and create Parent instances where I create an anonymous type ((a, b) => new { ... }). But I'm assuming this is just an example and that your Parent objects would realistically hold more than a name, so the approach above seems like your best bet.

Matthew Haugen
  • +1. Yeah, that was the only other option I could think of too. I have it included in my DotNetFiddle sample using alternate linqy syntax, but it's pretty much identical to your code (you're fast!). :) If no-one else comes up with a better answer, I'll mark yours. – Lee Grissom Sep 27 '14 at 00:24

Consider naming the property Parent.Name instead of Parent.ParentName (a parent's parent?). Child has the same problem:

class Parent
{
    public string Name { get; set; }
    public IEnumerable<string> ChildrenNames { get; set; }
}

class Child
{
    public string ParentName { get; set; }
    public string Name { get; set; }
}

You can avoid foreach completely by creating a parentNames array first:

var parentNames = new[] { "Lee", "Bob", "Tom" };
var allChildren = new List<Child>()
{
    new Child() {ParentName = "Lee", Name = "A"},
    new Child() {ParentName = "Tom", Name = "B"},
    new Child() {ParentName = "Tom", Name = "C"}
};

That way, the parents are constructed entirely by LINQ without any side effects (no variables are updated), and the query stays very simple:

var parents =
    from parentName in parentNames
    join child in allChildren on parentName equals child.ParentName into children
    select new Parent { Name = parentName, ChildrenNames = children.Select(c => c.Name) };
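
One thing to keep in mind is that a query like this is deferred: it runs again, and builds new objects, every time the result is enumerated. The sketch below shows one way to materialize it once with ToList(). It is an illustration under assumptions, not part of the original answer: it uses C# 9 top-level statements and tuples as stand-ins for the Parent and Child classes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var parentNames = new[] { "Lee", "Bob", "Tom" };

// Stand-in data: tuples replace the Child class (assumption).
var allChildren = new List<(string ParentName, string Name)>
{
    ("Lee", "A"),
    ("Tom", "B"),
    ("Tom", "C")
};

// Same group join as above; both ToList() calls force the query to run once
// so later enumerations reuse the stored results.
var parents =
    (from parentName in parentNames
     join child in allChildren on parentName equals child.ParentName into children
     select (Name: parentName, ChildrenNames: children.Select(c => c.Name).ToList()))
    .ToList();

foreach (var p in parents)
{
    var joined = string.Join(",", p.ChildrenNames);
    Console.WriteLine("{0}: {1}", p.Name, joined.Length == 0 ? "(empty)" : joined);
}
```

The group join keeps parents in their original order and gives a parent without children ("Bob") an empty list rather than dropping it.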
Ken Hung
  • +1. Thanks Ken. Your solution is called a Group Join, which is pretty much the only other option I could think of too. @MatthewHaugen came up with the same idea (using alternate syntax). – Lee Grissom Sep 27 '14 at 01:27

Given that LINQ is based on functional principles, side effects are generally a big no-no (and also the reason why there's no foreach method).

I therefore suggest the following solution:

var parents = new List<Parent>()
{
    new Parent() { ParentName = "Lee" },
    new Parent() { ParentName = "Bob" },
    new Parent() { ParentName = "Tom" }
};

var children = new List<Child>()
{
    new Child() { ParentName = "Lee", ChildName = "A" },
    new Child() { ParentName = "Tom", ChildName = "B" },
    new Child() { ParentName = "Tom", ChildName = "C" }
};

var parentsWithChildren = parents.Select(x => new Parent 
{ 
    ParentName = x.ParentName, 
    ChildNames = children
        .Where(c => c.ParentName == x.ParentName)
        .Select(c => c.ChildName) 
});

foreach (var parent in parentsWithChildren)
{
    var childNamesConcatenated = string.Join(",", parent.ChildNames);

    var childNames = string.IsNullOrWhiteSpace(childNamesConcatenated) 
        ? "(empty)" : childNamesConcatenated;

    Console.WriteLine("Parent = {0}, Children = {1}", parent.ParentName, childNames);
}

The solution above does not modify the Parent objects in the parents collection by setting their ChildNames. Instead, it creates a new set of Parent objects with their respective ChildNames.

ebb
  • "also the reason why there is no `foreach` method" - I beg to differ. There's no `.ForEach` method at the `IEnumerable` level. But even then, if you PLINQ it, you can call `ForAll`. – code4life Sep 28 '14 at 21:29
  • @code4life, Please see http://blogs.msdn.com/b/ericlippert/archive/2009/05/18/foreach-vs-foreach.aspx – ebb Sep 29 '14 at 06:43
  • so you're saying that `myCollection.Select(x=>something(x)).ToList().AsParallel().ForAll(...)` doesn't exist? – code4life Sep 29 '14 at 14:19
  • @code4life, Not at all. `PLINQ` _needs_ to have the `ForAll` method in order to iterate through the collection in parallel, which is an exception (note that I wrote `side effects are GENERALLY a big no-no`). The link I provided simply confirms what my answer says: that `LINQ` is built around functional principles, and therefore a `ForEach` method is not available, and shouldn't be. – ebb Sep 29 '14 at 16:52
  • Immutability is a good design pattern and I appreciate your emphasis on it. – Lee Grissom Sep 30 '14 at 17:44

You can use ToLookup for best performance with a little memory penalty:

 var clu = children.ToLookup(x => x.ParentName, x => x.ChildName);
 parents.ForEach(p => p.ChildNames = clu[p.ParentName]);
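
This works for parents without children because indexing a lookup with a missing key returns an empty sequence instead of throwing. A small sketch of that behavior, assuming C# 9 top-level statements and tuples as a stand-in for the Child class (neither is from the answer):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in data: tuples replace the Child class (assumption).
var children = new List<(string ParentName, string ChildName)>
{
    ("Lee", "A"),
    ("Tom", "B"),
    ("Tom", "C")
};

// Build the lookup once, in a single pass over children.
var clu = children.ToLookup(x => x.ParentName, x => x.ChildName);

foreach (var parentName in new[] { "Lee", "Bob", "Tom" })
{
    // "Bob" is not a key, so clu["Bob"] is simply an empty sequence.
    var names = string.Join(",", clu[parentName]);
    Console.WriteLine("{0}: {1}", parentName, names.Length == 0 ? "(empty)" : names);
}
```

The memory cost mentioned above comes from the lookup holding all child names grouped by key; in exchange, each parent is resolved by key instead of by re-scanning the whole children list.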
Lee Grissom
brz
  • +1. Ah I see now, yes I like this. Immutability purists will hate it, but the key here is not the Foreach method, but rather using the "ToLookup" to perform a succinct GroupJoin and I agree with the performance/memory tradeoff. – Lee Grissom Sep 30 '14 at 17:53