
I have two lists, each containing two objects of a class called TestData:

TestData td1_a = new TestData("td1", "val11_a", null, "val13_a");
TestData td1_b = new TestData("td1", "val11_b", "val12_b", null);

TestData td2_a = new TestData("td2", "val21_a", "val22_a", null);
TestData td2_b = new TestData("td2", "val21_b", null, "val23_b");

List<TestData> list_a = new List<TestData>() {td1_a, td2_a};
List<TestData> list_b = new List<TestData>() {td1_b, td2_b};

TestData is defined as

public class TestData {
    public string DataName;
    public string Value1;
    public string Value2;
    public string Value3;
    public TestData(string name, string val1, string val2, string val3) {
        DataName = name;
        Value1 = val1;
        Value2 = val2;
        Value3 = val3;
    }
}

Note that the two lists each contain two TestData objects whose DataName values are "td1" and "td2".

How can I merge the two lists, in a clean way, into a single list containing two TestData objects, where objects sharing the same DataName are merged and each non-null value from the later object overrides the corresponding value of the earlier one?

So if I do mergeList(list_a, list_b), the result would be a list whose members are:

TestData td1_merged = new TestData("td1", "val11_b", "val12_b", "val13_a");
TestData td2_merged = new TestData("td2", "val21_b", "val22_a", "val23_b");

That is, _b replaces _a whenever possible.

And if I reverse the order, mergeList(list_b, list_a), the result would be a list whose members are:

TestData td1_merged = new TestData("td1", "val11_a", "val12_b", "val13_a");
TestData td2_merged = new TestData("td2", "val21_a", "val22_a", "val23_b");

That is, _a replaces _b instead.

At this moment, this is the best I can do with LINQ Aggregate:

    private List<TestData> mergeList(List<TestData> list_1, List<TestData> list_2) {
        return list_1.Concat(list_2) //combining list_1 and list_2 depends on the given sequence
            .GroupBy(td => td.DataName) //making groups based on DataName
            .Select(g => g.Aggregate(g.First(), (a, b) => { //merge the values of the elements
                if (b.Value1 != null) //tedious way of giving the conditions!
                    a.Value1 = b.Value1;
                if (b.Value2 != null)
                    a.Value2 = b.Value2;
                if (b.Value3 != null)
                    a.Value3 = b.Value3;
                return a;
            })).ToList();
    }

It works fine, except that to apply the override-when-not-null condition I have to write it once for each field. Is there a cleaner way to do it?
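One way to avoid repeating the null check per field is a small reflection helper that walks every public field and copies only the non-null values. This is just a sketch, not the accepted approach below; it assumes, as in TestData above, that all merge-relevant members are public instance fields, and the name MergeInto is my own:

```csharp
public static class MergeHelper
{
    // Copies every non-null public field of 'source' onto 'target'.
    // Assumption: the fields to merge are public instance fields, as in TestData.
    public static T MergeInto<T>(T target, T source)
    {
        foreach (var field in typeof(T).GetFields())
        {
            var newValue = field.GetValue(source);
            if (newValue != null)
                field.SetValue(target, newValue);
        }
        return target;
    }
}
```

With that, the Aggregate call collapses to `g.Aggregate(g.First(), MergeHelper.MergeInto)`, and it keeps working when fields are added to the class. Alternatively, if you prefer to stay explicit per field, the null-coalescing operator at least shortens each condition to one line: `a.Value1 = b.Value1 ?? a.Value1;`.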

Edited:

I am doing this because I ran into a problem where an object can be partially defined by separate developers in different files. My job is to create the objects defined across those files, without duplicating them, in a smart way: if an object is defined in more than one file, the additional definitions from a later file override the earlier ones, but not wholesale (whatever was defined in an earlier file remains wherever there is no update).

For example, in one of the files it is defined:

Chip_ResetSource.bits =[hw_pin=0, power_on=1, missing_clock=2, watchdog=3, software_force=4, comparator=5, convert_start_0=6].

And elsewhere, it is also defined:

Chip_ResetSource =type=uint8;policy=read;

And somewhere else,

Chip_ResetSource =address=2500;unit=count;formula=1;max=255;min=0;

Then I just need to combine all of them. But if in another (later) file there is additional info about Chip_ResetSource:

Chip_ResetSource =address=2501;

Then, while all other info about Chip_ResetSource must remain as set by the third definition (Chip_ResetSource =address=2500;unit=count;formula=1;max=255;min=0;), its address property must change from 2500 to 2501 because of the fourth definition.

So, given that problem, it would be great if I could have one method that reads all the given properties at instantiation, and another method that cleanly merges in the data whenever another file is read.

Ian
  • So you have many different classes you would like to merge this way, but they are all strongly typed and don't share the same interface? As a side note, instead of merging "in place", it would make more sense to create a new merged instance and keep the original objects immutable (that's how LINQ is supposed to operate). – vgru Dec 14 '15 at 08:10
  • I wonder if using Automapper here would be useful http://automapper.org/ – Jakub Holovsky Dec 14 '15 at 08:12
  • @Groo yes, kind of, and I feel the way I merge them is a bit tedious. The number of elements other than the name in my given example is just three. But in reality, it is > 10. And there are multiple classes like that. So, yes, it is as you say. – Ian Dec 14 '15 at 08:13
  • Why do you want to merge the two lists? If you tell us, we can probably suggest something to avoid this altogether. – Nathan Cooper Dec 14 '15 at 08:23
  • @Nathan Sure, I would very much appreciate your help. I am not sure if comment section is enough to give proper explanation for it though. I have a set .properties files, each written by separate developers and put in certain folders under the same shared folder. The line in the one .properties file look like this: Chip_ResetSource.bits =[hw_pin=0, power_on=1, missing_clock=2, watchdog=3, software_force=4, comparator=5, convert_start_0=6]. This however, is not the only placed where the variable is defined. In other file it is also defined: Chip_ResetSource =type=uint8;policy=read; – Ian Dec 14 '15 at 08:30
  • [continued] my job is to read all those files and to create proper objects accordingly but not to create duplicate objects. But since an object is defined in more than one file... this is where the problem started. Because I have to read the file one by one, and if I read certain **new** object name (in this case being Chip_ResetSource) I have to create the object first, regardless if the info given to me isn't complete to create such object. But if the additional info given on that object in the subsequent file, the later must override the earlier "in a smart" way. That is the situation. – Ian Dec 14 '15 at 08:36
  • @Ian: are you sure these need to be strongly typed? Why don't you import them into property bags, i.e. dictionaries? It would make the whole thing much more flexible to work with. – vgru Dec 14 '15 at 08:48
  • [continued] I understand that there is a way such as searching in the list where you first created the object if such object name is already exists. But even so, you will still need to input the property value one by one... So, if we can just read the files and create objects based on single file, and then later whatever duplicate can just be "smartly" merged, it would be easier to work on. – Ian Dec 14 '15 at 08:50
  • @Groo mind to explain or to give example on the "property bags" i.e. dictionaries? as I am not sure how to do it that way... – Ian Dec 14 '15 at 08:52
  • Ian, all this extra information, could you please edit it into the question so that the formatting will help readability? Thanks. – Richard Irons Dec 14 '15 at 08:55
  • @RichardIrons Ok, give me a minute, I will add the information... – Ian Dec 14 '15 at 08:56
  • Ok, I have added the additional information. – Ian Dec 14 '15 at 09:03

1 Answer


One approach to simplify your problem would be to use a Dictionary instead of multiple strongly typed classes.

In that case, you would simply add values to the dictionary, and it would update (overwrite) any existing keys with new values:

static void Apply(IDictionary<string, string> properties, string input)
{
    // split by semicolon
    var props = input.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);

    // add/merge each key-value pair into the dictionary
    foreach (var t in props)
    {
        var tokens = t.Trim().Split('=');
        var key = tokens[0];
        var value = tokens[1];

        // this will add a new value, or update the existing one
        properties[key] = value;
    }
}

This would store each value as a separate key-value pair, and you would use it like this:

var properties = new Dictionary<string, string>();

Apply(properties, "type=uint8;policy=read;");
Apply(properties, "address=2500;unit=count;formula=1;max=255;min=0;");
Apply(properties, "address=2501;");

// dump the contents to screen to see what we've got now
Console.WriteLine(string.Join(";", properties.Select(x => $"{x.Key}={x.Value}")));

Regarding your .bits example, it's not clear whether you would treat .bits as a single string property and overwrite it completely in case it's found in a different file, or update its child properties like it's a nested object.

In the latter case, a simple approach would be to store the child properties in exactly the same way, but prefix their keys with "bits.", which would be functionally equivalent to:

Apply(properties, "bits.hw_pin=0; bits.power_on=1; bits.missing_clock=2;");
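For instance, a bracketed line like `Chip_ResetSource.bits =[hw_pin=0, power_on=1, ...]` could be flattened into those prefixed pairs before being handed to Apply. The helper below is a hypothetical sketch (the name FlattenBits is mine, and the exact bracket syntax of your .properties files may differ):

```csharp
using System;
using System.Linq;

static string FlattenBits(string prefix, string input)
{
    // e.g. input "Chip_ResetSource.bits =[hw_pin=0, power_on=1]" with prefix "bits"
    //  ->  "bits.hw_pin=0;bits.power_on=1;"
    var inner = input.Substring(input.IndexOf('[') + 1).TrimEnd('.', ']', ' ');
    var pairs = inner
        .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
        .Select(p => prefix + "." + p.Trim());
    return string.Join(";", pairs) + ";";
}
```

Then `Apply(properties, FlattenBits("bits", line));` merges the nested values with the same add-or-overwrite semantics as the flat properties.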
vgru
  • Thanks, it is a lot cleaner to store things now! What's more, I can still convert the dictionary into its proper class (with typed fields) after reading all the .properties files, instead of instantiating it as soon as I read a **new** object name. This is really handy! :) – Ian Dec 15 '15 at 01:36
  • @Ian: Great, that's cool. Also, if you actually need to keep these strongly typed classes around, it might be a good idea to automate populating their properties using reflection (something similar to [what's described here](http://stackoverflow.com/questions/1089123/setting-a-property-by-reflection-with-a-string-value)), iterating through each object's properties using `Type.GetProperties()`. This prevents you from having to hard code all these property names and having to update this code whenever a new property is added. – vgru Dec 15 '15 at 08:42