4

I have two lists

List1 has only two properties. I can't use a Dictionary since there might be duplicate keys. The combination of Property1 and Property2 is unique.

public class List1
{
    public string Property1 { get; internal set; }
    public string Property2 { get; internal set; }
}

public class List2
{
    public string Property1 { get; internal set; }
    public string Property2 { get; internal set; }
    public string Property3 { get; internal set; }
}

List<List1> mylist1 = new List<List1>() {
    new List1() {Property1="664",Property2="Ford" },
    new List1() {Property1="665",Property2="Ford" },
    new List1() {Property1="664",Property2="Toyota" },
};

List<List2> mylist2 = new List<List2>() {
    new List2() {Property1="664",Property2="Ford" ,Property3="USA"},
    new List2() {Property1="665",Property2="Ford" ,Property3="USA"},
    new List2() {Property1="664",Property2="Toyota" ,Property3="USA"},
    new List2() {Property1="666",Property2="Toyota" ,Property3="USA"},
};

I need to get the items in mylist1 that also appear in mylist2. The match should happen only on Property1 and Property2; Property3 in mylist2 can be ignored during the comparison.

Currently I use

var matchingCodes = mylist1.Where(l1 => mylist2.Any(l2 => (l2.Property1 == l1.Property1 && l2.Property2==l1.Property2))).ToList();

which works perfectly fine. But is there a better or faster way to do this?

I can change List1 to any other type, but not List2.

Gokul

4 Answers

3

You are trying to do set operations on your lists with LINQ. There are four LINQ methods that you can use to make the current code cleaner and more succinct. These operations are:

  • Except
  • Union
  • Intersect
  • Distinct

The one you are looking for is Intersect, which:

"Returns the set of values found to be identical in two separate collections." (Source)

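Finally, if you are always going to use those specific properties to detect equality and/or uniqueness, you will need to override Equals (together with GetHashCode, because Intersect compares elements through a hash-based set) for the List1 and/or List2 classes. Which class needs it depends on which list is on the left-hand side of the Intersect (the variable before the .) and which is on the right-hand side (the variable passed into the function). Keep in mind that Intersect expects both sequences to have the same element type, so with two unrelated classes you first have to project both lists to a common type.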

Here is a SO answer to how to override the Equals function if you do not know how to do so. Coincidentally, it also has an Intersect example.
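As a rough illustration of the Intersect idea, here is a minimal sketch that projects both lists to the same anonymous key type, so no Equals override is needed. The class IntersectSketch and the method CommonKeys are hypothetical names introduced for this example, and the setters are public only to keep the sample self-contained. Note that it returns the shared key pairs, not the original List1 objects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class List1
{
    public string Property1 { get; set; }
    public string Property2 { get; set; }
}

public class List2
{
    public string Property1 { get; set; }
    public string Property2 { get; set; }
    public string Property3 { get; set; }
}

public static class IntersectSketch
{
    // Hypothetical helper: returns the "Property1/Property2" pairs found in both lists.
    public static List<string> CommonKeys(List<List1> list1, List<List2> list2)
    {
        // Both projections produce the same anonymous type, which the compiler
        // equips with value-based Equals and GetHashCode.
        return list1.Select(l1 => new { l1.Property1, l1.Property2 })
            .Intersect(list2.Select(l2 => new { l2.Property1, l2.Property2 }))
            .Select(key => key.Property1 + "/" + key.Property2)
            .ToList();
    }

    public static void Main()
    {
        var mylist1 = new List<List1> {
            new List1 { Property1 = "664", Property2 = "Ford" },
            new List1 { Property1 = "665", Property2 = "Ford" },
            new List1 { Property1 = "664", Property2 = "Toyota" },
        };
        var mylist2 = new List<List2> {
            new List2 { Property1 = "664", Property2 = "Ford", Property3 = "USA" },
            new List2 { Property1 = "665", Property2 = "Ford", Property3 = "USA" },
            new List2 { Property1 = "664", Property2 = "Toyota", Property3 = "USA" },
            new List2 { Property1 = "666", Property2 = "Toyota", Property3 = "USA" },
        };

        // Prints the three pairs that occur in both lists.
        foreach (var key in CommonKeys(mylist1, mylist2))
            Console.WriteLine(key);
    }
}
```

If the original List1 objects are needed rather than the keys, the join-based answers are probably a better fit.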

Martin Hollstein
  • For the OP's case these would require a custom `IEqualityComparer` implementation though. The join solutions based on an anonymous type are much easier and more flexible. – Ivan Stoev Feb 17 '17 at 19:27
  • I was editing that in before I saw the comment. It should be there now, although I went the Equals route. Also, the question was whether there is a 'better' way to do it, which is highly opinionated, as this is easier to read in code than all the `join` answers. – Martin Hollstein Feb 17 '17 at 19:29
3

The easiest way to do it in LINQ that is also relatively fast, or at least faster than your approach, is to use Join or GroupJoin like so:

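List<List1> matchingCodes = mylist1.GroupJoin(mylist2,

               l1 => new { l1.Property1, l1.Property2 },// Define what the key from List1 looks like
               l2 => new { l2.Property1, l2.Property2 },// Define what the key from List2 looks like

               // GroupJoin also emits List1 items that have no match (with an empty group),
               // so record whether anything matched and filter on that afterwards
               (l1Item, l2Items) => new { l1Item, HasMatch = l2Items.Any() })
               .Where(x => x.HasMatch)
               .Select(x => x.l1Item)
               .ToList();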

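Simplified, this builds a hash-based lookup from mylist2, keyed on the two properties, and then probes it once for every item in mylist1.

GroupJoin works better here as it gives you each item from List1 exactly once, together with all matching items from List2. Items without any match still appear, paired with an empty group, so they have to be filtered out.

A regular Join would return the same item from List1 once per match in List2.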

See also Enumerable.GroupJoin (C# Reference)
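The difference between the two operators can be seen in a small sketch. To keep it short it uses plain strings as keys instead of the two-property anonymous key; JoinVersusGroupJoin, ViaJoin and ViaGroupJoin are hypothetical names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinVersusGroupJoin
{
    // Join emits the outer item once per matching inner item.
    public static List<string> ViaJoin(IEnumerable<string> outer, IEnumerable<string> inner)
    {
        return outer.Join(inner, o => o, i => i, (o, i) => o).ToList();
    }

    // GroupJoin emits every outer item once with its group of matches;
    // outer items with an empty group are filtered out here.
    public static List<string> ViaGroupJoin(IEnumerable<string> outer, IEnumerable<string> inner)
    {
        return outer.GroupJoin(inner, o => o, i => i,
                               (o, matches) => new { Item = o, HasMatch = matches.Any() })
                    .Where(x => x.HasMatch)
                    .Select(x => x.Item)
                    .ToList();
    }

    public static void Main()
    {
        var outer = new[] { "664-Ford", "665-Ford" };
        var inner = new[] { "664-Ford", "664-Ford", "666-Toyota" };

        Console.WriteLine(string.Join(", ", ViaJoin(outer, inner)));      // 664-Ford, 664-Ford
        Console.WriteLine(string.Join(", ", ViaGroupJoin(outer, inner))); // 664-Ford
    }
}
```

When the inner list is known to contain each key at most once, both variants return the same items, and the plain Join is the shorter of the two.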

Note that this is equivalent to @octavioccl's answer. This example also assumes that the property names are the same in both classes. If they aren't, you have to modify the key selectors a bit, like so:

l1 => new { A=l1.Foo, B=l1.Bar},
l2 => new { A=l2.Herp, B=l2.Derp},
CSharpie
  • I did a quick benchmark with my method and the method suggested above. My method - 54 ms, suggested - 112 ms – Gokul Feb 17 '17 at 19:45
  • @Gokul that really depends on the number of items in those lists. If the number is small, the join probably has more overhead from creating those hash tables before doing its work. My answer also assumed your lists contain more than a few items, otherwise it wouldn't make much sense to ask for a more performant solution. – CSharpie Feb 17 '17 at 19:47
  • mylist1 may contain 1,000 - 10,000 items or more, but mylist2 will have 1 - 10 items. – Gokul Feb 17 '17 at 19:50
  • @Gokul then there isn't much performance to gain, since the number of iterations is relatively low even with your approach. It would be totally different if both lists had 1000+ items. I hope you used Stopwatch and not DateTime.Now for your performance tests, since I cannot reproduce such a big difference. – CSharpie Feb 17 '17 at 19:56
  • @Gokul just test it once with not a single matching item in list2 and you should see a massive difference. – CSharpie Feb 17 '17 at 20:15
  • I used Stopwatch. Iterations - 1000. list1 - 2000 items, list2 - 3 items (all 3 not matching). My method - 180 ms, your method - 183 ms. I do see a difference if I do just 1 iteration: my method - 2813 ticks, yours - 1484 ticks – Gokul Feb 17 '17 at 20:18
3

You could also do a join:

var query= from l in mylist1
           join e in mylist2 on new {Property1=l.Property1,Property2=l.Property2} equals new {Property1=e.Property1,Property2=e.Property2}
           select l;
ocuenca
  • You can omit the 'names', making it a bit more readable I guess. `new {Property1=l.Property1,Property2=l.Property2}` => `new {l.Property1, l.Property2}` – CSharpie Feb 17 '17 at 19:35
  • Thanks @CSharpie, I'm aware of that, but if the property names differ between those two classes, it is necessary to do it this way. I'm assuming those are not the real names, but if the names match, yes, the property names in the anonymous types can be omitted – ocuenca Feb 17 '17 at 19:40
  • You are right, that makes your answer more waterproof. – CSharpie Feb 17 '17 at 19:41
  • I did a quick benchmark with my method and the above suggested method. My method - 54ms. suggested - 116 ms – Gokul Feb 17 '17 at 19:44
0

Both properties are strings, so you can create a dictionary whose key is the concatenation of the two properties and whose value is the actual item.

Then, for each item in the other list, look up the concatenation of its properties in the dictionary; if there is a match, compare the item with the one found.
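A minimal sketch of this idea follows, under a few assumptions that are not part of the answer: the dictionary is built from mylist2 (the smaller list according to the comments), a separator character is placed between the two values so that "ab" + "c" and "a" + "bc" do not produce the same key, and the setters are public only to keep the sample self-contained. ConcatenatedKeyLookup, MakeKey and FindMatches are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public class List1
{
    public string Property1 { get; set; }
    public string Property2 { get; set; }
}

public class List2
{
    public string Property1 { get; set; }
    public string Property2 { get; set; }
    public string Property3 { get; set; }
}

public static class ConcatenatedKeyLookup
{
    // Assumption: this separator never occurs inside the property values.
    private const char Separator = '\u001F';

    private static string MakeKey(string p1, string p2)
    {
        return p1 + Separator + p2;
    }

    public static List<List1> FindMatches(List<List1> list1, List<List2> list2)
    {
        // Index list2 by the concatenated key; the indexer overwrites duplicates
        // instead of throwing the way Add would.
        var lookup = new Dictionary<string, List2>();
        foreach (var l2 in list2)
            lookup[MakeKey(l2.Property1, l2.Property2)] = l2;

        var result = new List<List1>();
        foreach (var l1 in list1)
        {
            List2 found;
            // Compare with the item found, as the answer suggests, to guard
            // against keys that collide despite the separator.
            if (lookup.TryGetValue(MakeKey(l1.Property1, l1.Property2), out found)
                && found.Property1 == l1.Property1
                && found.Property2 == l1.Property2)
            {
                result.Add(l1);
            }
        }
        return result;
    }

    public static void Main()
    {
        var mylist1 = new List<List1> {
            new List1 { Property1 = "664", Property2 = "Ford" },
            new List1 { Property1 = "667", Property2 = "Ford" },
        };
        var mylist2 = new List<List2> {
            new List2 { Property1 = "664", Property2 = "Ford", Property3 = "USA" },
            new List2 { Property1 = "666", Property2 = "Toyota", Property3 = "USA" },
        };

        Console.WriteLine(FindMatches(mylist1, mylist2).Count); // 1
    }
}
```

Each lookup is a single hash probe, so the total work grows with the combined size of the two lists rather than with their product.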

Stefan Georgiev
  • I cannot comment elsewhere, so... The **fastest** approaches should be this one and the one suggested with sets. Sets are cleaner, but you have two different types of objects. If you can work around that, use sets; just remember to override GetHashCode. What I suggest should be as fast and would not require you to make changes to the actual classes. I cannot provide an example, as I do not have VS installed on this machine – Stefan Georgiev Feb 18 '17 at 07:41