We are trying to group the records below, where two records belong to the same group if any of the following OR conditions holds:

  1. Name is same
  2. Email is same
  3. Phone is same

Is there a way in LINQ to group by with an OR condition?

Name           Email            Phone             Id
---            ---------        ------------      ----------
Rohan          rohan@s.com      NULL              1  
R. Mehta       rohan@s.com      9999999999        2
Alex           alex@j.com       7777777777        3  
Lisa John      john@j.com       6666666666        4
Lisa           lisa@j.com       6666666666        5
Siri           siri@s.com       NULL              6
RM             info@s.com       9999999999        7
Lisa           NULL             NULL              8
Lisa John      m@s.com          7777777757        9

Output Expected

Group 1:
Key: Rohan
RecordIds: 1,2,7  (As `Id:1` has same email as `Id:2`, `Id:2` has same 
                    phone number as `Id:7`.)

Group 2:
Key: Lisa John
RecordIds: 4,5,8,9  (As `Id:4` has same phone number as `Id:5`. While `Id:5` 
                    has the same name as `Id:8`. As `Id:9` has the same name 
                    as `Id: 4`, include that)

  1. Records 3 and 6 are not part of the output, because only groups with more than one record are returned.
  2. The key can be anything; I just picked an arbitrary one.

If record 9 had the email rohan@s.com, then:

Output

Group 1:
Key: Rohan
RecordIds: 1,2,7,4,5,8,9

NOTE: The input is a SQL table read through LINQ to SQL, so query performance also has to be taken into account.

Crude Solution:

A dirty solution would be the following:

  1. Group the records by Name and store the result in gl-1.
  2. Group the records by Email and store the result in gl-2.
  3. Group the records by Phone and store the result in gl-3.
  4. For each group in gl-1, check whether any of its ids appears in a group in gl-2 or gl-3. If so, add the ids of those groups to the gl-1 group.
  5. For each group in gl-2, check whether any of its ids appears in a group in gl-1. If so, add the ids that are missing from that gl-1 group. If a gl-2 group shares no ids with gl-1, add it to gl-1 as a new group.
  6. Repeat step 5 for gl-3.
Shyamal Parikh
  • Possible duplicate of [Group By Multiple Columns](https://stackoverflow.com/questions/847066/group-by-multiple-columns) – Liam Aug 02 '19 at 12:51
  • Agreed @MattRowland. This is more like the LINQ version of this SQL question: https://stackoverflow.com/questions/10763043/how-to-group-by-with-a-special-condition – Canica Aug 02 '19 at 12:55
  • check this https://stackoverflow.com/questions/19703034/linq-getting-customers-group-by-date-and-then-by-their-type – Haithem KAROUI Aug 02 '19 at 13:16
  • OK - based on your sample data - you won't be able to solve that with Entity Framework or LINQ to SQL. You are going to have to pull it into RAM and solve it with C#. – mjwills Aug 02 '19 at 13:46
  • as you are dealing with connections inside a graph, [this](https://stackoverflow.com/questions/35254260/how-to-find-all-connected-subgraphs-of-an-undirected-graph) could have an answer for you – Marty Aug 03 '19 at 16:30

2 Answers

GroupBy requires some definition of "equality". You could define an EqualityComparer with the logic you want, but you'll get inconsistent results. Your grouping breaks the transitive property of equality needed for grouping. In other words, if A=B and B=C then A=C must be true.

For example, the following pairs of items would be in the same group ("equal"):

A, B, C  and  A, D, E
A, D, E  and  F, G, E

but

A, B, C  and  F, G, E

would not be in the same group.
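
The example above can be checked with a small sketch. `TransitivityDemo`, `Matches` and the three row fields are hypothetical names introduced here only for illustration; the rule assumed is "two rows match when any column holds the same value".

```csharp
using System;

public static class TransitivityDemo
{
    // The three rows from the example above.
    public static readonly string[] First  = { "A", "B", "C" };
    public static readonly string[] Second = { "A", "D", "E" };
    public static readonly string[] Third  = { "F", "G", "E" };

    // Hypothetical OR rule: rows match when any column holds the same value.
    public static bool Matches(string[] x, string[] y)
    {
        for (int i = 0; i < x.Length; i++)
        {
            if (x[i] == y[i]) return true;
        }
        return false;
    }
}
```

Here `Matches(First, Second)` and `Matches(Second, Third)` are both true, while `Matches(First, Third)` is false, so the rule is not transitive and cannot serve as the equality used by `GroupBy`.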

To get the output you want (e.g. item 9 pulling several groups together), you'd need to use standard looping to recursively find all items that are "equal" to the first, then all items that are "equal" to any item in that group, and so on until no new items are added. LINQ is not going to be very helpful here (except possibly for the searching within each recursive call).
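
One way to sketch that looping is a union-find (disjoint-set) pass over rows that have already been loaded into memory. This swaps in a different technique from the recursive search described above, and it is only a sketch: `Row`, `OrGrouping` and `GroupIds` are hypothetical names, and it assumes that null values never link two rows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory shape of one table row.
public sealed class Row
{
    public int Id;
    public string Name, Email, Phone;

    public Row(int id, string name, string email, string phone)
    {
        Id = id; Name = name; Email = email; Phone = phone;
    }
}

public static class OrGrouping
{
    // Returns the ids of every group with more than one row.
    public static List<List<int>> GroupIds(IReadOnlyList<Row> rows)
    {
        // parent[i] points towards the representative of row i's group.
        int[] parent = Enumerable.Range(0, rows.Count).ToArray();

        int Find(int i)
        {
            while (parent[i] != i)
            {
                parent[i] = parent[parent[i]]; // path halving
                i = parent[i];
            }
            return i;
        }

        // First row index seen for each (column, value) pair.
        var firstSeen = new Dictionary<(string Column, string Value), int>();
        for (int i = 0; i < rows.Count; i++)
        {
            var keys = new[]
            {
                ("Name", rows[i].Name),
                ("Email", rows[i].Email),
                ("Phone", rows[i].Phone)
            };
            foreach (var (column, value) in keys)
            {
                if (value == null) continue; // assumption: NULL never matches NULL
                if (firstSeen.TryGetValue((column, value), out int other))
                    parent[Find(i)] = Find(other); // merge the two groups
                else
                    firstSeen[(column, value)] = i;
            }
        }

        return Enumerable.Range(0, rows.Count)
            .GroupBy(i => Find(i))
            .Where(g => g.Count() > 1)
            .Select(g => g.Select(i => rows[i].Id).OrderBy(id => id).ToList())
            .ToList();
    }
}
```

For the nine sample records in the question (with NULL passed as `null`), this yields two groups, with ids 1, 2, 7 and 4, 5, 8, 9. Keying the dictionary on the column as well as the value keeps a name from accidentally matching an identical phone or email string.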

D Stanley

LINQ queries process items in a single linear pass, which means that once a query has moved past a possible new group, it can't go back and change it.

Let's assume:

public class aperson
{
    public string Name;
    public string Email;
    public string Phone;
    public int ID;

    public aperson(string name,string email,string phone,int id)
    {
        Name = name;
        Email = email;
        Phone = phone;
        ID = id;
    }
}

Example:

 new aperson("a","a@","1",1),
 new aperson("b","b@","2",2),
 new aperson("a","c@","2",3)

Iteration 1: create group 1 with ("a","a@","1") values
Iteration 2: create group 2 with ("b","b@","2") values
Iteration 3: the item matches group 1 by name and group 2 by phone, but a single pass can place it in only one of them.

To fix this, the iterator would have to go back and merge group 1 and group 2.

So the problem has to be solved in two steps:

Step 1. Create the groups.

Step 2. Group by the created groups.

There are probably better ways to do this; I am only illustrating how the problem needs to be approached, and why.

Code for solution

    // Requires: using System.Collections.Generic; using System.Linq;
    // Maps every Name, Email and Phone value seen so far to its group id.
    public static Dictionary<string, int> group = new Dictionary<string, int>();

    // Registers the person's non-null values under the given group id,
    // leaving values that are already registered untouched.
    public static void adduniquevalue(aperson person, int id)
    {
        if (person.Email != null && !group.ContainsKey(person.Email))
            group.Add(person.Email, id);
        if (person.Phone != null && !group.ContainsKey(person.Phone))
            group.Add(person.Phone, id);
        if (person.Name != null && !group.ContainsKey(person.Name))
            group.Add(person.Name, id);
    }

    public static void CreateGroupKeys(aperson person)
    {
        // The dictionary only grows, so its current size is an unused group id.
        int id = group.Count;

        // Collect the group ids already associated with any of this person's values.
        List<int> groupmatches = new List<int>();
        if (person.Email != null && group.ContainsKey(person.Email))
            groupmatches.Add(group[person.Email]);
        if (person.Phone != null && group.ContainsKey(person.Phone))
            groupmatches.Add(group[person.Phone]);
        if (person.Name != null && group.ContainsKey(person.Name))
            groupmatches.Add(group[person.Name]);

        // The person links several distinct groups: merge them into the first one.
        if (groupmatches.Distinct().Count() > 1)
        {
            int newid = groupmatches[0];
            foreach (string key in group.Keys.Where(key => groupmatches.Contains(group[key])).ToList())
                group[key] = newid;
        }

        // Start a new group if nothing matched; otherwise join the (merged) group.
        adduniquevalue(person, groupmatches.Count == 0 ? id : groupmatches[0]);
    }

    // Returns the person's group id; call it only after CreateGroupKeys has run for every person.
    public static int GetGroupKey(aperson person)
    {
        if (person.Email != null && group.ContainsKey(person.Email))
            return group[person.Email];
        if (person.Phone != null && group.ContainsKey(person.Phone))
            return group[person.Phone];
        if (person.Name != null && group.ContainsKey(person.Name))
            return group[person.Name];
        return -1; // all fields are null; 0 would collide with the first group's id
    }

This builds the groups in a dictionary, which you can then use in a regular `GroupBy` call, like so:

 people.ForEach(x => CreateGroupKeys(x));
 var groups = people.GroupBy(x => GetGroupKey(x)).ToList();
Neil