I have 4 lists

a <- list(1,2,3,4)
b <- list(5,6,7,8)
c <- list(7,9,0)
d <- list(12,14)

I would like to know which of the lists have elements in common. In this example, lists b and c have the element 7 in common.

A brute-force approach would be to take every pair of lists and compute their intersection. Is there a more efficient way to do this in R?
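For reference, the brute-force idea can be sketched with `combn()` as follows. This is only a minimal sketch: the named wrapper list `l` and the variables `pairs`, `common`, and `shared` are illustrative names, not part of the original question.

```r
a <- list(1, 2, 3, 4)
b <- list(5, 6, 7, 8)
c <- list(7, 9, 0)
d <- list(12, 14)

# Named wrapper list (an assumption) so results can refer to the list names
l <- list(a = a, b = b, c = c, d = d)

# Every unordered pair of list names, as a list of length-2 character vectors
pairs <- combn(names(l), 2, simplify = FALSE)

# Shared elements for each pair (an empty vector when nothing is shared)
common <- lapply(pairs, function(p) {
  intersect(unlist(l[[p[1]]]), unlist(l[[p[2]]]))
})
names(common) <- vapply(pairs, paste, character(1), collapse = "-")

# Keep only the pairs that actually share something
shared <- Filter(length, common)
print(shared)
```

The number of pairs grows quadratically with the number of lists, which is why the question asks for alternatives.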

Another approach would be to combine all the lists into a single list and find the duplicates. A mapping function could then indicate which original lists each duplicate came from, but I am not sure how to do that. I came across this post:

Find indices of duplicated rows

I was wondering whether this could be modified to find the actual lists that contain the duplicates.
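One way the flatten-and-map idea might look is sketched below. It assumes every sublist element is a single value, and the names `l`, `long`, `dup_values`, and `owners` are hypothetical helpers introduced only for illustration.

```r
a <- list(1, 2, 3, 4)
b <- list(5, 6, 7, 8)
c <- list(7, 9, 0)
d <- list(12, 14)
l <- list(a = a, b = b, c = c, d = d)

# One row per (value, source list); assumes each sublist element is a scalar
long <- data.frame(
  value  = unlist(l, use.names = FALSE),
  source = rep(names(l), lengths(l)),
  stringsAsFactors = FALSE
)

# Drop repeats inside a single list so they do not count as shared values
long <- unique(long)

# Values that appear in more than one list
dup_values <- unique(long$value[duplicated(long$value)])

# Map each duplicated value back to the lists it came from
is_dup <- long$value %in% dup_values
owners <- split(long$source[is_dup], long$value[is_dup])
print(owners)
```

Each entry of `owners` is named after a shared value and holds the names of the lists containing it.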

I have to repeat this process for many groups of lists. Any suggestions/ideas are greatly appreciated! Thanks in advance

zx8754
Dinesh

1 Answer

What about using this double sapply?

l <- list(a,b,c,d)

sapply(seq_len(length(l)), function(x) 
  sapply(seq_len(length(l)), function(y) length(intersect(unlist(l[x]), unlist(l[y])))))
     [,1] [,2] [,3] [,4]
[1,]    4    0    0    0
[2,]    0    4    1    0
[3,]    0    1    3    0
[4,]    0    0    0    2

Interpretation: element [1,2] of the matrix, for example, shows how many elements the first element of the list l (here the sublist a) has in common with the second element (the sublist b).
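To read the overlapping pairs off such a matrix programmatically, something like the following sketch could work. It uses a named list so the matrix carries row and column names; `res`, `hits`, and `pairs_df` are illustrative names.

```r
a <- list(1, 2, 3, 4)
b <- list(5, 6, 7, 8)
c <- list(7, 9, 0)
d <- list(12, 14)
l <- list(a = a, b = b, c = c, d = d)

# Same pairwise count matrix as above, with dimnames taken from names(l)
res <- sapply(l, function(x) {
  sapply(l, function(y) length(intersect(unlist(x), unlist(y))))
})

# The matrix is symmetric, so only look above the diagonal
hits <- which(res > 0 & upper.tri(res), arr.ind = TRUE)

# One row per pair of lists sharing at least one element
pairs_df <- data.frame(
  first  = rownames(res)[hits[, "row"]],
  second = colnames(res)[hits[, "col"]],
  shared = res[hits],
  stringsAsFactors = FALSE
)
print(pairs_df)
```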

Or alternatively just to see the indices of the sublists which have a common value with some other sublist:

which(sapply(seq_len(length(l)), function(x) length(intersect(l[[x]], unlist(l[-x])))) >= 1)
[1] 2 3
DatamineR
  • Thanks for the idea. I have a query - If `d <- list(8,14)`, then lists `b, c, d` have elements in common. I would like to get the output as lists `b,c,d` or 1,2,3. So should I search the matrix and concatenate? – Dinesh May 22 '15 at 23:04
  • Look at the alternative – DatamineR May 22 '15 at 23:10
  • if `d <- list(8,14)`, then the second alternative gives only 3,4 instead of 2,3,4. – Dinesh May 22 '15 at 23:15
  • @Dinesh Switch to `>=1` and you get 2,3,4. It sounds like you're describing a connected component of a graph. Maybe a specialized tool like the igraph package would serve you better. http://mathworld.wolfram.com/ConnectedComponent.html – Frank May 22 '15 at 23:17
  • Thanks @Frank and DatamineR, I have one more concern. If `d <- list(1,14)`, then I need to know that lists `a` and `d` have elements in common and that `c` and `d` have elements in common. I am interested in knowing which groups of lists have elements in common. Let me know if I am not clear. – Dinesh May 22 '15 at 23:27
  • You can read this from the first solution. In which form do you want to have the final result? – DatamineR May 22 '15 at 23:41
  • Each list containing the names of lists which are in a group. From @Frank's link, I realised that I am indeed finding the connected components. – Dinesh May 22 '15 at 23:43
  • You could save the result of the first alternative as `res` and then run `diag(res) <- 0`; `apply(res,1, function(x) which(x!=0))` – DatamineR May 22 '15 at 23:46
  • @DatamineR That finds neighbors, but not neighbors' neighbors, etc. (which all belong in a single connected component). I think it can be left to the OP for a separate question or a review of graph theory lit. – Frank May 23 '15 at 00:40
  • @Frank, DatamineR - Thanks a lot for your help. I have posted another question regarding the connected components here - http://stackoverflow.com/questions/30407769/get-connected-components-igraph-in-r – Dinesh May 23 '15 at 00:48
  • Any extension to three and four way intersections? – W7GVR Aug 22 '17 at 22:20
  • Could be simplified as: `sapply(l, function(x) sapply(l, function(y) length(intersect(x,y))))` – zx8754 Nov 29 '19 at 20:58
  • @zx8754 your solution gives me a nice table with all my list names. But it would be really helpful if you could tell me how to get the elements that are shared between two sets. – PesKchan Jan 28 '21 at 08:14
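The connected-component grouping discussed in the comments can also be sketched in base R, without igraph. The sketch below is not the igraph approach that was suggested; it swaps in simple minimum-label propagation over the pairwise matrix. It uses `d <- list(8, 14)` from the comments, and `adj`, `comp`, and `groups` are illustrative names.

```r
a <- list(1, 2, 3, 4)
b <- list(5, 6, 7, 8)
c <- list(7, 9, 0)
d <- list(8, 14)   # variant from the comments: d shares 8 with b
l <- list(a = a, b = b, c = c, d = d)

# Pairwise counts of shared elements
res <- sapply(l, function(x) {
  sapply(l, function(y) length(intersect(unlist(x), unlist(y))))
})

# Adjacency: TRUE where two lists share something; every list neighbours itself
adj <- res > 0
diag(adj) <- TRUE

# Start with each list in its own component, then repeatedly give every
# list the smallest label found among itself and its neighbours
comp <- seq_along(l)
repeat {
  new_comp <- vapply(seq_along(l), function(i) min(comp[adj[i, ]]), integer(1))
  if (identical(new_comp, comp)) break
  comp <- new_comp
}

# Lists with the same label belong to the same connected component
groups <- unname(split(names(l), comp))
print(groups)
```

Here `c` and `d` share no element directly but still end up in one group, because both overlap with `b`. For many lists, a dedicated graph library is likely the better tool.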