lists <- lapply(vector("list", 5), function(x) sample(1:100, 50, replace = TRUE))

How can I extract all values that are present in at least n (2, 3, 4, 5) of the vectors inside lists (or, more generally, in a population of m vectors)?
For n = 5, this question already gives a solution (e.g. intersect()), but it is unclear to me how to handle cases where n < m. In this particular case, I perform 5 variants of mean comparisons between two groups and want to extract a consensus between the 5 tests (e.g. significantly different in at least 3 tests).

pogibas
nouse

2 Answers


If I understand correctly, you can do it as follows. Assume that you are interested in values that are shared between at least 3 list elements.

combos <- combn(seq_along(lists), 3, simplify = FALSE)
lapply(combos, function(i) Reduce(intersect, lists[i]))

And if you're just interested in the actual values,

unique(unlist(lapply(combos, function(i) Reduce(intersect, lists[i]))))

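combos holds every possible combination of n indices into lists (here, n = 3). Each combination is intersected separately, so a value ends up in the result if it occurs in all vectors of at least one combination, i.e. in at least n vectors.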
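
To make the threshold adjustable, the two steps can be wrapped in a small function. This is only a sketch: the name shared_in_at_least and the toy list are made up for illustration and are not part of the answer above.

```r
# Hypothetical helper (name is an assumption): values shared by at least n vectors
shared_in_at_least <- function(lists, n) {
  # every combination of n list indices
  combos <- combn(seq_along(lists), n, simplify = FALSE)
  # intersect the vectors of each combination, then pool the results
  unique(unlist(lapply(combos, function(i) Reduce(intersect, lists[i]))))
}

# Small deterministic example instead of the random data
toy <- list(c(1, 2, 3), c(2, 3, 4), c(3, 4, 5))
sort(shared_in_at_least(toy, 2))  # 2 3 4
shared_in_at_least(toy, 3)        # 3
```

Note that the number of combinations grows quickly with the number of vectors, so this approach is most comfortable for small lists such as the 5 tests in the question.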

talat
  • The line starting with "unique" is what I wanted. This should give me the values which appear in at least 3 of the 5 list elements, correct? – nouse Feb 09 '18 at 13:54
  • 1
    You can also do it all inside `combn`, i.e. `combn(seq_along(lists), 3, simplify = FALSE, FUN = function(i) Reduce(intersect, lists[i]))` – Sotos Feb 09 '18 at 13:54

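You can simply deduplicate each vector in lists with unique, combine the results into one vector with unlist, and count the occurrences with table. Because each vector contributes a value at most once, the count equals the number of vectors that contain it.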

n <- 3
names(which(table(unlist(lapply(lists, unique))) >= n))

The output of this code is a character vector, because the values are taken from the names of the table.
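
As a rough illustration on a toy list (invented here, not from the question), the counting approach can be checked by hand, and the result converted back to numbers with as.numeric. The variable names counts and hits are just placeholders.

```r
# Toy data: 1 is duplicated inside the first vector but occurs in no other vector
toy <- list(c(1, 1, 2, 3), c(2, 3, 4), c(3, 4, 5))
n <- 2

# unique() per vector ensures each vector counts a value at most once
counts <- table(unlist(lapply(toy, unique)))

# keep values found in at least n vectors and convert the names back to numbers
hits <- as.numeric(names(counts)[counts >= n])
hits  # 2 3 4
```

Without the unique step, the duplicated 1 would be counted twice and wrongly pass the threshold of n = 2.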

pogibas