I still find this question under-developed, but you can probably get close to what you want with a combination of stack + table + colSums or rowSums, depending on your needs.
Some sample data:
mylist <- list(c("aaab", "aaab", "aaab", "abcd"),
c("defg", "defg", "defg", "abcd"),
c("ghgh", "ghgh", "ghgh", "abcd"),
c("aaaa", "aaaa", "aaaa", "aaaa"))
stack puts this into a long data.frame with two columns, "values" and "ind". "ind" corresponds to the list index number, while "values" holds the corresponding value. stack needs a named list, which is why setNames is used to name each item by its position.
X <- stack(setNames(mylist, seq_along(mylist)))
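To make the shape of the stacked data concrete, here is a small sketch that rebuilds the sample data and looks at the first few rows. It only uses base R; the printed rows assume the sample list shown above.

```r
mylist <- list(c("aaab", "aaab", "aaab", "abcd"),
               c("defg", "defg", "defg", "abcd"),
               c("ghgh", "ghgh", "ghgh", "abcd"),
               c("aaaa", "aaaa", "aaaa", "aaaa"))

# stack() needs a named list, so name each item by its position first
X <- stack(setNames(mylist, seq_along(mylist)))

head(X, 5)
#   values ind
# 1   aaab   1
# 2   aaab   1
# 3   aaab   1
# 4   abcd   1
# 5   defg   2

nrow(X)
# [1] 16
```

Each row is one element of one list item, so four items of length four give 16 rows.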
Using table gives us the frequency of each term by "ind".
table(X)
# ind
# values 1 2 3 4
# aaaa 0 0 0 4
# aaab 3 0 0 0
# abcd 1 1 1 0
# defg 0 3 0 0
# ghgh 0 0 3 0
colSums, applied to table(X) > 0, tells us how many distinct values each list item contains. A count above 1 means that item holds more than one distinct value.
colSums(table(X) > 0)
# 1 2 3 4
# 2 2 2 1
which(colSums(table(X) > 0) > 1)
# 1 2 3
# 1 2 3
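As a cross-check on the colSums result, the same per-item counts can probably be obtained straight from the list, without going through stack and table. This is only a sketch on the sample data; the names distinct_per_item and has_repeats are made up for illustration. It also shows the separate question of whether an item repeats any value at all, which is not the same thing as having more than one distinct value.

```r
mylist <- list(c("aaab", "aaab", "aaab", "abcd"),
               c("defg", "defg", "defg", "abcd"),
               c("ghgh", "ghgh", "ghgh", "abcd"),
               c("aaaa", "aaaa", "aaaa", "aaaa"))

# Number of distinct values in each list item (matches colSums(table(X) > 0))
distinct_per_item <- lengths(lapply(mylist, unique))
distinct_per_item
# [1] 2 2 2 1

# Items holding more than one distinct value
which(distinct_per_item > 1)
# [1] 1 2 3

# Different question: does an item repeat any value at all?
has_repeats <- vapply(mylist, function(v) anyDuplicated(v) > 0, logical(1))
has_repeats
# [1] TRUE TRUE TRUE TRUE
```

Note that item 4 consists of nothing but repeats, yet it has only one distinct value, so it is not picked up by the "> 1" test.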
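rowSums, applied the same way, tells us how many list items each value appears in. A count above 1 flags a value that is shared between list items, and indexing the table by that value shows which items contain it.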
rowSums(table(X) > 0)
# aaaa aaab abcd defg ghgh
# 1 1 3 1 1
which(rowSums(table(X) > 0) > 1)
# abcd
# 3
names(which(table(X)["abcd", ] >= 1))
# [1] "1" "2" "3"
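If there might be more than one shared value, the last two steps can be combined so that "abcd" does not have to be typed by hand. The following is a rough sketch on the same sample data; tab, shared and items_by_value are names chosen here for illustration.

```r
mylist <- list(c("aaab", "aaab", "aaab", "abcd"),
               c("defg", "defg", "defg", "abcd"),
               c("ghgh", "ghgh", "ghgh", "abcd"),
               c("aaaa", "aaaa", "aaaa", "aaaa"))
X <- stack(setNames(mylist, seq_along(mylist)))
tab <- table(X)

# Values that occur in more than one list item
shared <- names(which(rowSums(tab > 0) > 1))

# For each shared value, the list items (by index name) that contain it
items_by_value <- lapply(setNames(shared, shared),
                         function(v) colnames(tab)[tab[v, ] > 0])
items_by_value
# $abcd
# [1] "1" "2" "3"
```

The result is a named list, so it should extend naturally to data with several shared values.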