I am trying to find out whether there is a letter in column V3 that occurs in every level of V2 within each group of V1. Some data will make clear what I mean:
df <- structure(list(a = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 3L), b = c(4L, 5L, 5L, 6L, 6L, 5L, 6L, 6L, 6L,
6L, 4L, 4L, 5L, 5L, 5L), d = structure(c(3L, 3L, 3L, 2L, 3L,
2L, 1L, 4L, 2L, 3L, 4L, 1L, 1L, 4L, 3L), .Label = c("a", "b",
"c", "d"), class = "factor")), .Names = c("V1", "V2", "V3"), row.names = c(NA,
-15L), class = "data.frame")
df
V1 V2 V3
1 1 4 c
2 1 5 c
3 1 5 c
4 1 6 b
5 1 6 c
6 2 5 b
7 2 6 a
8 2 6 d
9 2 6 b
10 2 6 c
11 3 4 d
12 3 4 a
13 3 5 a
14 3 5 d
15 3 5 c
Thus, for the first group V1 == 1, there are three levels of V2, c(4, 5, 6), and each level contains a "c" in V3. For V1 == 2 we observe only two levels of V2, c(5, 6), and now the letter "b" occurs in both of them, so "b" is TRUE here and all others (c("a", "d", "c")) are FALSE. My expected output would then be something like this, setting every "c" in the first group to TRUE and the "b" in row 4 to FALSE, because it does not occur in all levels:
V1 V2 V3 e
1 1 4 c TRUE
2 1 5 c TRUE
3 1 5 c TRUE
4 1 6 b FALSE
5 1 6 c TRUE
6 2 5 b TRUE
7 2 6 a FALSE
8 2 6 d FALSE
9 2 6 b TRUE
10 2 6 c FALSE
11 3 4 d TRUE
12 3 4 a TRUE
13 3 5 a TRUE
14 3 5 d TRUE
15 3 5 c FALSE
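To make the rule concrete, here is a minimal sketch that checks it for the first group only. It rebuilds the same data with data.frame() instead of the dput() output, and the names g1, n_levels and n_seen are only illustrative. A letter counts as TRUE when it is seen in as many distinct V2 levels as the group has.

```r
# same data as above, rebuilt compactly
df <- data.frame(
  V1 = rep(1:3, each = 5),
  V2 = c(4L, 5L, 5L, 6L, 6L, 5L, 6L, 6L, 6L, 6L, 4L, 4L, 5L, 5L, 5L),
  V3 = factor(c("c", "c", "c", "b", "c", "b", "a", "d", "b", "c",
                "d", "a", "a", "d", "c"))
)

g1 <- df[df$V1 == 1, ]              # rows of the first group
n_levels <- length(unique(g1$V2))   # V2 has three levels here: 4, 5, 6

# for each row, the number of distinct V2 levels in which its letter occurs
n_seen <- ave(g1$V2, g1$V3, FUN = function(x) length(unique(x)))

n_seen == n_levels
# [1]  TRUE  TRUE  TRUE FALSE  TRUE
```

This reproduces rows 1 to 5 of column e in the expected output.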
Using split() and table() I am able to find the letters occurring in all levels of V2 within each group of V1.
a1 <- lapply(split(df, df$V1), function(x) names(which(apply(table(x$V3, x$V2) != 0, 1, all))))
a1
$`1`
[1] "c"
$`2`
[1] "b"
$`3`
[1] "a" "d"
Now I could split the data frame again, search for the letters, and create the logical vector using something like this:
unlist(Map(function(x, y) x$V3 %in% y, split(df, df$V1), a1))
11 12 13 14 15 21 22 23 24 25 31 32 33 34 35
TRUE TRUE TRUE FALSE TRUE TRUE FALSE FALSE TRUE FALSE TRUE TRUE TRUE TRUE FALSE
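The unlist() result lines up with the rows only because df is already sorted by V1. As a sketch of how the two steps could be put back onto the data frame regardless of row order, unsplit() can be used in place of unlist(). The name flags is only illustrative, and the data is rebuilt here so the snippet runs on its own.

```r
# same data as above, rebuilt compactly
df <- data.frame(
  V1 = rep(1:3, each = 5),
  V2 = c(4L, 5L, 5L, 6L, 6L, 5L, 6L, 6L, 6L, 6L, 4L, 4L, 5L, 5L, 5L),
  V3 = factor(c("c", "c", "c", "b", "c", "b", "a", "d", "b", "c",
                "d", "a", "a", "d", "c"))
)

# step 1: letters present in every V2 level, per V1 group
a1 <- lapply(split(df, df$V1), function(x)
  names(which(apply(table(x$V3, x$V2) != 0, 1, all))))

# step 2: flag the rows of each group whose letter is in that group's set
flags <- Map(function(x, y) x$V3 %in% y, split(df, df$V1), a1)

# unsplit() puts each value back at the position of its original row
df$e <- unsplit(flags, df$V1)
```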
But this is inconvenient and far from an elegant solution. Hence this question, which IMO is not a duplicate.