I have a data frame consisting of 2 variables. Both can take only the values 1 or 2, so there are only 4 possible combinations (groups). I want to separate the groups from each other. My idea was to generate all possible combinations with expand.grid and compare each combination with the data frame. Since this must be done several times, I want to use lapply. For this reason I created one list with the data frame as its only element and a second list with one element for each of the 4 possible combinations.
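set.seed(1)
# 10 x 2 data frame of random 1s and 2s, wrapped in a list for use with lapply
pred <- data.frame(cbind(sample(1:2, 10, replace = TRUE), sample(1:2, 10, replace = TRUE)))
pred <- list(pred)
# all 4 combinations, stored as a list of 1 x 2 matrices
groups <- expand.grid(1:2, 1:2)
groups <- lapply(as.list(data.frame(t(groups))), t)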
The data:
pred
X1 X2
1 1 1
2 1 1
3 2 2
4 2 1
5 1 2
6 2 1
7 2 2
8 2 2
9 2 1
10 1 2
groups
$X1
[,1] [,2]
[1,] 1 1
$X2
[,1] [,2]
[1,] 2 1
$X3
[,1] [,2]
[1,] 1 2
$X4
[,1] [,2]
[1,] 2 2
Here is the thing that puzzles me:
pred[[1]]==groups[[1]]
X1 X2
[1,] TRUE TRUE
[2,] TRUE TRUE
[3,] FALSE FALSE
[4,] FALSE TRUE
[5,] TRUE FALSE
[6,] FALSE TRUE
[7,] FALSE FALSE
[8,] FALSE FALSE
[9,] FALSE TRUE
[10,] TRUE FALSE
pred[[1]]==groups[[2]]
X1 X2
[1,] FALSE FALSE
[2,] TRUE TRUE
[3,] TRUE TRUE
[4,] FALSE TRUE
[5,] FALSE TRUE
[6,] FALSE TRUE
[7,] TRUE TRUE
[8,] FALSE FALSE
[9,] TRUE FALSE
[10,] TRUE FALSE
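In the first case the comparison worked: rows 1 and 2, which are (1, 1), are TRUE in both columns. In the second case it did not: row 2 is (1, 1) but is reported as matching (2, 1) in both columns, while row 4 is (2, 1) and is reported as FALSE TRUE. What is wrong with the code, and is there a better solution for my problem?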
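One possible explanation, offered as a sketch rather than a definitive answer: when a data frame is compared with a shorter vector or matrix, `==` appears to recycle the shorter object down each column instead of matching it against each row. That would explain why the all-ones combination seemed to work and (2, 1) did not. The code below hard-codes the data printed above so it does not depend on the random number generator; `pred_df`, `recycled` and `row_matches` are names made up for this illustration, not part of the original code or of any package.

```r
# Data copied from the printed output in the question.
pred_df <- data.frame(X1 = c(1, 1, 2, 2, 1, 2, 2, 2, 2, 1),
                      X2 = c(1, 1, 2, 1, 2, 1, 2, 2, 1, 2))
groups <- expand.grid(X1 = 1:2, X2 = 1:2)

# Assumption: `==` recycles the pattern (2, 1) down each column, so X1 is
# effectively compared against 2, 1, 2, 1, ... rather than against 2.
recycled <- rep_len(c(2, 1), nrow(pred_df))
col1_as_in_question <- pred_df$X1 == recycled

# Row-wise comparison: TRUE where a row equals the given combination.
# `row_matches` is a hypothetical helper, not an existing R function.
row_matches <- function(df, combo) {
  df[[1]] == combo[[1]] & df[[2]] == combo[[2]]
}

# One logical vector per combination, in the row order of `groups`.
matches <- lapply(seq_len(nrow(groups)), function(i) {
  row_matches(pred_df, groups[i, ])
})

# Alternative: let split() form the groups directly from both columns.
by_group <- split(pred_df, interaction(pred_df$X1, pred_df$X2, drop = TRUE))
```

With this data, `col1_as_in_question` reproduces the X1 column of the second comparison above, and `which(matches[[2]])` picks rows 4, 6 and 9 for the combination (2, 1). If only the separated groups are needed, the `split()` variant avoids building the list of combinations by hand.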