I would like to check whether the names in columns "Pred1" and "Pred2" are identical to the name in column "Expected" in the same row. If the names are identical, the result should be TRUE; otherwise it should be FALSE. I tried the identical() function, but I am not sure how to apply it to each cell.

in

Expected        Pred1           Pred2
Bacteroides     Bacillus        Bacteroides
Bifidobacterium Bifidobacterium  Escherichia

out

Expected        Pred1         Pred2
Bacteroides      FALSE         TRUE
Bifidobacterium  TRUE          FALSE
user2300940
  • https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – jogo Sep 17 '19 at 08:10

3 Answers


You could use outer.

fun <- Vectorize(function(x, y) identical(d[x, 1], d[x, y]))
cbind(d[1], Pred=outer(1:2, 2:3, fun))
#          Expected Pred.1 Pred.2
# 1     Bacteroides  FALSE   TRUE
# 2 Bifidobacterium   TRUE  FALSE

Or do it with ==.

# sapply() returns one column per row of d, so transpose to get one row per row of d
t(sapply(1:2, function(x) d[x, 1] == d[x, 2:3]))
#       [,1]  [,2]
# [1,] FALSE  TRUE
# [2,]  TRUE FALSE

Data

d <- structure(list(Expected = c("Bacteroides", "Bifidobacterium"), 
    Pred1 = c("Bacillus", "Bifidobacterium"), Pred2 = c("Bacteroides", 
    "Escherichia")), row.names = c(NA, -2L), class = "data.frame")
jay.sf
  • Thanks. Where does the number 1 in `identical(d[x, 1]` come from? – user2300940 Jul 14 '19 at 09:28
  • @user2300940 Since the notation is `d[row, column]`, the `1` in `d[x, 1]` denotes the first column. You can see what is happening in `outer()` by running `outer(1:2, 2:3, paste0)`. – jay.sf Jul 14 '19 at 09:32
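
To see how outer() pairs the indices, as the comment above suggests, the following sketch prints the index grid and then runs the comparison on the same grid. It assumes the data frame d from this answer; the names idx and res are only illustrative.

```r
d <- data.frame(
  Expected = c("Bacteroides", "Bifidobacterium"),
  Pred1    = c("Bacillus", "Bifidobacterium"),
  Pred2    = c("Bacteroides", "Escherichia"),
  stringsAsFactors = FALSE
)

# Every combination of row index (1:2) and column index (2:3), pasted together,
# e.g. "12" means row 1, column 2
idx <- outer(1:2, 2:3, paste0)
print(idx)

# outer() hands whole vectors of indices to the function in one call, which is
# why identical() is wrapped in Vectorize() so it is applied pair by pair
fun <- Vectorize(function(x, y) identical(d[x, 1], d[x, y]))
res <- outer(1:2, 2:3, fun)
print(res)
```

Each cell of res corresponds to the cell of idx in the same position, so res[1, 2] is the comparison of row 1, column 3 against row 1, column 1.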

Solution using a for loop:

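l <- list()
for (i in 2:length(df)) {
  # compare column i element-wise with the first column (Expected);
  # storing by name keeps the column names in the result
  l[[names(df)[i]]] <- df[, 1] == df[, i]
}
df1 <- as.data.frame(do.call(cbind, l))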

Data:

df <- data.frame(Expected = c("Bacteroides","Bifidobacterium"),Pred1 = c("Bacillus","Bifidobacterium"),Pred2 = c("Bacteroides","Escherichia"),stringsAsFactors = FALSE)
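
The loop above yields only the logical columns. A minimal sketch of a variant that keeps Expected alongside them, in the layout the question asks for, might look like the following. The names dat and out are illustrative choices (dat avoids shadowing stats::df), and the sketch assumes the data frame has at least two columns with the reference column first.

```r
dat <- data.frame(
  Expected = c("Bacteroides", "Bifidobacterium"),
  Pred1    = c("Bacillus", "Bifidobacterium"),
  Pred2    = c("Bacteroides", "Escherichia"),
  stringsAsFactors = FALSE
)

# Start from a copy so the Expected column and the column names are kept
out <- dat

# Overwrite each prediction column with its element-wise comparison
# against the first column
for (i in 2:ncol(dat)) {
  out[[i]] <- dat[[1]] == dat[[i]]
}

print(out)
```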
fabla

lapply() loops over the columns that you want to check. The function used, ==, compares each of those columns element-wise with d[, 'Expected'], which is passed as its right-hand side.

lapply(d[, c('Pred1', 'Pred2')], '==', d[, 'Expected'])
#equivalent to
lapply(d[, c('Pred1', 'Pred2')], function(x) x == d[, 'Expected'])

$Pred1
[1] FALSE  TRUE

$Pred2
[1]  TRUE FALSE

To get it into the right format, you can assign them back to the original columns. Note I made a copy but you can just as easily assign the results to the original data.frame.

d_copy <- d

d_copy[, c('Pred1', 'Pred2')] <- lapply(d[, c('Pred1', 'Pred2')], '==', d[, 'Expected'])

d_copy
         Expected Pred1 Pred2
1     Bacteroides FALSE  TRUE
2 Bifidobacterium  TRUE FALSE
Cole