
I have a df with two columns of strings that contain false negative ("FN") and false positive ("FP") calls. I'd like to compare the two columns, identify the rows where both an "FN" and an "FP" call are present (for example, "FN" in x1 and "FP" in x2), and add a third column with a tag indicating whether each row meets that condition.

For example, here's a piece of the df:

x1           x2
1/2:FN:am    .:.:.
1|1:FN:am    0/1:FP:am
.:.:.        1|0:559.511:FP

I'd like the resulting output to be:

x1           x2               x3
1/2:FN:am    .:.:.            False
1|1:FN:am    0/1:FP:am        True
.:.:.        1|0:559.511:FP   False 

Thanks!

Stephen Williams

2 Answers


Does this give you what you need?

df <- data.frame(x1 = c("1:FN:AM", "1.2:FN:AM", "3"), x2 = c("1:AM", "1.2:FP:AM", "3"), stringsAsFactors = FALSE)
df
#          x1        x2
# 1   1:FN:AM      1:AM
# 2 1.2:FN:AM 1.2:FP:AM
# 3         3         3

# grepl() is vectorized over its input, so sapply() is not needed
df$x3 <- grepl("FN", df$x1) & grepl("FP", df$x2)
df
#          x1        x2    x3
# 1   1:FN:AM      1:AM FALSE
# 2 1.2:FN:AM 1.2:FP:AM  TRUE
# 3         3         3 FALSE
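
If a plain substring match turns out to be too loose (say, "FN" could show up inside some other field), one possible variation is to split each string on ":" and require an exact field match. This is only a sketch on the question's sample rows; `has_tag` is a hypothetical helper name, not something from base R or the answer above.

```r
# Hypothetical helper: TRUE where any ':'-separated field equals `tag` exactly
has_tag <- function(x, tag) {
  fields <- strsplit(x, ":", fixed = TRUE)
  vapply(fields, function(f) tag %in% f, logical(1))
}

# Sample rows copied from the question
df <- data.frame(
  x1 = c("1/2:FN:am", "1|1:FN:am", ".:.:."),
  x2 = c(".:.:.", "0/1:FP:am", "1|0:559.511:FP"),
  stringsAsFactors = FALSE
)

# TRUE only where x1 has an FN field and x2 has an FP field
df$x3 <- has_tag(df$x1, "FN") & has_tag(df$x2, "FP")
df
```

On these three rows only the second one should come out TRUE, matching the output asked for in the question.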
PhilC

This also works. It flags a row when the two tags appear in either order: FP in x1 and FN in x2, or FN in x1 and FP in x2.

df <- read.table(text='x1           x2
                 1/2:FN:am    .:.:.
                 1|1:FN:am    0/1:FP:am
                 1|0:55:FP    0/2:FN:am
                 .:.:.        1|0:559.511:FP', header=TRUE, stringsAsFactors=FALSE)
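# paste() joins x1 and x2 with a space, so the regex sees both columns at once;
# the leading and trailing '.*' are not needed because grepl() matches anywhere
df$x3 <- grepl('FN.*FP|FP.*FN', paste(df$x1, df$x2))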
df
#         x1             x2    x3
#1 1/2:FN:am          .:.:. FALSE
#2 1|1:FN:am      0/1:FP:am  TRUE
#3 1|0:55:FP      0/2:FN:am  TRUE
#4     .:.:. 1|0:559.511:FP FALSE
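
One thing to watch with the pasted-string pattern: it would also match a row where a single column happens to contain both tags. If that matters for your data, a per-column check in both orders may be safer. The sketch below assumes that behavior is what you want; the fifth row is a made-up example added only to show the difference.

```r
# Rows 1-4 are from the answer above; row 5 is a made-up edge case
# where x1 alone contains both FN and FP
df <- data.frame(
  x1 = c("1/2:FN:am", "1|1:FN:am", "1|0:55:FP", ".:.:.", "0/1:FN:FP"),
  x2 = c(".:.:.", "0/1:FP:am", "0/2:FN:am", "1|0:559.511:FP", ".:.:."),
  stringsAsFactors = FALSE
)

# Check each column separately (fixed = TRUE treats the tag as a literal string)
fn1 <- grepl("FN", df$x1, fixed = TRUE)
fp1 <- grepl("FP", df$x1, fixed = TRUE)
fn2 <- grepl("FN", df$x2, fixed = TRUE)
fp2 <- grepl("FP", df$x2, fixed = TRUE)

# TRUE when the tags are split across the two columns, in either order
df$x3 <- (fn1 & fp2) | (fp1 & fn2)

# For comparison: the pasted-string regex, which also flags row 5
df$pasted <- grepl("FN.*FP|FP.*FN", paste(df$x1, df$x2))
df
```

The two columns agree on the first four rows and differ only on the made-up fifth row, so which one you want depends on whether that case can occur in your data.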
Sandipan Dey