I'm learning how data tables work, and I'm trying to match one column against another with grepl() (id1 as the pattern, id2 as the text) so that I can delete the rows that don't return TRUE.

I think I need to use lapply(), but it always returns the following message:

argument 'pattern' has length > 1 and only the first element will be used

I tried this (and I know it's wrong):

DT[, lapply(.SD, grepl(id1, id2)), by= id]

The data I'm working on:

DT <- structure(list(id = c(52L, 52L, 52L, 52L, 54L, 54L, 84L, 84L, 
87L, 87L, 129L, 129L, 130L, 130L, 130L), id1 = c("8113H187", 
"3505H6", "3505H6", "3505H6", "3505H6", "3505H6", "3505H6", "3505H6", 
"8113H187", "8113H187", "3505H6", "3505H6", "3505H6", "3505H6", 
"3505H6"), id2 = c("3505H6856", "3505H6856", "3505H6856", "3505H6856", 
"3505H67158", "3505H67158", "3505H63188", "3505H63188", "3505H64691", 
"3505H64691", "3505H664133", "3505H664133", "3505H658134", "3505H658134", 
"3505H658134")), .Names = c("id", "id1", "id2"), row.names = c(NA, 
-15L), class = c("data.table", "data.frame"))
Cyrus
Mbr Mbr

2 Answers

We can use Map to match each element of 'id1', used as the pattern, against the corresponding element of 'id2'

DT[unlist(Map(grepl, id1, id2))]
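
To see why the element-by-element pairing is needed, here is a minimal sketch on a few values copied from the question's columns. The variable name `keep` and the use of `fixed = TRUE` are my own assumptions, not part of the answer above; `fixed = TRUE` only matters if the ids could contain regex metacharacters.

```r
# Three id1/id2 pairs taken from the question's data, as plain vectors
id1 <- c("8113H187", "3505H6", "8113H187")
id2 <- c("3505H6856", "3505H6856", "3505H64691")

# grepl() is vectorised over its text argument only, so grepl(id1, id2)
# would test id1[1] against every element of id2 and emit the
# "argument 'pattern' has length > 1" message from the question.

# mapply() calls grepl(id1[i], id2[i]) for each i instead.
# fixed = TRUE treats each id1 value as a literal string, not a regex;
# USE.NAMES = FALSE returns a plain, unnamed logical vector.
keep <- mapply(grepl, id1, id2,
               MoreArgs = list(fixed = TRUE), USE.NAMES = FALSE)
keep
```

The resulting logical vector can be used directly as the row filter, which is what `DT[unlist(Map(grepl, id1, id2))]` does inside the data.table call.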
akrun
DT[mapply( grepl, id1, id2), ]

#      id    id1         id2
#  1:  52 3505H6   3505H6856
#  2:  52 3505H6   3505H6856
#  3:  52 3505H6   3505H6856
#  4:  54 3505H6  3505H67158
#  5:  54 3505H6  3505H67158
#  6:  84 3505H6  3505H63188
#  7:  84 3505H6  3505H63188
#  8: 129 3505H6 3505H664133
#  9: 129 3505H6 3505H664133
# 10: 130 3505H6 3505H658134
# 11: 130 3505H6 3505H658134
# 12: 130 3505H6 3505H658134
Sathish