I am trying to filter some strings in my data. For example, I want to filter out strings of the form 'AxxBy' but keep strings such as 'AxxByy'. Here each x and y stands for one digit.
Here is what I tried:
data <- data.frame(pair = paste0('A', c(1:4, 10, 11), 'B', c(2, 3, 4, 22, 33, 44)))
pair
1 A1B2
2 A2B3
3 A3B4
4 A4B22
5 A10B33
6 A11B44
I want to remove the pairs that start with A1, but not those that start with A10 or A11. Likewise, I want to remove the pairs that end in B2 but keep those that end in B22, and so on.
x <- c(paste('A',1,sep=''), paste('B',2,sep='')) # filtering conditions
library(dplyr)
df <- data %>%
  filter(!grepl(paste(x, collapse = '|'), pair))
pair
1 A2B3
2 A3B4
The post "Filtering observations in dplyr in combination with grepl" shows that it is possible to anchor the pattern with regex, e.g. "^x|xx$", but I haven't seen any post that covers the case where the filtering conditions are defined outside of the pipe.
Expected output
pair
1 A2B3
2 A3B4
3 A4B22
4 A10B33
5 A11B44
The rule of thumb is this: if a condition in x is 'A' followed by digits, the pattern should include the following 'B' (e.g. 'AxxB'), and every pair matching it should be dropped with !grepl. If a condition is 'B' followed by a single digit ('By'), the pattern should be 'By$', so that !grepl drops 'By' but not 'Byy'. This should cover both 'AxBy' and 'AxxBy' pairs, and that's all. I still cannot generalize @alistaire's solution!
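The rule of thumb above could be sketched roughly as follows. This is only a minimal sketch, not a tested general solution: `anchor_condition` is a hypothetical helper name, and it assumes every condition in `x` starts with either 'A' or 'B'. The leading `^` on the 'A' conditions is also an assumption, based on the pairs always starting with 'A'.

```r
# Example data from the question
data <- data.frame(pair = paste0("A", c(1:4, 10, 11), "B", c(2, 3, 4, 22, 33, 44)))

# Filtering conditions, defined outside the pipe
x <- c("A1", "B2")

# Hypothetical helper (not from any package): turn each condition into an
# anchored pattern. Assumes every condition starts with "A" or "B".
anchor_condition <- function(cond) {
  ifelse(startsWith(cond, "A"),
         paste0("^", cond, "B"),  # "A1" -> "^A1B": matches A1B..., not A10B...
         paste0(cond, "$"))       # "B2" -> "B2$":  matches ...B2, not ...B22
}

# Join the anchored conditions into a single alternation
pattern <- paste(anchor_condition(x), collapse = "|")

# Base R subsetting; inside a dplyr pipe the same pattern could be used as
# data %>% filter(!grepl(pattern, pair))
kept <- data[!grepl(pattern, data$pair), , drop = FALSE]
```

Because the pattern is built before the pipe, `x` can be changed or extended without touching the `filter()` call itself.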