I have a vector called vec:

vec <- c("16S_s95_S112_R2_101.fastq.gz",
         "16S_s95_S112_R1_001.fastq.gz",
         "16S_s94_S103_R2_021.fastq.gz",
         "16S_s94_S103_R1_001.fastq.gz")

I want to select the items that match both sample <- "_s95_" and R1 <- "R1".

I want to use the sample and R1 objects in a single grepl call and find the elements that contain both the _s95_ and R1 strings.

The result I want is 16S_s95_S112_R1_001.fastq.gz. I tried grepl(pattern = sample&R1, x = vec), which did not work.

I can do this with multiple grepl calls, but I am looking for something neater.

MAPK

2 Answers

You need to put a bit more work into your pattern to get the match. Try:

> grep(paste0(".*", sample, ".*", R1), vec, value=TRUE)
[1] "16S_s95_S112_R1_001.fastq.gz"
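
If the order of the two strings is not guaranteed, one possible variation (a sketch, not something from the pattern above) is to use lookaheads with perl = TRUE, so that each string only has to appear somewhere in the element. This assumes sample and R1 contain no regex metacharacters; the names pattern and both are just for illustration:

```r
vec <- c("16S_s95_S112_R2_101.fastq.gz",
         "16S_s95_S112_R1_001.fastq.gz",
         "16S_s94_S103_R2_021.fastq.gz",
         "16S_s94_S103_R1_001.fastq.gz")
sample <- "_s95_"
R1 <- "R1"

# One lookahead per string: "(?=.*_s95_)(?=.*R1)".
# Both must succeed from the start of the element, in any order.
pattern <- paste0("(?=.*", sample, ")(?=.*", R1, ")")

# Lookaheads need the PCRE engine, hence perl = TRUE.
both <- grep(pattern, vec, value = TRUE, perl = TRUE)
print(both)
```

With the vec from the question this should select the same single element as the ordered pattern.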
Jilber Urbina

For your specific use case where you know the order of the patterns, it's almost certainly going to be faster to follow Jilber Urbina's suggestion to programmatically compose a single regex.

For a more general solution that works regardless of order and on any number of patterns, we can use sapply to loop across each pattern, and then use rowSums to count the number of pattern matches and find the rows where all of them match:

patterns <- c("_s95_", "R1")

sapply(patterns, function(x) grepl(x, vec))
     _s95_    R1
[1,]  TRUE FALSE
[2,]  TRUE  TRUE
[3,] FALSE FALSE
[4,] FALSE  TRUE

vec[which(rowSums(sapply(patterns, function(x) grepl(x, vec))) == length(patterns))]

[1] "16S_s95_S112_R1_001.fastq.gz"
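
A more compact variant of the same idea, as a rough sketch: Reduce with `&` combines the per-pattern logical vectors directly, without building the matrix. Here fixed = TRUE is my assumption that the patterns are meant as literal strings rather than regular expressions, and keep is just an illustrative name:

```r
vec <- c("16S_s95_S112_R2_101.fastq.gz",
         "16S_s95_S112_R1_001.fastq.gz",
         "16S_s94_S103_R2_021.fastq.gz",
         "16S_s94_S103_R1_001.fastq.gz")
patterns <- c("_s95_", "R1")

# One logical vector per pattern (each the same length as vec),
# then an element-wise AND across all of them.
keep <- Reduce(`&`, lapply(patterns, grepl, x = vec, fixed = TRUE))

print(vec[keep])
```

This works for any number of patterns and does not depend on the order in which they appear in each string.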
divibisan