
To test whether a pattern appears in a string, I found this R function:

grepl(pattern,string)

Now I want to express more complex conditions on the pattern:

  • OR: this works with "pattern 1|pattern 2"
  • AND: is it possible to test whether "pattern 1" and "pattern 2" both appear? I tried "pattern 1&pattern 2", but that expression doesn't work
  • Also, what if I want something like "(a|b)&c"?

Example:

grepl("t","test") # returns TRUE OK
grepl("t|i","test") # returns TRUE OK
grepl("t&e","test") # I want to test if "t" and "e" are both in "test", which is TRUE
John Smith
  • See [Regular Expressions: Is there an AND operator?](http://stackoverflow.com/questions/469913/regular-expressions-is-there-an-and-operator). From the answers there, you can piece together `grepl("(?=.*t)(?=.*e)", "test", perl = TRUE)` – Jota Jun 05 '16 at 02:30
  • You don't really need an `&` operator because you can just pass two tokens, which will both be matched: `grepl(".*t.*e.*", "test")` or if you want both orders, `grepl(".*(t.*e)|(e.*t).*", "test")`. – alistaire Jun 05 '16 at 06:11

2 Answers


If you only have a few patterns (e.g., "t" and "e"), you can test whether all of them appear in a string (e.g., "test") by combining separate grepl calls with `&`:

grepl("t","test") & grepl("e","test")#TRUE

The function 'str_detect' in the package 'stringr' does the same thing.

library('stringr')    
str_detect("test", "t") & str_detect("test", "e")#TRUE 
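
The same idea should extend to the "(a|b)&c" case from the question: put the OR inside one pattern and combine the results with `&`. Below is a minimal sketch; the vector `x` and the variable names are made up for illustration, and the single-regex variant uses the lookahead approach mentioned in the comments (which needs `perl = TRUE`).

```r
# Hypothetical input: which strings contain ("t" or "i") AND "e"?
x <- c("test", "tint", "pie", "eel")

# (a|b): regex alternation handles the OR part
has_t_or_i <- grepl("t|i", x)
# c: a second grepl call handles the other condition
has_e <- grepl("e", x)

# &: element-wise logical AND of the two results
combined <- has_t_or_i & has_e
combined  # TRUE FALSE TRUE FALSE

# Single-regex alternative using lookaheads (requires perl = TRUE)
lookahead <- grepl("^(?=.*(t|i))(?=.*e)", x, perl = TRUE)
identical(combined, lookahead)  # TRUE
```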

You could also write your own function, which could be convenient if you have many patterns. You can do this in many different ways; this is one example.

library(stringr)

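all_in <- function(string, patterns){
  # Build a matrix with one row per pattern and one column per element of 'string'
  res1 <- NULL
  for (i in seq_along(patterns)){
    res1 <- rbind(res1, str_detect(string, patterns[i]))
  }
  # An element qualifies only if every pattern matched it (its column is all TRUE)
  res2 <- NULL
  for (i in seq_len(NCOL(res1))){
    res2 <- c(res2, all(res1[,i]))
  }
  res2
}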

#test which elements of vector 'a' contain all elements in 'b'
a <- c("tea", "sugar", "peas", "tomato", "potatoe", "parsley", "tangelo")
b <- c("a", "e", "o", "t")
all_in(a,b)#FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE
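
If you would rather avoid the explicit loops and the stringr dependency, a more compact base R variant is possible. This is only a sketch: the name `all_in_base` is made up, and it uses `grepl` instead of `str_detect`, so the regex flavour may differ slightly for more exotic patterns.

```r
all_in_base <- function(string, patterns) {
  # One logical vector per pattern, each as long as 'string'
  hits <- lapply(patterns, grepl, x = string)
  # Combine them element-wise with `&`; the all-TRUE start value
  # also covers the case of an empty 'patterns' vector
  Reduce(`&`, hits, rep(TRUE, length(string)))
}

a <- c("tea", "sugar", "peas", "tomato", "potatoe", "parsley", "tangelo")
b <- c("a", "e", "o", "t")
res <- all_in_base(a, b)
res  # FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE
```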
milan

If you have many patterns to test, you can store all the alternatives in a single variable and then use grepl to search for any of them. This keeps the calls readable when you have many variations.

alt_strings <- c("t|e|z|123|friday|hello")
grepl(alt_strings,"test") #TRUE
grepl(alt_strings,"love fridays") #TRUE
grepl(alt_strings,"mondays sucks") #FALSE
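
If the alternatives already live in a character vector, the pattern string can be assembled with `paste` rather than typed by hand. A small sketch, with hypothetical variable names; it assumes none of the alternatives contain regex metacharacters (those would need escaping first).

```r
# Alternatives kept as separate elements
alternatives <- c("t", "e", "z", "123", "friday", "hello")

# Join them with "|" to get a single OR pattern
alt_pattern <- paste(alternatives, collapse = "|")
alt_pattern  # "t|e|z|123|friday|hello"

# grepl is vectorised over the strings being searched
found <- grepl(alt_pattern, c("test", "love fridays", "mondays sucks"))
found  # TRUE TRUE FALSE
```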

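If you have a long list of logical AND and OR conditions to combine, you can use addition and multiplication on the logical vectors returned by grepl to keep track of things. TRUE and FALSE are treated as 1 and 0 in arithmetic, so a sum is at least 1 when any term is TRUE, and a product is at least 1 only when every factor is TRUE. In the lines below, each `pattern` stands in for a different pattern.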

#Equivalent to 4 OR statements
grepl(pattern,x) + grepl(pattern,x) + grepl(pattern,x) + grepl(pattern,x) >= 1

#Equivalent to 4 AND statements
grepl(pattern,x) * grepl(pattern,x) * grepl(pattern,x) * grepl(pattern,x) >= 1

#Mix of 3 OR and 1 AND: (A OR B OR C) AND D
(grepl(pattern,x) + grepl(pattern,x) + grepl(pattern,x)) * grepl(pattern,x) >= 1
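
To make the arithmetic concrete, here is a small sketch with made-up data and patterns. It compares the arithmetic form with the equivalent expression written using `|` and `&`.

```r
# Hypothetical data: which items are a tea that is green or black?
x <- c("green tea", "black tea", "green apple", "coffee")

is_green <- grepl("green", x)
is_black <- grepl("black", x)
is_tea   <- grepl("tea", x)

# Arithmetic form: (green OR black) AND tea
arith <- (is_green + is_black) * is_tea >= 1

# Same condition written with logical operators
logic <- (is_green | is_black) & is_tea

arith                    # TRUE TRUE FALSE FALSE
identical(arith, logic)  # TRUE
```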
andrea