
Is anyone aware of an R package that supports complex boolean and wildcard text matching of the kind implemented in some web search interfaces, i.e. using the AND, OR, *, ? and ()/{} operators? For example, a function that can handle queries in the following formats (all examples should return TRUE):

s = "The quick brown fox jumps over the lazy dog"
boolean_match(s, "fox AND dog")
boolean_match(s, "fox OR bird")
boolean_match(s, "f?x")
boolean_match(s, "The quick * dog")
boolean_match(s, "lazy AND (fox OR bird)") # i.e. nested logic
boolean_match(s, "(pretty AND bird) OR (quick AND (fox OR squirrel))") # recursive

This question asks the same but has not received much interest. I'm aware of the stringi package, e.g. stri_detect_regex(s, c("fox","dog")), potentially combined with all()/any(), but on its own it does not handle the nested logic. Converting this sort of complex query structure into a single regex by hand seems impractical.
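To make the limitation concrete, here is a minimal sketch of the flat all()/any() approach. It uses base R's grepl() instead of stringi so that it runs without extra packages, and `hits` is a hypothetical helper name, not part of any library.

```r
s <- "The quick brown fox jumps over the lazy dog"

# Hypothetical helper: one TRUE/FALSE per term.
# fixed = TRUE treats each term as a literal substring, not a regex.
hits <- function(terms) vapply(terms, grepl, logical(1), x = s, fixed = TRUE)

all(hits(c("fox", "dog")))   # "fox AND dog"
any(hits(c("fox", "bird")))  # "fox OR bird"

# Nested logic has to be spelled out by hand for every query,
# which is exactly what a general query parser would avoid:
hits("lazy") && any(hits(c("fox", "bird")))
```

This works for one fixed query at a time, but the structure of the R code has to mirror the structure of each query.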

Any suggestions much appreciated.

geotheory
  • Would using a structure such as `if(grepl("fox", s) == TRUE & grepl("dog", s) == TRUE)` or `if(grepl("fox", s) == TRUE | grepl("dog", s) == TRUE)` work for what you are attempting to accomplish? – Matt Jewett May 23 '17 at 13:04
  • Restructuring your statements using `any()` / `all()` like this would provide nested evaluation: `all(stri_detect_regex(s,"lazy"), all(stri_detect_regex(s,"quick"), any(stri_detect_regex(s,"fox"), stri_detect_regex(s,"dog"))))` – Matt Jewett May 23 '17 at 13:18
  • Thanks Matt. I'm looking for a general solution, i.e. not one where I have to string these functions together. Let's see if anyone else has a suggestion. – geotheory May 23 '17 at 13:32

1 Answer


I've written something myself - see https://github.com/geotheory/booleanR
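For readers who want to see the general idea without installing anything, the following is a rough sketch of one way such a function could work in base R. It is not taken from booleanR and may differ from it entirely. The assumptions: AND/OR must be upper case, only () is supported for grouping (not {}), wildcards match anywhere in the string rather than on whole words, and `wildcard_to_regex` is a hypothetical helper. Each term is evaluated with grepl(), and the query is rewritten as an R logical expression containing only TRUE, FALSE, &&, || and parentheses before being evaluated.

```r
# Hypothetical helper: escape regex metacharacters in a term,
# then map ? to "any one character" and * to "any run of characters".
wildcard_to_regex <- function(term) {
  rx <- gsub("([.\\\\|()\\[\\]{}^$+])", "\\\\\\1", term, perl = TRUE)
  rx <- gsub("?", ".", rx, fixed = TRUE)
  gsub("*", ".*", rx, fixed = TRUE)
}

boolean_match <- function(s, query, ignore_case = FALSE) {
  ops <- c("AND" = "&&", "OR" = "||", "(" = "(", ")" = ")")
  # Pad parentheses with spaces so they split off as separate words
  words <- strsplit(trimws(gsub("([()])", " \\1 ", query)), "\\s+")[[1]]
  pieces <- character(0)  # fragments of the R expression being built
  term <- character(0)    # words of the search term being accumulated
  for (w in c(words, "")) {  # the trailing "" flushes the final term
    if (w %in% c(names(ops), "")) {
      if (length(term) > 0) {
        rx <- wildcard_to_regex(paste(term, collapse = " "))
        hit <- grepl(rx, s, perl = TRUE, ignore.case = ignore_case)
        pieces <- c(pieces, as.character(hit))
        term <- character(0)
      }
      if (nzchar(w)) pieces <- c(pieces, ops[[w]])
    } else {
      term <- c(term, w)
    }
  }
  # Only TRUE/FALSE, &&, || and parentheses reach the parser here
  eval(parse(text = paste(pieces, collapse = " ")))
}

s <- "The quick brown fox jumps over the lazy dog"
boolean_match(s, "lazy AND (fox OR bird)")
boolean_match(s, "(pretty AND bird) OR (quick AND (fox OR squirrel))")
```

Because the expression is evaluated by R itself, AND binds more tightly than OR, as with && and ||. A malformed query (for example, unbalanced parentheses) will fail with a parse error rather than return FALSE.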

geotheory