I'm trying to keep everything in a string except a designated substring: the words select and from, and everything in between them. I can extract that substring, but I can't figure out how to extract everything except it.

library(stringr)

a <- "10 bananas select green apples from fruit where (select pears from apples order by fruit)"

# I can successfully extract the substrings using the following code, but I'm looking for the opposite:
str_extract_all(a, "select.*?from")

# expected output
a <- "10 bananas fruit where ( apples order by fruit)"

Aimers513
  • Just replace that pattern with `""`. You're just trying to remove a substring. Use `gsub`, or `stringr::str_remove_all`, which has `""` built in as the replacement string – camille Jan 28 '20 at 16:57
  • Thank you! I also need to find the location of each of the elements I'll be keeping from the original string. Will I need a regular expression for that piece? – Aimers513 Jan 28 '20 at 17:26
  • If you're still looking for the same pattern, then yes. That's a slightly different question, but `regexpr` or `stringr::str_locate_all` will both do that. That question is probably also covered on other SO posts if you look around – camille Jan 28 '20 at 17:30
  • I need the locations of everything that isn't "select.*?from". I tried `str_locate_all(a, "[select.*?from]")`, but it evaluates the expression as individual characters instead of one string. – Aimers513 Jan 28 '20 at 17:33
  • Yeah, because by wrapping the pattern in `[]` you've changed the pattern to mean any of the enclosed characters. I don't totally get what you're looking for but it sounds like a separate question. If you can't find SO posts that handle that, post another question (but please look around SO first) – camille Jan 28 '20 at 17:40
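
The follow-up raised in the comments (finding where the kept parts sit in the original string) can be sketched with stringr's `str_locate_all()` plus `invert_match()`, which turns match locations into the locations of the text between them. This is only a sketch, assuming "the locations of everything that isn't the pattern" means those gaps; the variable names `matched`, `kept`, and `pieces` are made up for illustration.

```r
library(stringr)

a <- "10 bananas select green apples from fruit where (select pears from apples order by fruit)"

# Start/end positions of every "select ... from" match
# (str_locate_all returns one matrix per input string, hence [[1]])
matched <- str_locate_all(a, "select.*?from")[[1]]

# Flip the match locations into the locations of the text between matches
kept <- invert_match(matched)

# Pull out the kept pieces using those locations
pieces <- str_sub(a, kept[, "start"], kept[, "end"])
```

The rows of `kept` are in a form that `str_sub()` accepts, so pasting `pieces` back together should give the same string as removing the pattern with `str_remove_all()`.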

1 Answer

We can use str_remove_all from stringr:

str_remove_all(a, "select.*?from")
#[1] "10 bananas  fruit where ( apples order by fruit)"

str_extract returns the substrings that match the pattern. Here we need the opposite: remove the substrings that match the pattern and return what is left as a single string.
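
Note that the result still has a double space where the first match was removed, while the expected output in the question has a single space. A minimal sketch of one way to tidy that up, assuming it is acceptable to collapse all repeated whitespace, is `str_squish()`. A base R `gsub()` call is shown alongside for comparison; `perl = TRUE` is an assumption of this sketch for handling the lazy quantifier, not something taken from the answer.

```r
library(stringr)

a <- "10 bananas select green apples from fruit where (select pears from apples order by fruit)"

# Remove every "select ... from" span (lazy match, so each span ends at its nearest "from")
removed <- str_remove_all(a, "select.*?from")

# Collapse repeated whitespace left behind by the removal and trim the ends
cleaned <- str_squish(removed)

# Base R equivalent of the removal step
base_removed <- gsub("select.*?from", "", a, perl = TRUE)
```

Here `cleaned` matches the expected output from the question, and `base_removed` is the same string as `removed`.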

akrun