I'm trying a regex lookahead in R with the following command:
sub(x = street.addresses, pattern = "\\s((?i)Street|(?i)St\\.?)(?=\\sNE)", replacement = " St")
My goal is to replace Street (or St.) with St wherever it's followed by a space and the directional NE (as in "Northeast"). The lookahead seems like it couldn't be more straightforward, but I keep hitting an error:
Error in sub(x = streets, pattern = "\\s((?i)Street|(?i)St\\.?)(?=\\sNE)", :
invalid regular expression '\s((?i)Street|(?i)St\.?)(?=\sNE)', reason 'Invalid regexp'
Versions of this pattern without the lookahead work fine in R, but as soon as I add a lookahead of any sort to my search/replace, I hit the error. Other R regex functions such as grep have the same problem.
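For comparison, here is a minimal sketch of the kind of no-lookahead variant that runs without error. It is an illustration under some assumptions rather than the exact call used earlier: the name no.lookahead is made up for this example, the " NE" is matched by a second capture group and written back with \\2 instead of being asserted by a lookahead, and ignore.case = TRUE stands in for the inline (?i) flags.

```r
# Sample data taken from the console session in this question.
street.addresses <- c("23 Charles Street NE", "23 Charles St. NE")

# No lookahead: " NE" is consumed by a second capture group and
# restored in the replacement via \\2 (an illustrative workaround).
# ignore.case = TRUE replaces the inline (?i) flags; note that it
# also makes the NE part case-insensitive, unlike the original pattern.
no.lookahead <- sub(
  pattern = "\\s(Street|St\\.?)(\\sNE)",
  replacement = " St\\2",
  x = street.addresses,
  ignore.case = TRUE
)

print(no.lookahead)
```

This only sidesteps the lookahead by re-inserting the matched directional; it does not explain why the lookahead itself is rejected.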
I've pasted the same regex into online testers like https://regex101.com/ and it works fine there, so I'm confused. Am I missing something basic about regex in R?
EDIT:
Here's a copy direct from my console:
> street.addresses <- c("23 Charles Street NE","23 Charles St. NE")
> new.vec <- sub(x = street.addresses, pattern = "\\s((?i)Street|(?i)St\\.?)(?=\\sNE)", replacement = " St")
Error in sub(x = street.addresses, pattern = "\\s((?i)Street|(?i)St\\.?)(?=\\sNE)", :
invalid regular expression '\s((?i)Street|(?i)St\.?)(?=\sNE)', reason 'Invalid regexp'