
I have loaded a list of 80+ text documents. Each document contains multiple places where I want to pull out information that follows this pattern: \\\text\\\. Here I want "text", but I do not know how to handle the "\" character in R.

For example, if I try to create a small test string, I get the following error:

string <- "\\\Ok; front of house rude!\\\"
Error: '\O' is an unrecognized escape in character string starting ""\\\O"

If I change the string to "\\\\Ok; front of house rude!\\\\" then I can proceed with a test example, but remember that in the loaded text docs the format is \\\text\\\.
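
As a minimal sketch of the escaping rules involved (assuming the files really contain three literal backslashes on each side, and R >= 4.0.0 for the raw-string syntax): each "\\" typed in an R string literal is stored as one backslash, and text read from a file needs no escaping at all. The temporary file below is a hypothetical stand-in for one of the loaded documents.

```r
# In an R string literal every "\\" is ONE stored backslash,
# so three stored backslashes are typed as six.
string <- "\\\\\\Ok; front of house rude!\\\\\\"

# Raw strings (R >= 4.0.0) avoid the doubling entirely.
raw_string <- r"(\\\Ok; front of house rude!\\\)"
identical(string, raw_string)

# cat() shows the stored characters; print() shows the escaped form.
cat(string, "\n")
nchar(string)  # 24 text characters + 6 backslashes

# Text read from a file arrives with its backslashes as-is.
tmp <- tempfile(fileext = ".txt")  # hypothetical stand-in for one document
writeLines(string, tmp)
from_file <- readLines(tmp)
identical(from_file, string)
```

So the doubling is only a property of how string literals are typed in source code; it does not apply to the contents of the loaded documents.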

When I try to grab the text in between, I get the following error:

str_extract_all(string, "(?<=\\).*(?=\\)")

Error in stri_extract_all_regex(string, pattern, simplify = simplify,  : 
Incorrectly nested parentheses in regexp pattern. (U_REGEX_MISMATCHED_PAREN)

Just to show that the lookbehind-lookahead combo works:

str_extract_all(string, "(?<=\\;).*(?=\\!)")

[[1]]
[1] " front of house rude"
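
A likely reading of the U_REGEX_MISMATCHED_PAREN error: the R string "\\)" reaches the regex engine as \), an escaped parenthesis, so the lookaround group is never closed. To match one literal backslash the regex needs \\, which is typed as "\\\\" in R. The sketch below assumes three backslashes per delimiter and that each segment sits on one line; `doc` is a made-up example. It uses a capture group via `str_match_all()` rather than lookarounds, because lookarounds do not consume the delimiters and would also pick up the text lying between two neighbouring segments.

```r
library(stringr)

# Hypothetical document text with two delimited segments (raw string, R >= 4.0.0)
doc <- r"(intro \\\Ok; front of house rude!\\\ filler \\\Great food\\\ end)"

# Regex \\{3} means exactly three literal backslashes;
# as an R string literal that is "\\\\{3}".
pattern <- "\\\\{3}(.*?)\\\\{3}"

# str_match_all() returns one matrix per input string:
# column 1 is the full match, column 2 is the first capture group.
matches <- str_match_all(doc, pattern)[[1]]
extracted <- matches[, 2]
extracted
```

The lazy quantifier `.*?` keeps each match from running on to the last delimiter in the document.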

EDIT:

Again, take the following string and apply str_replace_all:

string <- "\\\\Ok; front of house rude!\\\\"
str_replace_all(string, "\\+", "REPLACE_ME")

# returns the original string rather than replacing the pattern
[1] "\\\\Ok; front of house rude!\\\\"
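
A short sketch of why nothing is replaced here, under the same escaping rules: the R string "\\+" is the regex \+, which matches a literal plus sign, not a run of backslashes. Matching one or more backslashes takes "\\\\+", or `fixed()` to bypass regex parsing.

```r
library(stringr)

string <- "\\\\Ok; front of house rude!\\\\"  # two stored backslashes per side

# "\\+" is the regex \+ (a literal plus sign), so nothing matches.
unchanged <- str_replace_all(string, "\\+", "REPLACE_ME")

# "\\\\+" is the regex \\+ : one or more literal backslashes.
replaced <- str_replace_all(string, "\\\\+", "REPLACE_ME")
replaced

# fixed() treats the pattern as plain text, so one escaped backslash is enough.
stripped <- str_replace_all(string, fixed("\\"), "")
stripped
```

Note that the printed result above shows four backslashes per side only because print() displays each stored backslash in escaped form.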
Ryan Erwin
