
Given a string, I want to extract the JSON objects within that string.

Very similar to this question: Find JSON strings in a string. Just for R.

Basically, I need to take the regex from there and escape characters where necessary. So I looked into: Is there an R function to escape a string for regex characters.

What I tried:

txt <- "asdd {a:b, c:d} asdasd"
library(stringr)
quotemeta <- function(string) {
  str_replace_all(string, "(\\W)", "\\\\\\1")
}

quotemeta("\{(?:[^{}]|(?R))*\}")
str_extract_all(string = txt, pattern = quotemeta("\\{(?:[^{}]|(?R))*\\}"))
str_extract_all(string = txt, pattern = "\\{\\(\\?\\:\\[\\^\\{\\}\\]\\|\\(\\?R\\)\\)\\*\\}")
str_extract_all(string = txt, pattern = "\\\\{\\(\\?\\:\\[\\^\\{\\}\\]\\|\\(\\?R\\)\\)\\*\\\\}")
Tlatwork

1 Answer


I use regexpr() and regmatches().

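  • regexpr(pattern, text) : Returns the position of the first match of the pattern in the text.
  • regmatches(x, m) : Extracts the matched text from x, given the match data m.
  • pattern : In an R string literal every backslash has to be doubled, so \{ and \} become \\{ and \\}. Nothing else in the regex needs escaping, and (?R) only works with perl = TRUE.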
# %>% is available because the question already loaded stringr, which re-exports it
regexpr("\\{(?:[^{}]|(?R))*\\}", txt, perl = TRUE) %>% regmatches(x = txt)
#[1] "{a:b, c:d}"
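If the string holds several objects, or nested ones, the same recursive pattern can be combined with gregexpr() to return every match. The following is only a sketch in base R; the input `txt2` and the variable names are made up for illustration and are not from the question.

```r
# Hypothetical input: two objects, the first one containing a nested object.
txt2 <- "foo {a:b, c:{d:e}} bar {f:g} baz"

# (?R) recurses into the whole pattern, so balanced braces are matched.
# Recursion is a PCRE feature, hence perl = TRUE below.
pattern <- "\\{(?:[^{}]|(?R))*\\}"

# gregexpr() finds all matches; regmatches() returns a list with one
# character vector per input string, so [[1]] picks the matches for txt2.
matches <- regmatches(txt2, gregexpr(pattern, txt2, perl = TRUE))[[1]]
print(matches)
```

Note that this only finds brace-balanced substrings; it does not check that they are valid JSON.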

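A simpler pattern may be easier to understand.

  • The pattern is \\{(\\S|\\s)+\\} :
    • \\{ matches the opening curly bracket "{"
    • (\\S|\\s)+ matches one or more characters of any kind (whitespace or non-whitespace) between the curly brackets.
    • \\} matches the closing curly bracket "}"
  • The + is greedy, so the match runs from the first "{" to the last "}" in the string. That is fine for a single object as in txt, but it does not separate multiple objects.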
regexpr("\\{(\\S|\\s)+\\}", txt, perl = TRUE) %>% regmatches(x = txt)
#[1] "{a:b, c:d}"
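To see how the simple pattern behaves when there is more than one object, here is a small sketch. The input `txt3` and the lazy variant with +? are my own additions for illustration, not part of the question.

```r
# Hypothetical input with two separate objects.
txt3 <- "x {a:b} y {c:d} z"

# Greedy +: the single match spans from the first "{" to the last "}".
greedy <- regmatches(txt3, regexpr("\\{(\\S|\\s)+\\}", txt3, perl = TRUE))
print(greedy)

# Lazy +?: each match stops at the nearest "}", so flat objects are separated.
lazy <- regmatches(txt3, gregexpr("\\{(\\S|\\s)+?\\}", txt3, perl = TRUE))[[1]]
print(lazy)
```

The lazy variant separates flat objects, but it cuts nested objects short at the first "}", so the recursive pattern is the safer choice when nesting can occur.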

Hope it is useful to you :)

Hsiang Yun Chan
  • Great answer, and welcome to SO! More a note to myself, and not part of the question, but maybe someone else needs it: if there are multiple JSONs, just use `gregexpr()` instead of `regexpr()`. – Tlatwork Dec 12 '19 at 14:52
  • I'm so happy to see your feedback. Thank you so much =) – Hsiang Yun Chan Dec 17 '19 at 10:00