
I am looking for a way (preferably an existing function) to locate valid JSON within a character string.

What I know already

As shown here, jsonlite::fromJSON() can parse JSON, like so:

library(jsonlite)

json_glob_1 <- "{ \"age\": 22}"
json_glob_2 <- "{ \"name\":\"John\" }"

fromJSON(json_glob_1)
# $age
# [1] 22

fromJSON(json_glob_2)
# $name
# [1] "John"

What I do not know

Is there a function that accepts a messy string and returns the JSON glob(s) embedded in it? For example:

messy_string_with_json <- paste0("lsdfjksdlfjk dkfjsldfkjs fkjsdf",
                                 json_glob_1,
                                 "slkdfjlskdfj sfkdjflskdjf sdfk",
                                 json_glob_2,
                                 "32345jlskdfj")

# find_JSON() is the hypothetical function I am looking for; desired output:
find_JSON(messy_string_with_json)
# [[1]]
# [1] "{ \"age\": 22}"
# [2] "{ \"name\":\"John\" }"
stevec

1 Answer


I'm not sure whether one exists out of the box, but you can write one.

Here I use a regex to find every substring enclosed in braces, then call jsonlite::validate() on each match to check whether it is valid JSON.

library(jsonlite)

json_glob_1 <- "{ \"age\": 22}"
json_glob_2 <- "{ \"name\":\"John\" }"

x <- paste0(
    "lsdfjksdlfjk dkfjsldfkjs fkjsdf"
    , json_glob_1
    , "slkdfjlskdfj sfkdjflskdjf sdfk"
    , json_glob_2
    , "32345jlskdfj"
    )


## try to find substrings enclosed in braces
possible <- regmatches(x, gregexpr("(?=\\{).*?(?<=\\})", x, perl = TRUE))[[1]]

## then try to validate them as JSON
sapply(possible, jsonlite::validate)

#     { "age": 22} { "name":"John" } 
#             TRUE              TRUE 
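
The two steps above can be wrapped into a single function that returns only the matches that validate. This is a minimal sketch, not a jsonlite function: the name find_json_simple is hypothetical, and it uses a slightly simpler pattern without lookarounds. Like the regex above, it stops at the first closing brace, so it will not handle nested objects.

```r
library(jsonlite)

# Hypothetical helper (not part of jsonlite): regex candidates + validation.
find_json_simple <- function(x) {
  # Lazily match from each "{" to the nearest following "}"
  candidates <- regmatches(x, gregexpr("\\{.*?\\}", x, perl = TRUE))[[1]]
  # validate() may return a logical carrying attributes, so strip them
  # with as.logical() before testing
  is_valid <- vapply(
    candidates,
    function(s) isTRUE(as.logical(validate(s))),
    logical(1)
  )
  # Return the valid candidates as a plain, unnamed character vector
  unname(candidates[is_valid])
}

find_json_simple(x)
```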
SymbolixAU
  • Thanks for the suggestion. It seems to work on small JSON globs, but consider `x <- paste0( "lsdfjksdlfjk dkfjsldfkjs fkjsdf" , toJSON(iris) , "slkdfjlskdfj sfkdjflskdjf sdfk" , json_glob_2 , "32345jlskdfj" )`; it will return 151 globs instead of just 2 – stevec Feb 03 '19 at 08:12
  • You can get the whole array with `possible <- regmatches(x, gregexpr("(?=\\[).*?(?<=\\])", x, perl=T))[[1]]` – SymbolixAU Feb 03 '19 at 10:16
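
For nested input like the toJSON(iris) case raised in the comments, one alternative to a regex is to scan for balanced brackets and validate each balanced span. The sketch below is only an illustration under that assumption: find_json is a hypothetical name, not an existing function, and rescanning from every unmatched opening bracket can be slow on long strings with many stray braces.

```r
library(jsonlite)

# Hypothetical helper: find balanced {...} or [...] spans and keep the
# ones that jsonlite::validate() accepts.
find_json <- function(x) {
  chars <- strsplit(x, "", fixed = TRUE)[[1]]
  n <- length(chars)
  found <- character(0)
  i <- 1L
  while (i <= n) {
    if (chars[i] %in% c("{", "[")) {
      depth <- 0L
      in_string <- FALSE
      stop_at <- NA_integer_
      j <- i
      while (j <= n) {
        ch <- chars[j]
        if (in_string) {
          if (ch == "\\") {
            j <- j + 1L                 # skip the escaped character
          } else if (ch == "\"") {
            in_string <- FALSE          # closing quote
          }
        } else if (ch == "\"") {
          in_string <- TRUE             # brackets inside strings are ignored
        } else if (ch %in% c("{", "[")) {
          depth <- depth + 1L
        } else if (ch %in% c("}", "]")) {
          depth <- depth - 1L
          if (depth == 0L) {
            stop_at <- j                # outermost bracket closed
            break
          }
        }
        j <- j + 1L
      }
      if (!is.na(stop_at)) {
        candidate <- paste(chars[i:stop_at], collapse = "")
        if (isTRUE(as.logical(validate(candidate)))) {
          found <- c(found, candidate)
          i <- stop_at + 1L             # continue after the accepted span
          next
        }
      }
    }
    i <- i + 1L
  }
  found
}

x <- paste0("lsdfjksdlfjk dkfjsldfkjs fkjsdf", toJSON(iris),
            "slkdfjlskdfj sfkdjflskdjf sdfk", "{ \"name\":\"John\" }",
            "32345jlskdfj")
length(find_json(x))
```

Because the scan tracks nesting depth, the whole iris array should come back as a single element rather than one element per row.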