Could anyone explain how to extract a substring from each element of a character vector when the elements contain special characters?

I'm working with a vector like this:

txt <- c("{\"label\":\"Describes me best\",\"multiplier\":1}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", 
"{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", 
"{\"label\":\"Describes me best\",\"multiplier\":1}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", 
"{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", "{\"label\":\"Describes me best\",\"multiplier\":1}", 
"{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}"
)

> txt
 [1] "{\"label\":\"Describes me best\",\"multiplier\":1}"       "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}"
 [3] "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}" "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}"
 [5] "{\"label\":\"Describes me best\",\"multiplier\":1}"       "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}"
 [7] "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}" "{\"label\":\"Describes me best\",\"multiplier\":1}"      
 [9] "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}" "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}"

I'd like to extract only the label text (Describes me best or Somewhat describes me) from each element and drop the rest.

I tried to adapt the str_match() solution presented here: https://stackoverflow.com/a/39086448/6925293, but I can't make it work, probably because of the many special characters ({, \", etc.).

blazej

2 Answers

4

Since these are JSON strings, you can use the jsonStrings package:

library(jsonStrings)

x <- "{\"label\":\"Describes me best\",\"multiplier\":1}"
jstring <- jsonString$new(x)
jstring$at("label")
# "Describes me best"
Stéphane Laurent
  • How would you use jsonString with a character vector? You need a loop or map, right? Something like: 1:length(txt) |> purrr::map(~{ jstring <- jsonString$new(txt[.x]); jstring$at("label") }) – Gorka Sep 26 '22 at 10:56
  • @Gorka Yes, or sapply/vapply – Stéphane Laurent Sep 26 '22 at 12:46
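
The vapply approach mentioned in the comments can be sketched as follows. This is only an illustration under an assumption: it swaps in the jsonlite package (not used anywhere in this thread) because fromJSON() returns a plain R list, which makes it easy to end up with an ordinary character vector. The two-element txt is a shortened stand-in for the vector in the question.

```r
library(jsonlite)  # assumption: jsonlite is installed; the answer above uses jsonStrings instead

# Shortened stand-in for the vector from the question
txt <- c(
  "{\"label\":\"Describes me best\",\"multiplier\":1}",
  "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}"
)

# Parse each element as JSON and keep only its "label" field.
# vapply() checks that every result is a single character string.
labels <- vapply(
  txt,
  function(s) fromJSON(s)$label,
  character(1),
  USE.NAMES = FALSE
)

labels
#> [1] "Describes me best"     "Somewhat describes me"
```

Because each element is actually parsed, this does not depend on the order of the keys, but it stops with an error if any element is not valid JSON.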
0

Is this what you need?

  txt <- c("{\"label\":\"Describes me best\",\"multiplier\":1}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", 
           "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", 
           "{\"label\":\"Describes me best\",\"multiplier\":1}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", 
           "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", "{\"label\":\"Describes me best\",\"multiplier\":1}", 
           "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}", "{\"label\":\"Somewhat describes me\",\"multiplier\":0.5}"
  )
  
  
# gsub() captures whatever matches inside ( ) and \\1 puts that capture into the replacement
  gsub(pattern = '.*"(.*)",.*', replacement = "\\1", x = txt)

#>  [1] "Describes me best"     "Somewhat describes me" "Somewhat describes me"
#>  [4] "Somewhat describes me" "Describes me best"     "Somewhat describes me"
#>  [7] "Somewhat describes me" "Describes me best"     "Somewhat describes me"
#> [10] "Somewhat describes me"

Created on 2022-09-26 with reprex v2.0.2
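
A slightly stricter variant of the same idea, offered as a sketch rather than a replacement: anchoring the pattern on the "label" key means the capture no longer depends on the label being the last quoted value followed by a comma. The swapped-key string in txt2 is a made-up example, not data from the question.

```r
# Made-up examples: the second one has the keys in the opposite order
txt2 <- c(
  "{\"label\":\"Describes me best\",\"multiplier\":1}",
  "{\"multiplier\":0.5,\"label\":\"Somewhat describes me\"}"
)

# Match the "label" key explicitly, then capture everything up to the
# next double quote. [^"]* cannot run past the closing quote.
labels2 <- sub('.*"label":"([^"]*)".*', "\\1", txt2)

labels2
#> [1] "Describes me best"     "Somewhat describes me"
```

This is still plain pattern matching: a label that itself contains an escaped quote (\") would be cut short, which is one reason to prefer a JSON parser when the input is well formed.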

Gorka
  • While this solution does what I asked for, the jsonStrings one gives me more flexibility, so I'll accept that one. Still, thank you for the `gsub()` solution; I'll use it elsewhere. – blazej Sep 26 '22 at 11:03
  • No worries, and thanks for the comment. I agree that jsonStrings is better when dealing with JSON strings. I tend to use gsub when the input is not always well-formed JSON. Cheers. – Gorka Sep 26 '22 at 11:13