5

I have the string myFunction(arg1=\"hop\",arg2=TRUE) and I want to extract the part between the quotes (\"hop\" in this example).

So far I have tried the following, with no success:

gsub(pattern="(myFunction)(\\({1}))(.*)(\\\"{1}.*\\\"{1})(.*)(\\){1})",replacement="//4",x="myFunction(arg1=\"hop\",arg2=TRUE)")

Any help by a regex guru would be welcome!

RockScience

4 Answers

10

Try

 sub('[^\"]+\"([^\"]+).*', '\\1', x)
 #[1] "hop"

Or

 sub('[^\"]+(\"[^\"]+.).*', '\\1', x)
 #[1] "\"hop\""

Inside a single-quoted R string the backslash in `\"` is not needed; a plain `"` works too:

 sub('[^"]*("[^"]*.).*', '\\1', x)
 #[1] "\"hop\""

If there are multiple matches, as @AvinashRaj mentioned in his post, sub is of limited use because it only replaces the first match. One option is stri_extract_all_regex from stringi:

 library(stringi)
 stri_extract_all_regex(x1, '"[^"]*"')[[1]]
 #[1] "\"hop\""  "\"hop2\""

data

 x <- "myFunction(arg1=\"hop\",arg2=TRUE)"
 x1 <- "myFunction(arg1=\"hop\",arg2=TRUE arg3=\"hop2\", arg4=TRUE)"
akrun
  • 1
    many thanks, this works great. Could you explain the rationale for the first solution? – RockScience Apr 08 '15 at 08:09
  • 1
    @RockScience The first solution matches one or more characters that are not `\"` (i.e. `[^\"]+`), followed by a `\"`. The capture group (the part within parentheses) then picks up the following characters that are not `\"`, and `\\1` in the replacement extracts that capture group. – akrun Apr 08 '15 at 08:11
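To make the explanation above concrete, here is a small annotated sketch of the first solution. It uses the plain `"` form of the pattern rather than `\"`; the variable names `pattern` and `inner` are only illustrative and not from the answer.

```r
x <- "myFunction(arg1=\"hop\",arg2=TRUE)"

# Pattern pieces, in order:
#   [^"]+     one or more non-quote characters (everything before the opening quote)
#   "         the opening quote itself
#   ([^"]+)   capture group 1: the non-quote characters inside the quotes
#   .*        the rest of the string, starting at the closing quote
pattern <- '[^"]+"([^"]+).*'

# The pattern consumes the whole string, so replacing the match
# with \\1 leaves only the contents of capture group 1.
inner <- sub(pattern, '\\1', x)
print(inner)
```

Note that if the input contains no quote at all, the pattern does not match and sub returns the input unchanged, so it may be worth checking for that case.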
8

You could also use the regmatches function. sub or gsub only works for a particular input; for the general case you should extract the matches rather than remove everything around them.

> x <- "myFunction(arg1=\"hop\",arg2=TRUE)"
> regmatches(x, gregexpr('"[^"]*"', x))[[1]]
[1] "\"hop\""

To get only the text inside the quotes, pass the result of the call above to gsub, which removes the quotes.

> x <- "myFunction(arg1=\"hop\",arg2=TRUE)"
> gsub('"', '', regmatches(x, gregexpr('"([^"]*)"', x))[[1]])
[1] "hop"
> x <- "myFunction(arg1=\"hop\",arg2=\"TRUE\")"
> gsub('"', '', regmatches(x, gregexpr('"([^"]*)"', x))[[1]])
[1] "hop"  "TRUE"
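
As a possible variation on the two-step approach above, the surrounding quotes can also be dropped by position instead of with a second regex. This is only a sketch: the input `x1` is borrowed from the first answer, the names `quoted` and `inner` are illustrative, and it assumes the quotes are balanced and never escaped inside a value.

```r
x1 <- "myFunction(arg1=\"hop\",arg2=TRUE arg3=\"hop2\", arg4=TRUE)"

# Step 1: grab every quoted substring, quotes included.
quoted <- regmatches(x1, gregexpr('"[^"]*"', x1))[[1]]

# Step 2: drop the first and last character of each match,
# i.e. the opening and closing quote.
inner <- substring(quoted, 2, nchar(quoted) - 1)
print(inner)
```

Because only the first and last characters are removed, this keeps working if the pattern is later changed to allow other characters inside the match.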
Avinash Raj
3

You can try:

str='myFunction(arg1=\"hop\",arg2=TRUE)'

gsub('.*(\\".*\\").*','\\1',str)
#[1] "\"hop\""
Colonel Beauvel
2
x <- "myFunction(arg1=\"hop\",arg2=TRUE)"
unlist(strsplit(x,'"'))[2]
# [1] "hop"
pogibas
  • 1
    with `paste0("\"",unlist(strsplit(x,'\"',perl=T))[2],"\"")` to get the desired result... (check the comments after the OP's question) – Cath Apr 08 '15 at 08:21
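
The strsplit idea above might also be extended to several quoted values: after splitting on `"`, every second piece lies between a pair of quotes. The sketch below assumes balanced, unescaped quotes; `x1` is the multi-match input from the first answer, and `parts` and `inner` are illustrative names.

```r
x1 <- "myFunction(arg1=\"hop\",arg2=TRUE arg3=\"hop2\", arg4=TRUE)"

# Split on the literal quote character (fixed = TRUE, so no regex is involved).
parts <- strsplit(x1, '"', fixed = TRUE)[[1]]

# Pieces at even positions sit between an opening and a closing quote.
inner <- parts[seq_along(parts) %% 2 == 0]
print(inner)
```

Indexing with a logical vector rather than `seq(2, length(parts), by = 2)` avoids an error when the string contains no quotes at all; in that case the result is simply an empty character vector.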