I have a string in the format of a URL query:
string <- "key1=value1&key2=value2"
And I would like to extract all the parameter names (key1, key2).
I thought about strsplit with a split pattern matching everything between = and an optional &.
unlist(strsplit(string, "=.+&?"))
[1] "key1"
But I guess that this pattern matches from the first = to the end of the string, including my optional & in the .+. I suspect this is because of the "greediness" of the regexp, so I tried to make it lazy, but I got a strange result.
> unlist(strsplit(string, "=.+?&?"))
[1] "key1" "alue1&key2" "alue2"
Now I don't really understand what is happening here, and I don't know how to make the pattern lazy when the last matching character is optional.
I know (and I think I also understand why) that it works if I exclude & from .+, but I wish I could understand why the regexps above aren't working.
> unlist(strsplit(string, "=[^&]+&?"))
[1] "key1" "key2"
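To see what each of the three patterns actually consumes (and therefore what strsplit removes), here is a minimal sketch. The helper show_matches is a hypothetical name of mine, not a base R function, and I am assuming perl = TRUE so that the lazy quantifier follows PCRE semantics; the strsplit calls above used the default engine.

```r
string <- "key1=value1&key2=value2"

# Hypothetical helper (not part of base R): return every substring that
# `pattern` matches in `x`, i.e. the pieces strsplit() would cut out.
# Assumption: perl = TRUE, so lazy quantifiers behave as in PCRE.
show_matches <- function(pattern, x) {
  regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
}

# The three split patterns tried above, labelled for readability.
patterns <- c(greedy = "=.+&?", lazy = "=.+?&?", negated = "=[^&]+&?")

# Named list: one character vector of matches per pattern.
matched <- lapply(patterns, show_matches, x = string)
print(matched)
```

Under these assumptions the lazy pattern should report only "=v" twice, which is consistent with the "alue1&key2" and "alue2" pieces left over by strsplit.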
My current workaround is to do it in two steps:
unlist(sapply(unlist(strsplit(string, "&")), strsplit, split = "=.*", USE.NAMES = FALSE))
What am I doing wrong that keeps me from achieving this with a single regexp? Thanks for any help.
I'm painfully learning regexps, so any other options would also be appreciated!