This is related to a previous question, here: Converting a \u escaped Unicode string to ASCII
I proposed a solution involving eval(parse(text=x)), which for non-R users means what it says: parsing the text string, then evaluating it. The aim was not to allow arbitrary code to be executed, but only to un-escape escaped Unicode text. Hence the solution:
eval(parse(text=paste0("'", x, "'")))
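As a minimal sketch of the intended use (the input string is just an illustrative example of mine, not taken from the earlier question), the wrapper turns a literal backslash-u sequence into the character it encodes:

```r
# x contains the literal text \u00e9 after "caf" (illustrative input)
x <- "caf\\u00e9"
nchar(x)  # 9: the escape sequence is still plain text

# Wrap in single quotes, parse as an R string literal, then evaluate
y <- eval(parse(text = paste0("'", x, "'")))
nchar(y)  # 4: the escape has been decoded to a single character
```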
While this should be fairly safe given the restricted objective, I'd be interested to know: how much sanitisation is required to keep things safe?
At a minimum, I guess any embedded single and double quotes have to be escaped. For example, suppose we have
x <- "this is a '; print(dir()); 'string"
Then eval'ing this per the snippet above would execute the code in the middle. So we have to escape the quotes:
eval(parse(text=paste0("'",
gsub("'", "\\\\'", x),
"'")))
And similarly for double quotes. I don't think the unescaped Unicode equivalents \u0022 and \u0027 are a problem, since to the parser they'll be identical to plain " and '.
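To make the cases above concrete, here is a rough sketch that only counts how many expressions the parser sees, so the unsafe variant is never evaluated. The helper name as_literal is my own, purely for illustration, and the second test string is an assumed example:

```r
# Illustrative helper (hypothetical name, not from any package): wrap x in
# single quotes, optionally escaping embedded single quotes first.
as_literal <- function(x, escape = TRUE) {
  if (escape) x <- gsub("'", "\\\\'", x)
  paste0("'", x, "'")
}

x <- "this is a '; print(dir()); 'string"

# Without escaping, the text parses as three separate expressions, the
# middle one being the injected call. Parsing alone does not run it.
length(parse(text = as_literal(x, escape = FALSE)))  # 3

# With escaping, it parses as one string literal that evaluates back to x.
length(parse(text = as_literal(x)))                  # 1
identical(eval(parse(text = as_literal(x))), x)      # TRUE

# A \u0027 escape in the input is decoded inside the literal, so it does
# not terminate the string (assumed example input).
u <- "a\\u0027; print(dir()); \\u0027b"
length(parse(text = as_literal(u)))                  # 1
eval(parse(text = as_literal(u)))                    # "a'; print(dir()); 'b"
```

This only covers the inputs shown; it is not meant as evidence that the escaping is sufficient in general.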
Are there any holes in this approach that I've missed?