This answer supplements hanshenrik's answer: I liked the general solution, but found the example function hard to read and its results not optimal. It does its job perfectly fine nonetheless.
About XPath quoting
XPath 1.0 allows any character inside a string literal except the quote character that delimits it. The allowed delimiters are " and ', so quoting a literal that contains at most one kind of quote is trivial. To quote a string that contains both, you have to split it into parts, quote each part with the delimiter it does not contain, and join the parts with XPath's concat():
He's telling you "Hello world!".
would need to be escaped like
concat("He's telling", ' you "Hello world!".')
It is of course irrelevant where between the ' and the " you split the literal.
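To make the point about the split position concrete, here is a minimal sketch that evaluates two hand-written concat() expressions with PHP's DOM extension and compares the results to the original sentence. It assumes ext-dom is available; the `<root/>` document and the variable names are placeholders for illustration only.

```php
<?php
// Minimal check with PHP's DOM extension (ext-dom): both hand-written
// concat() expressions must evaluate to the original text.
$document = new DOMDocument();
$document->loadXML('<root/>'); // placeholder document, only needed as context
$xpath = new DOMXPath($document);

$original = 'He\'s telling you "Hello world!".';

// Split right after "telling" ...
$splitEarly = 'concat("He\'s telling", \' you "Hello world!".\')';
// ... or directly before the first double quote.
$splitLate = 'concat("He\'s telling you ", \'"Hello world!".\')';

var_dump($xpath->evaluate($splitEarly) === $original);
var_dump($xpath->evaluate($splitLate) === $original);
```

Both comparisons should hold, because each part is quoted with the delimiter it does not contain.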
Differences between the implementations
hanshenrik's implementation creates the quoted literal by extracting all parts that aren't double quotes and then inserting separately quoted double quotes between them. That can produce needlessly long expressions:
"""x'x"x""xx
would be escaped by their function like
concat('"', '"', '"', "x'x", '"', "x", '"', '"', "xx")
and the example from above:
concat("He's telling you ", '"', "Hello world!", '"', ".")
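The splitting strategy described above can be sketched roughly as follows. This is a reconstruction from the description, not hanshenrik's actual code; the function name `quoteBySplittingOnDoubleQuotes` is hypothetical and only serves to reproduce the two outputs shown.

```php
<?php
// Hypothetical reconstruction of the "split on every double quote" strategy.
// Not the code from the referenced answer, only an illustration of the idea.
function quoteBySplittingOnDoubleQuotes(string $literal): string
{
    if ($literal === '') {
        return '""'; // an empty literal needs no concat()
    }
    // Split on double quotes, keep the quotes as pieces, drop empty pieces.
    $pieces = preg_split('/(")/', $literal, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
    $parts = [];
    foreach ($pieces as $piece) {
        // A lone double quote is wrapped in single quotes,
        // everything else (which may contain ') in double quotes.
        $parts[] = $piece === '"' ? '\'"\'' : '"' . $piece . '"';
    }
    // concat() requires at least two arguments.
    return count($parts) === 1 ? $parts[0] : 'concat(' . implode(', ', $parts) . ')';
}

echo quoteBySplittingOnDoubleQuotes('"""x\'x"x""xx'), "\n";
echo quoteBySplittingOnDoubleQuotes('He\'s telling you "Hello world!".'), "\n";
```

Every double quote becomes its own argument, which is why the number of parts grows with the number of double quotes in the input.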
The implementation in this answer, on the other hand, minimizes the number of partial literals by alternating the quote character and quoting as much as possible with each one.
For the sentence from the quoting section:
concat("He's telling you ", '"Hello world!".')
and for the string with many double quotes:
concat('"""x', "'x", '"x""xx')
Implementation
/**
 * Creates a properly quoted XPath 1.0 string literal. It prefers double quotes over
 * single quotes. If both kinds of quotes are used in the literal then it will create a
 * compound expression with concat(), using as few partial strings as possible.
 *
 * Based on {@link https://stackoverflow.com/a/54436185/6229450 hanshenrik's StackOverflow answer}.
 *
 * @param string $literal unquoted literal to use in xpath expression
 * @return string quoted xpath literal for xpath 1.0
 */
public static function quoteXPathLiteral(string $literal): string
{
    // no double quote in the literal: wrap it in double quotes
    $firstDoubleQuote = strpos($literal, '"');
    if ($firstDoubleQuote === false) {
        return '"' . $literal . '"';
    }
    // no single quote in the literal: wrap it in single quotes
    $firstSingleQuote = strpos($literal, '\'');
    if ($firstSingleQuote === false) {
        return '\'' . $literal . '\'';
    }
    // start with the quote that occurs later, so the first part is as long as possible
    $currentQuote = $firstDoubleQuote > $firstSingleQuote ? '"' : '\'';
    $quoted = [];
    $lastCut = 0;
    // cut into largest possible parts that contain exactly one kind of quote;
    // compare against false explicitly, since strpos() can also return a falsy 0
    while (($nextCut = strpos($literal, $currentQuote, $lastCut)) !== false) {
        $quotablePart = substr($literal, $lastCut, $nextCut - $lastCut);
        $quoted[] = $currentQuote . $quotablePart . $currentQuote;
        $currentQuote = $currentQuote === '"' ? '\'' : '"'; // toggle quote
        $lastCut = $nextCut;
    }
    // the remainder contains no more occurrences of the current quote
    $quoted[] = $currentQuote . substr($literal, $lastCut) . $currentQuote;
    return 'concat(' . implode(',', $quoted) . ')';
}
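As a quick sanity check, the following sketch round-trips a few sample strings: it quotes each one, evaluates the resulting expression with DOMXPath, and compares the result with the input. It assumes ext-dom is available. To keep the snippet runnable on its own, the method is copied as a standalone function; the sample strings and the `<root/>` placeholder document are illustrative.

```php
<?php
// Standalone copy of the method from this answer so the snippet runs on its own.
function quoteXPathLiteral(string $literal): string
{
    $firstDoubleQuote = strpos($literal, '"');
    if ($firstDoubleQuote === false) {
        return '"' . $literal . '"';
    }
    $firstSingleQuote = strpos($literal, '\'');
    if ($firstSingleQuote === false) {
        return '\'' . $literal . '\'';
    }
    $currentQuote = $firstDoubleQuote > $firstSingleQuote ? '"' : '\'';
    $quoted = [];
    $lastCut = 0;
    while (($nextCut = strpos($literal, $currentQuote, $lastCut)) !== false) {
        $quoted[] = $currentQuote . substr($literal, $lastCut, $nextCut - $lastCut) . $currentQuote;
        $currentQuote = $currentQuote === '"' ? '\'' : '"'; // toggle quote
        $lastCut = $nextCut;
    }
    $quoted[] = $currentQuote . substr($literal, $lastCut) . $currentQuote;
    return 'concat(' . implode(',', $quoted) . ')';
}

$document = new DOMDocument();
$document->loadXML('<root/>'); // placeholder document, only needed as context
$xpath = new DOMXPath($document);

$samples = [
    'plain',
    'He\'s here',
    'say "hi"',
    'He\'s telling you "Hello world!".',
    '"""x\'x"x""xx',
];

// Collect every sample whose quoted form does not evaluate back to itself.
$failures = [];
foreach ($samples as $sample) {
    if ($xpath->evaluate(quoteXPathLiteral($sample)) !== $sample) {
        $failures[] = $sample;
    }
}
echo count($failures) === 0 ? "all samples round-trip\n" : 'mismatch: ' . implode(' | ', $failures) . "\n";
```

In real queries the quoted literal would be embedded into a larger expression, for example `'//a[text()=' . quoteXPathLiteral($text) . ']'`.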