Guvante has provided a solution to this problem.
However, is the behavior well-defined according to the specification?
The short answer: yes, the solution has well-defined behavior.
C++ specification
The documentation of format_default, which specifies that ECMA rules are used to interpret the format string, points to Section 15.5.4.11 of ECMA-262.
ECMA-262 specification
According to Table 22 in Section 15.5.4.11 of the ECMA-262 specification:
$n
The nth capture, where n is a single digit in the range 1 to 9 and $n is not followed by a decimal digit. If n ≤ m and the nth capture is undefined, use the empty String instead. If n > m, the result is implementation-defined.
$nn
The nnth capture, where nn is a two-digit decimal number in the range 01 to 99. If nn ≤ m and the nnth capture is undefined, use the empty String instead. If nn > m, the result is implementation-defined.
The variable m is defined in the previous paragraph of the same section:
[...] Let m be the number of left capturing parentheses in searchValue (using NcapturingParens as specified in 15.10.2.1).
Replacement string in the question "$10xNUM"
Back at the code in the question:
cout << regex_replace( "foobar0x1", regex_a, "$10xNUM" ) << endl;
Since $1 is followed by 0, the replacement has to be interpreted under the second rule, $nn, because the first rule requires that $n not be followed by a decimal digit. However, since the pattern has only 2 capturing groups (m = 2) and 10 > 2, the behavior is implementation-defined according to the specification.
We can see the effect of the implementation-defined clause by comparing the result of functionally equivalent JavaScript code in Firefox 37.0.1:
> "foobar0x1".replace(/(.*)bar(.*)/g, "$10xNUM" )
< "foo0xNUM"
As you can see, Firefox decided to interpret $10 as the value of the first capturing group, $1, followed by the literal character 0. This is a valid implementation according to the specification, because the $nn clause leaves the result implementation-defined when nn > m.
Replacement string in Guvante's answer: "$010xNUM"
As above, the $nn clause applies, since the $n clause forbids any digit to follow. Since 01 in $01 does not exceed the number of capturing groups (1 ≤ m = 2), the behavior is well-defined: the content of capturing group 1 is used in the replacement, and the remaining 0xNUM is copied literally.
Therefore, Guvante's answer will return the same result on any conforming C++ implementation.