2

At work, I ran into a problem where users of our application were receiving messages containing an invalid Unicode character (0xffff), which, according to the standard, should never be mapped to a symbol.

As a quick workaround I did the following:

badStr.replace(/\uffff/g, " ");

Which works as expected, and lets the user continue using the application until we find a better solution.

However, while I was playing around with this, I randomly tried a replacement string of "$$$$", which somehow got collapsed to "$$".

You can see for yourself. Try pasting the following lines in your browser's URL bar:

javascript: alert(String.fromCharCode(0xffff).replace(/\uffff/g, "@@@@"));

results in @@@@

but

javascript: alert(String.fromCharCode(0xffff).replace(/\uffff/g, "$$$$"));

results in $$

This actually seems to happen with any string replacement that uses $$$$ as the replacement string.

Both:

javascript: alert(String.fromCharCode(0x1234).replace(/\u1234/g, "$$$$"));
javascript: alert("hella".replace("h", "$$$$")); 

result in the $$ collapse.

Any ideas as to why the string replacement behaves this way?

Gopherkhan
  • I do not know an answer to your question, but where does this character come from in the first place? How does it make it into your messaging system? – Pekka Aug 26 '11 at 19:56
  • Executive emails. I'm guessing they're cutting and pasting things, with multiple utf encodings, and somehow they're ending up with this. – Gopherkhan Aug 26 '11 at 20:02

3 Answers

7

That's because $ in the replace string has special meaning (group expansion). Have a look at this example:

alert("foo".replace(/(.*)/, "a$1b"));

That's why $$ is interpreted as $, for the case where you would need to actually replace something by $1 (literally, without group expansion):

alert("foo".replace(/(.*)/, "a$$1b"));

See also https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/String/replace#Specifying_a_string_as_a_parameter.
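As a rough sketch of what this means for the "$$$$" case in the question (the variable names here are only illustrative): either double every dollar sign, or pass a function as the second argument, since a function's return value is inserted as-is without any $ expansion.

```javascript
// Illustrative sketch: how the replacement argument treats "$".
const input = "hella";

// "$$" in a replacement string is an escape for one literal "$",
// so "$$$$" is inserted as "$$".
const collapsed = input.replace("h", "$$$$");

// Doubling every dollar sign inserts four literal "$" characters.
const doubled = input.replace("h", "$$$$$$$$");

// The return value of a replacement function is inserted verbatim,
// with no "$" pattern expansion.
const viaFunction = input.replace("h", () => "$$$$");

console.log(collapsed);   // $$ella
console.log(doubled);     // $$$$ella
console.log(viaFunction); // $$$$ella
```

The function form is probably the easier one to read when the replacement text is not known in advance.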

Wladimir Palant
2

The $ sign is a special character in the replacement argument; it denotes sub-matches from parentheses in the regex pattern ($1, $2, etc.). So to get a literal $ you have to "escape" it by typing $$, which is what you did twice.

pulsar
1

The $ in a replace string is used to signal the use of the match groups $1, $2, etc., so to put a $ into the replace string you need to use two of them.
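If the replacement text comes from somewhere else and may contain dollar signs, the doubling can be done programmatically. This is a minimal sketch; `escapeReplacement` is a made-up helper name, not a built-in function.

```javascript
// Hypothetical helper (not part of the language): double every "$" so
// the text is inserted literally when used as a replacement string.
function escapeReplacement(text) {
  // "$$$$" is itself a replacement string here: it expands to "$$",
  // so each "$" in the input becomes "$$".
  return text.replace(/\$/g, "$$$$");
}

const literal = "$$$$";
const result = "hella".replace("h", escapeReplacement(literal));
console.log(result); // $$$$ella

// Group references are neutralised as well.
const noGroup = "foo".replace(/(.*)/, escapeReplacement("a$1b"));
console.log(noGroup); // a$1b
```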

HBP