
I'm trying to write a regex that will replace all invalid characters in a JavaScript variable name with underscores (in Java).

What I want to do is:

String jsVarName = "1inva>idName".replaceAll("[a-zA-Z_$][0-9a-zA-Z_$]", "_");

and end up with a variable named _inva_idName.

What I'm struggling with is how to treat the first character differently from the others.

[a-zA-Z_$] for the first character and [0-9a-zA-Z_$] for the rest are the characters I want to keep, but I can't figure out how to hook them into the correct syntax. I know JS variable names can be full Unicode, but I only care about ASCII.

Justin
Kong
  • related: http://stackoverflow.com/questions/1661197/valid-characters-for-javascript-variable-names – zamnuts Oct 30 '14 at 03:37
  • since the title is somewhat confusing, note that a javascript variable name can contain far more characters than just `0-9a-zA-Z_$` – phil294 Aug 15 '21 at 15:21

1 Answer

String jsVarName = "1inva>idName".replaceAll("^[^a-zA-Z_$]|[^0-9a-zA-Z_$]", "_");

Note that since \w is [a-zA-Z_0-9], it can be simplified:

String jsVarName = "1inva>idName".replaceAll("^[^a-zA-Z_$]|[^\\w$]", "_");
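As a quick check of that simplification, the sketch below compares the explicit and the \w-based patterns on a few inputs. The class name WordClassCheck, the constant names, and the sample strings are made up for illustration. It assumes Java's default regex mode, where \w is ASCII-only; with the UNICODE_CHARACTER_CLASS flag set, \w matches more than [a-zA-Z_0-9] and the two patterns would no longer be interchangeable.

```java
import java.util.regex.Pattern;

class WordClassCheck {
    // Explicit form from the answer.
    static final Pattern EXPLICIT = Pattern.compile("^[^a-zA-Z_$]|[^0-9a-zA-Z_$]");
    // Shorthand form: \w stands in for [a-zA-Z_0-9] (default, non-Unicode mode).
    static final Pattern SHORTHAND = Pattern.compile("^[^a-zA-Z_$]|[^\\w$]");

    // Returns true when both patterns produce the same sanitized string.
    static boolean sameResult(String input) {
        String a = EXPLICIT.matcher(input).replaceAll("_");
        String b = SHORTHAND.matcher(input).replaceAll("_");
        return a.equals(b);
    }

    public static void main(String[] args) {
        // Hypothetical sample inputs, chosen only for illustration.
        for (String s : new String[] {"1inva>idName", "$ok_9", "a-b c", ""}) {
            System.out.println("\"" + s + "\" -> " + sameResult(s));
        }
    }
}
```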

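^[^a-zA-Z_$] matches a single character that is not in [a-zA-Z_$] and appears at the beginning of the string (without the MULTILINE flag, ^ anchors only at the start of the input). | is OR. [^0-9a-zA-Z_$] matches any single character that is not in [0-9a-zA-Z_$], at any position. A leading digit is in the second set but not the first, so only the first alternative catches it; everywhere else digits are left alone.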
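A minimal sketch of how the two alternatives behave, wrapped in a helper. The class and method names (JsNameSanitizer, sanitize) are hypothetical; the pattern is the one given above. The last line runs the pattern from the question for comparison: it is not negated, so it consumes pairs of valid characters instead of single invalid ones.

```java
class JsNameSanitizer {
    // Hypothetical helper: replaces every character that is invalid at its
    // position in an ASCII-only JavaScript identifier with an underscore.
    static String sanitize(String name) {
        return name.replaceAll("^[^a-zA-Z_$]|[^0-9a-zA-Z_$]", "_");
    }

    public static void main(String[] args) {
        // Leading digit hits the first alternative, '>' hits the second.
        System.out.println(sanitize("1inva>idName")); // _inva_idName
        // A digit is only invalid in the first position.
        System.out.println(sanitize("9lives"));       // _lives
        System.out.println(sanitize("a1"));           // a1
        // The question's original pattern replaces pairs of valid characters.
        System.out.println("1inva>idName".replaceAll("[a-zA-Z_$][0-9a-zA-Z_$]", "_")); // 1__>___
    }
}
```

The pattern only checks characters: an empty input stays empty, and reserved words such as function pass through unchanged, so callers that need a guaranteed-usable name would have to handle those cases separately.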

See regex tutorial for more info.

Justin
  • Oh you were talking in relation with my answer, sorry I'm a little slow atm. I thought the character class after `|` would also match the beginning of the string without negating the `^` but apparently it works without it. – Fabrício Matté Nov 05 '13 at 05:18
  • Oh damn, I didn't realize the first group is contained in the second group. +1, your answer is much cleaner. – Fabrício Matté Nov 05 '13 at 05:20