We have been using the following JavaScript regex to strip all non-alphanumeric characters apart from - and +:

outputString = outputString.replace(/[^\w|^\+|^-]*/g, "");

However, it doesn't work entirely: it fails to remove the ^ and | characters. I suspect this has something to do with ^ and | being used as meta-characters in the regex itself.

I've tried switching to [\W|^+|^-], but that removes the - and + as well. I thought a lookahead assertion might be the answer, but I'm not sure how to write one.

Has anyone got an idea how to accomplish this?

Phil Baines

1 Answer

Character classes do not support alternation, which is why the | is treated literally. The ^ only negates the class when it is the first character inside the brackets; anywhere else it is treated literally too.
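A quick way to see this is to run the pattern from the question on a sample string. This is only a sketch: the function name stripWithOriginal and the input string are made up for illustration.

```javascript
// The pattern from the question. Only the first ^ negates the class;
// the later ^ and | are ordinary members of the excluded set, so the
// class means "anything except word characters, |, ^, + and -".
const originalPattern = /[^\w|^\+|^-]*/g;

// Hypothetical helper wrapping the original replace call.
function stripWithOriginal(input) {
  return input.replace(originalPattern, "");
}

console.log(stripWithOriginal("a^b|c+d-e!f")); // "a^b|c+d-ef"
```

Only the ! is removed; the ^ and | survive because the class itself lists them as characters to keep.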

Use this:

[^\w+-]+

(Also, if - is not the first or last character in the class, it needs to be escaped as \- so that it is not read as a range operator. Be careful if more characters might be added to the exception list.)
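As a rough sketch of both points, here is the corrected class next to a variant that also keeps a dot. The function names and sample inputs are hypothetical; the third function shows what can go wrong when a character is appended after an unescaped hyphen.

```javascript
// Corrected pattern: one negated class, with the hyphen placed last.
function keepWordPlusMinus(input) {
  return input.replace(/[^\w+-]+/g, "");
}

// Adding "." to the exception list: escape the hyphen so it stays literal.
function keepWordPlusMinusDot(input) {
  return input.replace(/[^\w+\-.]+/g, "");
}

// Pitfall: here "+-." is read as a range from "+" to ".",
// which also covers the comma, so commas are kept by accident.
function keepWithAccidentalRange(input) {
  return input.replace(/[^\w+-.]+/g, "");
}

console.log(keepWordPlusMinus("a^b|c+d-e!f"));        // "abc+d-ef"
console.log(keepWordPlusMinusDot("v1.2-beta+3 (x)")); // "v1.2-beta+3x"
console.log(keepWordPlusMinusDot("a,b;c"));           // "abc"
console.log(keepWithAccidentalRange("a,b;c"));        // "a,bc"
```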

You could also do it with a negative lookahead like this:

(?![+-])\W

Note: You do not want a * or + after that \W, since the lookahead only checks the character at the position where the match starts (and the g flag already makes replace process every match in the string).
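To illustrate why the quantifier matters, the sketch below compares the lookahead pattern with and without a trailing +. Both function names and the sample input are invented for this example.

```javascript
// Lookahead version: each non-word character is checked individually.
function stripWithLookahead(input) {
  return input.replace(/(?![+-])\W/g, "");
}

// With a quantifier, the lookahead only guards the first character of
// each run, so a + or - later in the same run is removed as well.
function stripWithGreedyLookahead(input) {
  return input.replace(/(?![+-])\W+/g, "");
}

console.log(stripWithLookahead("a !+ b-c"));       // "a+b-c"
console.log(stripWithGreedyLookahead("a !+ b-c")); // "ab-c"
```

In the second call the run " !+ " starts with a space, passes the lookahead, and is consumed whole, taking the + with it.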

Also note that \w and \W treat _ as a word character. If you want underscores removed as well, use (?![+-])[\W_] (or use explicit ranges in the first expression).
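The underscore behaviour can be sketched as follows. The function names are hypothetical, and the explicit-range version assumes that only ASCII letters and digits should be kept, which matches what \w covers in JavaScript when no flags other than g are used.

```javascript
// \w includes "_", so the first pattern leaves underscores in place.
function stripKeepingUnderscore(input) {
  return input.replace(/[^\w+-]+/g, "");
}

// Lookahead variant that also removes underscores.
function stripIncludingUnderscore(input) {
  return input.replace(/(?![+-])[\W_]/g, "");
}

// Same effect with explicit ranges instead of \w.
function stripWithExplicitRanges(input) {
  return input.replace(/[^A-Za-z0-9+-]+/g, "");
}

console.log(stripKeepingUnderscore("snake_case+1-2!"));   // "snake_case+1-2"
console.log(stripIncludingUnderscore("snake_case+1-2!")); // "snakecase+1-2"
console.log(stripWithExplicitRanges("snake_case+1-2!"));  // "snakecase+1-2"
```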

Peter Boughton
  • Hi Peter, thanks - that's great. [^\w+-]+ worked just great! And thanks for the additional information - very helpful. – Phil Baines Jul 05 '10 at 09:17