I found many solutions online for regex matching the first occurrence of a string, a certain character, a word, etc., but I have yet to find one for matching the first occurrence of each character in a SET of characters (or in my case, of each character NOT in a set).

For example, I have the following string (in JavaScript):

// Backslashes are written as \\ so that the string contains a literal backslash.
var testString = '~!@#$%^&*()_+|}{POIUYTREWQ":?><asdfghjklm,./;[]\\=-0987654321`~!@#$%^&*()_+|}{POIUYTREWQ":?><asdfghjklm,./;[]\\=-0987654321`~!@#$%^&*()_+|}{POIUYTREWQ":?><asdfghjklm,./;[]\\=-0987654321`';

As you can see, testString contains many occurrences of unusual characters.

I set up a regex match to show me which characters are the offending ones:

var regTest = /[^A-Za-z0-9.,?()@\[\]\-\/ ]/g;
var wrongChar = testString.match(regTest);

Now, my problem is that even though wrongChar nicely returns an array of the non-matched characters, it gives me every occurrence of the characters, as below:

~,!,#,$,%,^,&,*,_,+,|,},{,",:,>,<,;,\,=,`,~,!,#,$,%,^,&,*,_,+,|,},{,",:,>,<,;,\,=,`,~,!,#,$,%,^,&,*,_,+,|,},{,",:,>,<,;,\,=,`

Is there a quick way to get only the FIRST occurrence of every unwanted character (such as a change to my regex), or would I have to create two arrays and keep testing whether a character has already been saved inside wrongChar (the long method)?

1 Answer

To get only one occurrence (the first unwanted character found), make the regex non-global.

To get each character only once, just remove duplicates from the wrongChar result array:

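// Sort a copy so that equal characters become adjacent, then keep each one once.
// match() returns null when nothing matches, hence the fallback to [].
var singleChars = (wrongChar || []).slice().sort().reduce(function(res, x) {
    if (x !== res[res.length - 1])
        res.push(x);
    return res;
}, []);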
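
If the order in which the characters first appear matters, or `reduce` is not available, a plain loop with a lookup object is one option. This is only a minimal sketch: the helper name `firstUnwantedChars` and the short sample string are assumptions made for illustration, not part of the snippet above.

```javascript
// Hypothetical helper: returns each character matched by `pattern` once,
// in the order of its first appearance in `str`.
function firstUnwantedChars(str, pattern) {
    var seen = {};                            // characters already recorded
    var result = [];
    var matches = str.match(pattern) || [];   // match() returns null on no match
    for (var i = 0; i < matches.length; i++) {
        var ch = matches[i];
        if (!Object.prototype.hasOwnProperty.call(seen, ch)) {
            seen[ch] = true;
            result.push(ch);
        }
    }
    return result;
}

// Same character class as in the question, applied to a short sample string.
var sample = 'a~b!c~d!e#';
var unwanted = firstUnwantedChars(sample, /[^A-Za-z0-9.,?()@\[\]\-\/ ]/g);
// unwanted is ['~', '!', '#']
```

Unlike the sort-based version, this keeps the characters in the order they were first encountered and does not reorder the match array.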
Bergi
  • sorry, but I don't really understand the difference between 'one occurrence' and 'each character only once'. Anyway, making the regex non-global only yields me the first unwanted character it finds, rather than the first time for every character. – QwertyForever Jan 11 '13 at 03:12
  • Thanks for replying! I gather from your answer that there is really no shortcut to giving me an array of unwanted characters once? – QwertyForever Jan 11 '13 at 03:14
  • Yes, by "only one occurrence" I mean "only the first unwanted char it finds". However, I still don't see what you mean by "getting the first time" of every character - you just get *that* it matched *somewhere*. – Bergi Jan 11 '13 at 03:15
  • Like having only one of each character, as per your second point in this post. Oh, never mind. It's semantics and complicated stuff... – QwertyForever Jan 11 '13 at 03:19
  • I have to add that this reduce function requires the extra code found at http://www.tutorialspoint.com/javascript/array_reduce.htm if the browser doesn't support JavaScript 1.8. – QwertyForever Jan 11 '13 at 03:59
  • Of course you could code it manually as well; you need not use my snippet with the ES5 [`Array.prototype.reduce`](https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/Array/Reduce). There are [many](http://stackoverflow.com/q/1960473/1048572) [ways](http://stackoverflow.com/q/9229645/1048572) to [remove](http://stackoverflow.com/q/2218999/1048572) [duplicates](http://stackoverflow.com/q/840781/1048572)… – Bergi Jan 11 '13 at 04:16
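
On the question of whether a change to the regex alone could do this: one possibility, not suggested anywhere in the thread, is to capture the unwanted character and add a negative lookahead with a backreference, so that a character only matches when it does not appear again later in the string. A rough sketch, assuming a hypothetical wrapper named `uniqueUnwantedChars`:

```javascript
// Hypothetical wrapper: each unwanted character is matched only at its LAST
// occurrence, so every such character appears exactly once in the result.
function uniqueUnwantedChars(str) {
    // (...) captures one unwanted character;
    // (?![\s\S]*\1) rejects it if the same character occurs again later.
    var pattern = /([^A-Za-z0-9.,?()@\[\]\-\/ ])(?![\s\S]*\1)/g;
    return str.match(pattern) || [];   // match() returns null on no match
}

var once = uniqueUnwantedChars('a~b!c~d!e#');
// once is ['~', '!', '#']
```

Note that the result is ordered by last appearance rather than first, and the lookahead rescans the rest of the string for every candidate, so this may be slow on long inputs compared with removing duplicates after matching.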