4

I want to remove the special characters that occur in a string, except the first one. I wrote the code below, and it works as long as the special characters are not next to each other.

special char set = '❶❷❸❹❺❻❼❽❾❿➀'

My current code is:

let str = '❶Hi dear how❽ are❺ ❽you❼';
const len = str.length;
for(let i = 0; i < len; i++){
  if(i !== 0){
    if(str[i] === '❶' || str[i] === '❷' || str[i] === '❸' ||
    str[i] === '❹' || str[i] === '❺' || str[i] === '❻' || str[i] === '❼' ||
    str[i] === '❽' || str[i] === '❾' || str[i] === '❿' || str[i] === '➀'){
      str = str.replace(str[i], '');
    }
  }
}

console.log('output: ', str);

The above code works well, but if I change str as shown below, it no longer works:

let str = '❶Hi dear how❽ are❺❼ ❽❽you❼';

Expected output:

❶Hi dear how are you

It would be better if this could be solved with a regex, provided it is faster than my solution.

Muhammad
  • In my question I intend to keep a character from the `special char set` only if it is at `index 0`. – Muhammad May 17 '20 at 08:30

4 Answers

3

https://jsben.ch/Hcjqp

Skip the first character and do a straight replace on the rest of the string. The benchmark is linked above.

const replace = str => str[0] + str.slice(1).replace(/[❶-➀]+/g, '');
let str = '❶Hi dear how❽ are❺ ❽you❼'.repeat(2000);
console.log(replace(str));
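
One edge case worth noting: for an empty string, `str[0]` is `undefined`, so the one-liner above would return the text "undefined". Below is a minimal sketch of a guarded variant, assuming the same character set as in the question. The names `SPECIAL_RUN` and `replaceExceptFirst` are made up for illustration.

```javascript
// Hypothetical names; the character class lists the question's set explicitly.
const SPECIAL_RUN = /[❶❷❸❹❺❻❼❽❾❿➀]+/g;

function replaceExceptFirst(str) {
  // slice(0, 1) yields '' for an empty string, unlike str[0], which is undefined.
  return str.slice(0, 1) + str.slice(1).replace(SPECIAL_RUN, '');
}

console.log(replaceExceptFirst('❶Hi dear how❽ are❺❼ ❽❽you❼')); // ❶Hi dear how are you
```

The first character is kept whether or not it is special, which matches the "index 0" reading of the question.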
user120242
  • It won't replace _any_ first occurrence of a special character. – raina77ow May 17 '20 at 07:50
  • Isn't that what he asked for? If he wanted just the first character, you could just do a replace ignoring the first char. And probably none of these solutions are going to help: it's doubtful the regex will solve his performance problem. – user120242 May 17 '20 at 07:55
  • You're right, it's not quite clear from the question itself. The code shown by OP checks the first letter of the string though. – raina77ow May 17 '20 at 07:56
  • All three answers work for me on Chrome. I am using `react-native` and just want to pick the fastest. I will upvote the answers, but I have to accept the fastest and most stable one. – Muhammad May 17 '20 at 07:59
  • If you only need to check the first character, it'll be faster to check the first character separately and ignore it in a straight replace. If not, a lookbehind will be fastest, but won't work on Safari. The performance difference shouldn't be noticeable unless you're doing several 100k+ replacements. – user120242 May 17 '20 at 08:20
  • Thank you I am using it in `react-native` both on `ios and android` – Muhammad May 17 '20 at 08:36
  • Actually, now that I'm looking at it, even for a non-first character, slicing from the first occurrence and then doing a straight replace would still be faster. – user120242 May 17 '20 at 08:43
  • Thank you very much. I am checking with `performance.now()`, and this looks about the same as @GolamMazid's solution. Please let me check it in the react-native app. – Muhammad May 17 '20 at 08:47
  • Also, I replaced `[❶-➀]` with `[❶❷❸❹❺❻❼❽❾❿➀]`. – Muhammad May 17 '20 at 08:48
  • Shouldn't make a difference; it's the same thing. – user120242 May 17 '20 at 08:49
  • did you test on react-native for ios? lookbehind should have problems there – user120242 May 17 '20 at 08:50
  • +1 from me; I like separating the concerns here. Using starry negated lookbehind for overriding the replacement for the first character is bad both for performance and clarity. – raina77ow May 17 '20 at 09:07
  • Thank you, yeah tested and works on both `android and ios` – Muhammad May 17 '20 at 09:10
  • overlooked one small change that seems to make a difference on very large strings. replacing segments can be faster if they occur – user120242 May 17 '20 at 09:17
  • All the answers were great, and I am thankful to everyone who answered. As for performance, this answer and @GolamMazid's answer have the same speed on Chrome, and both were faster than the other answers, but @GolamMazid's answer does not work on `react-native android and ios v:0.61.5`, so for now this is my accepted answer. – Muhammad May 17 '20 at 09:27
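
The idea mentioned in the comments, slicing at the first occurrence and then doing a straight replace, can be sketched roughly as follows. It covers the other reading of the question (keep the first special character wherever it appears) and does not rely on lookbehind. The function name `keepFirstSpecial` is made up for illustration.

```javascript
// Hypothetical helper: keeps the first special character wherever it occurs
// and removes every later one, without relying on lookbehind support.
function keepFirstSpecial(str) {
  const first = str.search(/[❶❷❸❹❺❻❼❽❾❿➀]/);
  if (first === -1) return str; // no special character, nothing to remove

  const head = str.slice(0, first + 1); // everything up to and including the first match
  const tail = str.slice(first + 1).replace(/[❶❷❸❹❺❻❼❽❾❿➀]+/g, '');
  return head + tail;
}

console.log(keepFirstSpecial('Hi ❽dear❺ you❼')); // Hi ❽dear you
```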
2

The bug is subtle: when you remove a character from your string, you should also decrease the index, like this:

str = str.replace(str[i--], '');

Without the decrement, removing the character at index i shifts the next character into that slot, and the loop's i++ then skips over it. That's why your original code failed to remove the repeated 'blacklisted' characters.
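
One more caveat, offered as an observation rather than as part of the fix above: `str.replace(str[i], '')` removes the first occurrence of that character in the whole string, which is not necessarily the one at index `i` (for example, when the string starts with the same character). A minimal loop-based sketch that sidesteps both issues by building a new string is shown below. The names `SPECIAL` and `stripSpecialExceptFirst` are made up for illustration.

```javascript
// Hypothetical names. All characters in the set are single UTF-16 code units,
// so indexing with str[i] is safe here.
const SPECIAL = new Set('❶❷❸❹❺❻❼❽❾❿➀');

function stripSpecialExceptFirst(str) {
  let out = '';
  for (let i = 0; i < str.length; i++) {
    // Always keep index 0; elsewhere keep only non-special characters.
    if (i === 0 || !SPECIAL.has(str[i])) {
      out += str[i];
    }
  }
  return out;
}

console.log(stripSpecialExceptFirst('❶Hi dear how❽ are❺❼ ❽❽you❼')); // ❶Hi dear how are you
```

Because the input string is never mutated during the loop, no index adjustment is needed.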

And yes, that's easy to do with regex replace:

const blacklistCharacterClass = /[❶❷❸❹❺❻❼❽❾❿➀]/g;
const rawString = '❶Hi dear how❽ are❺❼ ❽❽you❼';
const refinedString = rawString.replace(blacklistCharacterClass, (c, i) => i ? '' : c);
console.log(refinedString); // ❶Hi dear how are you
raina77ow
  • Thank you very much for your solution; it works very well, and I upvoted your answer. Could you also solve it with a `regex`, as that is part of what I asked for? The current solution is a little slow on a long string, for example 10k chars. – Muhammad May 17 '20 at 07:41
  • @Muhammad Added the regex option. – raina77ow May 17 '20 at 07:50
  • Thank you very much. I am now testing all the answers and have to select the one that does the job `faster`. – Muhammad May 17 '20 at 07:53
  • Can you guess which is faster, your first solution or your second? – Muhammad May 17 '20 at 08:08
  • Measure it with jsperf or jsbench. Are you actually running into a performance problem? – user120242 May 17 '20 at 08:11
2

The regex below uses a lookbehind, so it will not work in Firefox and Safari...

regex: ([❶❷❸❹❺❻❼❽❾❿➀])(?<!^[❶❷❸❹❺❻❼❽❾❿➀])

It will replace all special characters except the first...

Code:

const str = '❶Hi dear how❽ are❺❼ ❽❽you❼';

console.log(str.replace(/([❶❷❸❹❺❻❼❽❾❿➀])(?<!^[❶❷❸❹❺❻❼❽❾❿➀])/gm,''))
GolamMazid Sajib
  • (?<!^) should be enough actually. But this won't work in Firefox and Safari (lookbehinds [are not supported](https://caniuse.com/#feat=js-regexp-lookbehind) there yet) anyway. – raina77ow May 17 '20 at 07:48
  • Thank you, let me test it. Your answer looks like the shorter way; I hope it also does the job faster. – Muhammad May 17 '20 at 07:51
  • Good answer. Consider dropping the capture group and, for better readability, put the lookbehind before the match. – Cary Swoveland May 17 '20 at 08:22
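
The simplifications suggested in these comments (no capture group, lookbehind placed before the match, and `(?<!^)` instead of the longer lookbehind) might look like the sketch below. It assumes an engine with lookbehind support. The `m` flag is dropped on purpose: with it, `^` would also match at the start of every line, so a special character at the start of a later line would be kept. The variable names are made up for illustration.

```javascript
// Sketch of the simplified pattern: "a special character that is not at index 0".
// No `m` flag, so `^` only matches at the very start of the string.
const notAtStart = /(?<!^)[❶❷❸❹❺❻❼❽❾❿➀]/g;

const rawInput = '❶Hi dear how❽ are❺❼ ❽❽you❼';
const cleaned = rawInput.replace(notAtStart, '');
console.log(cleaned); // ❶Hi dear how are you
```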
2

I have assumed that the first of the special characters is not necessarily at the beginning of the line.

You can convert matches of the following regular expression to empty strings.

(?<=[❶❷❸❹❺❻❼❽❾❿➀].*)[❶❷❸❹❺❻❼❽❾❿➀]


The regex reads, "match one of the characters of interest that is preceded, anywhere earlier on the line, by another character of interest" (since those are the ones to be replaced with an empty string).

Note that the positive lookbehind cannot be anchored to the beginning of the line.
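
As a rough illustration of the regex above in JavaScript, assuming an engine that supports lookbehind (JavaScript lookbehinds may contain variable-length patterns such as `.*`). The sample string and variable names are made up for illustration.

```javascript
// Matches a special character only if another special character
// appears somewhere before it on the same line.
const afterFirstSpecial = /(?<=[❶❷❸❹❺❻❼❽❾❿➀].*)[❶❷❸❹❺❻❼❽❾❿➀]/g;

const sample = 'Hi ❽dear❺ ❼you❼'; // hypothetical input: first special char is not at index 0
const result = sample.replace(afterFirstSpecial, '');
console.log(result); // Hi ❽dear you
```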


Since writing the above I've learned that I've misinterpreted the question (see comments below). I will leave it, however.

One way to do the replacements when it is known that the first character of the string is one of the special characters is to replace each match of the regular expression

(?<=.)[❶❷❸❹❺❻❼❽❾❿➀]

with an empty string; that is, replace each special character that is preceded by another character (and therefore is not the first character in the string).
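
A small sketch of this second regex, with one caveat that is an observation rather than part of the answer: without the `s` (dotAll) flag, `.` does not match a line terminator, so a special character immediately after a newline would be left in place. The sample string and variable names are made up for illustration.

```javascript
const multiLine = '❶first line❽\n❺second line'; // hypothetical two-line input

// Without the `s` flag, `.` does not match '\n',
// so the ❺ right after the newline is kept.
const withoutDotAll = multiLine.replace(/(?<=.)[❶❷❸❹❺❻❼❽❾❿➀]/g, '');

// With the `s` (dotAll) flag, every special character after index 0 is removed.
const withDotAll = multiLine.replace(/(?<=.)[❶❷❸❹❺❻❼❽❾❿➀]/gs, '');

console.log(JSON.stringify(withoutDotAll));
console.log(JSON.stringify(withDotAll));
```

For single-line input the two variants behave the same.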

Cary Swoveland
  • Thank you very much. Your answer feels faster and is shorter, and I need to accept the fastest answer. But one thing here does not match my expectation: as I said in my question, only the first char, meaning index 0, should remain if it is one of the special chars. – Muhammad May 17 '20 at 08:15
  • for now I just want to up vote your answer – Muhammad May 17 '20 at 08:16
  • If you insist. With your intended interpretation of "first char" I believe @GolamMazid Sajib has a good regex solution (though I don't think the capture group in his regex is necessary). Best not to edit your question to clarify "first char" as it would make my answer incorrect. You might, however, add a comment to your question that you had intended "first char" to mean the first character in the string. – Cary Swoveland May 17 '20 at 08:21
  • thank you and Ok I will add only a comment under my question to clarify it – Muhammad May 17 '20 at 08:27
  • To be honest, I really think it's easier to go with @user120242 approach both for performance and clarity. True, regexp engine should optimize this, but it's implementation-specific. As a sidenote, can't that be simplified to `(?<!^)`? – raina77ow May 17 '20 at 09:12