If the input string is
Cat fish bannedword bread bánnedword mouse bãnnedword
It should output
Cat fish bread mouse
What would be the best way to do this without slowing down the performance?
There are a number of ways to do this, but none of them (at least as far as I know) comes without some performance cost.
The most obvious way is to remove the accented characters first and then use a simple string.Replace(). For removing accented characters, this or this Stack Overflow question should help you.
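As a rough sketch of that first approach, assuming the accents are ordinary combining diacritics: decompose the string with Normalize(NormalizationForm.FormD), drop the non-spacing marks, then call Replace(). The names AccentStripping, RemoveDiacritics and RemoveBanned are made up for illustration, and bannedWord is assumed to be passed in without accents.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class AccentStripping
{
    // Decomposes accented characters (e.g. 'á' becomes 'a' + U+0301) and drops the combining marks.
    public static string RemoveDiacritics(string text)
    {
        string decomposed = text.Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(c);
            }
        }
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }

    // Hypothetical helper: strips accents, removes the banned word, then collapses leftover spaces.
    public static string RemoveBanned(string input, string bannedWord)
    {
        string stripped = RemoveDiacritics(input).Replace(bannedWord, "");
        return string.Join(" ", stripped.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));
    }
}
```

Two caveats with this sketch: it strips the accents from every word, not just the banned one, and Replace() also removes the banned word when it appears inside a longer word.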
Another approach is to split the string into an array of words and then drop each word that equals 'bannedword', using a comparison option that makes string.Compare() ignore accents.
Something like:
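// Requires: using System.Globalization; using System.Text;
string[] splitInput = input.Split(' ');
StringBuilder output = new StringBuilder();
foreach (string word in splitInput)
{
    // string.Compare returns 0 when the strings are equal;
    // CompareOptions.IgnoreNonSpace ignores combining accents.
    if (string.Compare(word, bannedWord, CultureInfo.CurrentCulture, CompareOptions.IgnoreNonSpace) != 0)
    {
        // Re-insert the space that Split removed.
        if (output.Length > 0) output.Append(' ');
        output.Append(word);
    }
}
string s_output = output.ToString();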
// I haven't tested this in Visual Studio, so there might be mistakes. A LINQ query could also simplify it (and potentially enable parallelization).
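For what it's worth, a LINQ version of the same idea might look roughly like this. BannedWordFilter and Remove are illustrative names, and using the invariant culture is my assumption; swap in CultureInfo.CurrentCulture if culture-specific rules matter.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class BannedWordFilter
{
    // Hypothetical helper: keeps every word that does not equal bannedWord when accents are ignored.
    public static string Remove(string input, string bannedWord)
    {
        CompareInfo compareInfo = CultureInfo.InvariantCulture.CompareInfo;
        var kept = input
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(word => compareInfo.Compare(word, bannedWord, CompareOptions.IgnoreNonSpace) != 0);
        return string.Join(" ", kept);
    }
}
```

Calling BannedWordFilter.Remove(input, "bannedword") on the string from the question should give back the four remaining words separated by single spaces. Combining the option with CompareOptions.IgnoreCase would also catch differently cased variants, if that is wanted.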
Finally, it should be possible to come up with a clever regex solution, which is probably the fastest way. Not being a regex expert, I can't help you much with that, but this might point you in the right direction if you know at least something about regexes.
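One possible shape for such a regex, offered as an untested-in-production sketch rather than a definitive answer: decompose the input so accents become separate combining marks, then build a pattern that allows any number of marks (\p{M}*) after each letter of the banned word. BannedWordRegex and Remove are made-up names, and I have not measured whether this is actually faster than the loop.

```csharp
using System;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

public static class BannedWordRegex
{
    // Hypothetical helper: removes whole-word matches of bannedWord, accented or not.
    public static string Remove(string input, string bannedWord)
    {
        // Each letter of the banned word may be followed by combining marks.
        string letters = string.Concat(
            bannedWord.Select(c => Regex.Escape(c.ToString()) + @"\p{M}*"));
        string pattern = @"\b" + letters + @"\b";

        // FormD turns 'á' into 'a' followed by U+0301, so the pattern can match it.
        string decomposed = input.Normalize(NormalizationForm.FormD);
        string cleaned = Regex.Replace(decomposed, pattern, "");

        // Collapse the spaces left behind and restore composed characters in the other words.
        cleaned = Regex.Replace(cleaned, @"\s+", " ").Trim();
        return cleaned.Normalize(NormalizationForm.FormC);
    }
}
```

Unlike stripping accents from the whole string, this keeps the accents of the words that are not banned, and the \b anchors stop it from cutting the banned word out of longer words.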