
I have a string:

125DF885DF44é112846522FF001

I want to remove é from the string entirely. When I search online, the solutions I find only strip the accent from é and return e.

The diacritic character can appear anywhere in the string, not at a fixed position, and there can be more than one.

How do I remove those?

Dmitry Bychenko
Alisha

2 Answers

2

You can use `Replace`:

string s = "125DF885DF44é112846522FF001";
string s1 = s.Replace("é","");
SlobodanT
  • OMG, it didn't even occur to me to try the Replace method. Thank you so much. – Alisha May 17 '22 at 06:56
  • If you have more than just é, you can put the characters in an array or list, loop through that collection, and replace each one. Regex is also a nice approach, as @Jeremy Lakeman mentioned. – SlobodanT May 17 '22 at 07:00
  • You probably want `string s1 = s.Replace("é","e");` (note the `"e"`) to *substitute* `"e"` for `"é"` rather than remove `"é"` entirely. – Dmitry Bychenko May 17 '22 at 07:11
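The loop suggested in the comments could look roughly like the sketch below. The class name `ReplaceDemo`, the helper `RemoveAll`, and the list of characters passed to it are illustrative assumptions, not part of the original answer.

```csharp
using System;

static class ReplaceDemo
{
    // Hypothetical helper: removes every string listed in `unwanted` from `source`.
    // Each entry must be non-empty, because string.Replace rejects an empty search value.
    public static string RemoveAll(string source, params string[] unwanted)
    {
        foreach (string u in unwanted)
        {
            source = source.Replace(u, "");
        }
        return source;
    }
}
```

For example, `ReplaceDemo.RemoveAll("125DF885DF44é112846522FF001", "é", "è", "ü")` returns `125DF885DF44112846522FF001`. This only works when the set of characters to remove is known in advance.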
2

In the general case, we can remove the characters in the Unicode NonSpacingMark category:

  1. Decompose each character into a base character plus its combining mark(s) (the diacritics)
  2. Remove the marks
  3. Recompose the remaining characters

Code:

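using System.Globalization; // CharUnicodeInfo, UnicodeCategory
using System.Linq;
using System.Text;          // NormalizationForm

...

string source = "125DF885DF44é112846522FF001";

// Decompose (FormD), drop the combining marks, then recompose (FormC)
string result = string.Concat(source
    .Normalize(NormalizationForm.FormD)
    .Where(c => CharUnicodeInfo.GetUnicodeCategory(c) !=
                UnicodeCategory.NonSpacingMark))
  .Normalize(NormalizationForm.FormC);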
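Note that the code above turns é into e. If the accented character should disappear entirely, as the question asks, one possible variant is to walk the string one text element at a time and drop every element whose decomposed form contains a non-spacing mark. This is a minimal sketch, not part of the original answer; the names `DiacriticRemover` and `RemoveAccentedCharacters` are hypothetical.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;

static class DiacriticRemover
{
    // Hypothetical helper: drops every text element (base character plus any
    // combining marks) whose FormD decomposition contains a NonSpacingMark.
    public static string RemoveAccentedCharacters(string source)
    {
        var sb = new StringBuilder();
        TextElementEnumerator elements = StringInfo.GetTextElementEnumerator(source);
        while (elements.MoveNext())
        {
            string element = elements.GetTextElement();
            bool hasMark = element
                .Normalize(NormalizationForm.FormD)
                .Any(c => CharUnicodeInfo.GetUnicodeCategory(c) ==
                          UnicodeCategory.NonSpacingMark);
            if (!hasMark)
            {
                sb.Append(element); // keep only elements without diacritics
            }
        }
        return sb.ToString();
    }
}
```

With the string from the question, `DiacriticRemover.RemoveAccentedCharacters("125DF885DF44é112846522FF001")` returns `125DF885DF44112846522FF001`. Characters that have no canonical decomposition, such as ø, are left untouched by this approach.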
Dmitry Bychenko