I want to replace accented letters with equivalent basic letters using regular expressions.
Example: Û --> U
I have seen solutions where they search for ALL accented letters. But I'm looking for a better and more direct solution like :
someone suggested this in JAVA:
\p{Diacritic}/gu
or this in python
def remove_diacritics(text):
"""
Returns a string with all diacritics (aka non-spacing marks) removed.
For example "Héllô" will become "Hello".
Useful for comparing strings in an accent-insensitive fashion.
"""
normalized = unicodedata.normalize("NFKD", text)
return "".join(c for c in normalized if unicodedata.category(c) != "Mn")
But Im looking for a way without having to go through normalization