There are multiple Unicode characters that are visually similar, such as:
":" and "꞉" U+A789
"?" and "？" U+FF1F
"*" and "⁎" U+204E
"'", "`", "‘", "’", and "ʻ"
There are also characters with and without diacritical marks, such as:
"c" and "ç" U+00E7
"E" and "É" U+00C9
"I" and "İ" U+0130
"i" and "ı" U+0131
I'd like to compare text from various sources so that words that are effectively the same compare as equal, such as:
"naive" and "naïve"
"facade" and "façade"
"Hawai'i" and "Hawaiʻi"
"don't" and "don’t"
"letter 'A'" and "letter ‘A’" and "letter `A'"
"letter "B"" and "letter “B”"
Is there a module that provides tests of equivalence between such characters?
import unicode  # hypothetical module
if x != y and not unicode.same_character(x, y):
    ...  # only now treat x and y as different
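For reference, here is a minimal sketch of the kind of comparison I have in mind, using only the standard-library unicodedata module. The names fold, same_text, and LOOKALIKES are hypothetical, and the look-alike table is a hand-picked assumption covering only the examples above, not a complete list. The idea is that NFKD normalization handles the diacritics and the fullwidth question mark, while characters with no Unicode decomposition (curly quotes, the ʻokina, dotless ı, and so on) need an explicit mapping.

```python
import unicodedata

# Hypothetical, hand-picked table for look-alikes that Unicode normalization
# leaves untouched. It only covers the examples in this post.
LOOKALIKES = str.maketrans({
    "\ua789": ":",   # MODIFIER LETTER COLON
    "\u204e": "*",   # LOW ASTERISK
    "\u2018": "'",   # LEFT SINGLE QUOTATION MARK
    "\u2019": "'",   # RIGHT SINGLE QUOTATION MARK
    "\u02bb": "'",   # MODIFIER LETTER TURNED COMMA (Hawaiian okina)
    "`": "'",        # GRAVE ACCENT used as an opening quote
    "\u201c": '"',   # LEFT DOUBLE QUOTATION MARK
    "\u201d": '"',   # RIGHT DOUBLE QUOTATION MARK
    "\u0131": "i",   # LATIN SMALL LETTER DOTLESS I
})


def fold(text):
    """Reduce text to a comparison key (case is left unchanged)."""
    # NFKD splits accented letters into base letter + combining mark and
    # replaces compatibility forms such as fullwidth punctuation.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Map the remaining look-alikes through the hand-picked table.
    return stripped.translate(LOOKALIKES)


def same_text(x, y):
    """Return True if x and y are equal after folding."""
    return fold(x) == fold(y)


if __name__ == "__main__":
    print(same_text("naive", "na\u00efve"))
    print(same_text("Hawai'i", "Hawai\u02bbi"))
    print(same_text("letter 'A'", "letter `A'"))
```

This works for my examples, but maintaining the table by hand does not scale, which is why I am asking whether an existing module already covers this.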