Some examples:
These characters are too short or overlap the surrounding characters:
/b5/ີ/foo
/31/ั/foo
/39/᤹/foo
/a3/ᮣ/foo
These are too wide to fit into a monospace character slot:
/4b/ോ/foo
/23/ᠣ/fo
/61/ᡡ/foo
/86/ᢆ/foo
/ba/຺/foo
Blank, whitespace, and invisible characters would also count as ones that don't fit well in a URL.
I am wondering if there is a simple way to figure out which characters fall into these categories:
- Fits well in a URL (Latin characters, Chinese characters, etc.).
- Too large for monospace (Chinese characters, the above examples, etc.).
- Combining character, or overlaps the surrounding URL characters (examples above).
Maybe there is a way to tell this programmatically by checking some property of the Unicode character, so I don't need to go through each character individually and visually check which category it falls into.
Mainly I am looking for which characters either (a) need to be placed on another character (combining characters), or (b) need some extra padding, like the examples above, so you can see them in the URL.
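To make the idea concrete, here is a rough sketch of the kind of check I have in mind, using Python's standard `unicodedata` module. The function name `classify` and the bucket labels are made up for illustration, and the mapping from Unicode properties to "fits / too wide / overlaps" is only my assumption: the General Category can identify combining marks and invisible characters, and East Asian Width can flag CJK-style wide characters, but whether something like a Mongolian letter overflows its slot seems to depend on the font rather than on any property I know of.

```python
import unicodedata


def classify(ch: str) -> str:
    """Roughly bucket one code point by how it may render in a monospace URL.

    The bucket names are hypothetical labels, not an established scheme.
    """
    cat = unicodedata.category(ch)  # two-letter General Category, e.g. "Mn"

    # Non-spacing and enclosing marks are drawn on the preceding character.
    if cat in ("Mn", "Me"):
        return "combining"
    # Spacing combining marks take up width but still expect a base character.
    if cat == "Mc":
        return "spacing-mark"
    # Separators (space, line, paragraph), control and format characters.
    if cat.startswith("Z") or cat in ("Cc", "Cf"):
        return "invisible"
    # Unassigned, private-use, and surrogate code points.
    if cat in ("Cn", "Co", "Cs"):
        return "unassigned"
    # Wide or Fullwidth characters usually occupy two monospace cells.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return "wide"
    # Everything else is assumed to fit; this is a guess, not a guarantee.
    return "normal"


if __name__ == "__main__":
    # A few samples: Latin letter, Thai mark, Malayalam vowel sign, CJK, ZWSP.
    for ch in ["a", "\u0e31", "\u0d4b", "\u4e2d", "\u200b"]:
        print(f"U+{ord(ch):04X}", unicodedata.category(ch), classify(ch))
```

With something like this, a character in the "combining" or "spacing-mark" bucket would be a candidate for case (a), while the "wide" bucket only partly covers case (b). The results also depend on the Unicode version bundled with the Python interpreter, so newer characters may come back as unassigned.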