I stumbled upon http://mortoray.com/2013/11/27/the-string-type-is-broken/
And to my horror...
print(len('noe\u0308l')) # returns 5 not 4
However, I found https://stackoverflow.com/a/14682498/1267259, which covers normalizing Unicode:
import unicodedata
print(len(unicodedata.normalize('NFC', 'noe\u0308l'))) # prints 4
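To make the before/after explicit, here is a minimal sketch of what I think is going on, assuming Python 3. The variable names are mine, and the 'q' example is just one I picked because, as far as I know, there is no precomposed "q with diaeresis" in Unicode, so NFC has nothing to merge it into:

```python
import unicodedata

# 'e' followed by U+0308 COMBINING DIAERESIS: two code points, one visible character
decomposed = 'noe\u0308l'
# NFC merges the pair into the single precomposed code point U+00EB
composed = unicodedata.normalize('NFC', decomposed)

print(len(decomposed))            # 5
print(len(composed))              # 4
print(composed == 'no\u00ebl')    # True

# Assumed counter-example: no precomposed form exists, so NFC leaves two code points
no_precomposed = unicodedata.normalize('NFC', 'q\u0308')
print(len(no_precomposed))        # 2
```

So NFC seems to help only when a precomposed code point exists, which is part of why I'm asking what else is needed.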
But what do I do with Schrödinger's cats?
print(len('😸😾')) # prints 4, not 2
(Side question: when I try to save this in my text editor, I get "utf-8 codec can't encode character x in position y: surrogates not allowed", but in the command prompt I can paste and run code with those characters. I assume this is because the cats exist on a different quantum level (the SMP), but how do I normalize them?)
Is there anything else I should do to make sure all characters are counted as "1"?
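
For the cats specifically, one workaround I have been considering is to count code points through a UTF-32 round trip instead of calling len() directly. This is only a sketch under my own assumptions: the helper name code_point_count is made up, and it relies on UTF-32 using exactly four bytes per code point. It does not address combining characters, so it would have to be used together with normalization:

```python
def code_point_count(text):
    """Count code points via UTF-32, which uses exactly 4 bytes per code point.

    'utf-32-le' is used so that no byte order mark is prepended.
    A lone (unpaired) surrogate raises UnicodeEncodeError in Python 3.
    """
    return len(text.encode('utf-32-le')) // 4

# The two cats written as escapes, so the source file stays ASCII-only
cats = '\U0001F638\U0001F63E'
print(code_point_count(cats))
```

Is something like this reasonable, or is there a more standard way?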