I've noticed something odd about for loops iterating over Python strings. For character strings (str), each iteration yields a string containing a single character.
>>> for c in 'Hello':
...     print(type(c), repr(c))
...
<class 'str'> 'H'
<class 'str'> 'e'
<class 'str'> 'l'
<class 'str'> 'l'
<class 'str'> 'o'
For byte strings, you get integers.
>>> for c in b'Hello':
...     print(type(c), repr(c))
...
<class 'int'> 72
<class 'int'> 101
<class 'int'> 108
<class 'int'> 108
<class 'int'> 111
Why do I care? I'd like to write a function that takes either a file or a string as input. For a text file/character string, this is easy; you just use two loops. For a string input the second loop is redundant, but it works anyway.
def process_chars(string_or_file):
    for chunk in string_or_file:
        for char in chunk:
            # do something with char
            pass
You can't do the same trick with binary files and byte strings: iterating over a byte string yields ints, and an int is not iterable, so the inner loop raises a TypeError.
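To make the failure concrete, here is a minimal sketch of the same two-loop pattern applied to binary input. The name process_bytes is made up for this illustration, and io.BytesIO is only standing in for a real file opened in binary mode.

```python
import io


def process_bytes(bytes_or_file):
    """Same two-loop pattern as process_chars, collecting each item seen."""
    seen = []
    for chunk in bytes_or_file:
        for byte in chunk:  # fails when chunk is an int
            seen.append(byte)
    return seen


# A binary file yields bytes objects (lines), so the inner loop works.
from_file = process_bytes(io.BytesIO(b'Hi\n'))

# A byte string yields ints, so the inner loop raises TypeError.
try:
    process_bytes(b'Hi')
    error = None
except TypeError as exc:
    error = str(exc)
```

With a binary file the function returns the individual byte values, while the plain byte string never gets past the first inner loop.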