In Python 2 (which the session below uses), byte sequences are represented as ordinary strings. The letters and symbols you see when you print a byte sequence are merely a printable representation of the bytes the string contains. To make use of this data, you usually convert it into a more useful representation.
You can use ord(x) to get the decimal value of a single byte, and then pass that value to hex() or bin() to get its hexadecimal or binary representation:
>>> f = open('/tmp/IMG_5982.JPG', 'rb')
>>> data = f.read(10)
>>> data
'\x00\x00II*\x00\x08\x00\x00\x00'
>>> data[2]
'I'
>>> ord(data[2])
73
>>> hex(ord(data[2]))
'0x49'
>>> bin(ord(data[2]))
'0b1001001'
>>> f.close()
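To apply the same conversions to every byte at once, here is a minimal sketch. The helper name describe_bytes is made up for illustration. It relies on the fact that iterating over a bytearray yields integers in both Python 2 and Python 3, so no explicit ord() call is needed.

```python
def describe_bytes(data):
    """Return (decimal, hex, binary) for each byte in data.

    describe_bytes is a hypothetical helper, not a standard function.
    """
    # bytearray() yields integers on iteration in Python 2 and Python 3
    return [(value, hex(value), bin(value)) for value in bytearray(data)]


# The bytes 'I' and '*' from the sample data above
for decimal, hexadecimal, binary in describe_bytes(b'I*'):
    print(decimal, hexadecimal, binary)
```

For the byte 'I' this gives 73, '0x49' and '0b1001001', matching the interactive session.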
The 'b' flag that you pass to open() does not tell Python anything about how to represent the file contents. From the docs:
Append 'b' to the mode to open the file in binary mode, on systems that differentiate between binary and text files; on systems that don’t have this distinction, adding the 'b' has no effect.
If you want to do more than look at the raw bytes, Mark Pilgrim's book, Dive Into Python, has an example of working with binary file formats. The example shows how to read ID3v1 tags from an MP3 file. The book's website seems to be down, so I'm linking to a mirror.
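For a rough idea of what that kind of parsing looks like, here is a sketch using the struct module. This is not the code from the book. It assumes the usual ID3v1 layout: a 128-byte block at the end of the file that starts with 'TAG', followed by title, artist, album (30 bytes each), year (4 bytes), comment (30 bytes) and a one-byte genre. The function names and the latin-1 decoding are my own choices.

```python
import os
import struct

# Assumed ID3v1 layout: 3 + 30 + 30 + 30 + 4 + 30 + 1 = 128 bytes
ID3V1_FORMAT = '3s30s30s30s4s30sB'


def parse_id3v1(tag):
    """Parse a 128-byte ID3v1 block; return a dict, or None if it is not a tag."""
    if len(tag) != 128 or tag[:3] != b'TAG':
        return None
    _, title, artist, album, year, comment, genre = struct.unpack(ID3V1_FORMAT, tag)

    def clean(field):
        # Fields are padded with NUL bytes (sometimes spaces); cut at the first NUL
        return field.split(b'\x00', 1)[0].strip().decode('latin-1')

    return {
        'title': clean(title),
        'artist': clean(artist),
        'album': clean(album),
        'year': clean(year),
        'comment': clean(comment),
        'genre': genre,  # numeric genre code
    }


def read_id3v1(path):
    """Read the last 128 bytes of the file at path and parse them."""
    with open(path, 'rb') as f:
        f.seek(0, os.SEEK_END)
        if f.tell() < 128:
            return None  # file is too short to hold a tag
        f.seek(-128, os.SEEK_END)
        return parse_id3v1(f.read(128))
```

Calling read_id3v1() with the path to an MP3 file returns a dictionary of the tag fields, or None if the file has no ID3v1 tag.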