If you want to modify the lsb of your bytes, there is no point in converting the value to a binary string. Effectively, you would be doing something along these lines (in pseudocode):
byte = '\x68'
binary = convert_to_bits(byte) # some way of getting 1s and 0s in a string
binary = binary[:7] + my_bit_string
byte = convert_to_byte(binary)
There is a more direct and efficient way to modify a bit value: bitwise operators. For example, let's say we want to change 01001001 (decimal 73) to 01001000. We create the bitmask 11111110, which is 254 in decimal, and AND (&) it with our value.
>>> value = 73 & 254
>>> value
72
>>> '{0:08b}'.format(value)
'01001000'
When you embed a bit into a byte, the lsb may or may not change. There are many ways to go about it, but the most direct is to zero out the lsb and then overwrite it with your bit using an OR (|), which is also very versatile if you want to embed in multiple bits.
byte = (byte & 254) | my_bit
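To illustrate the multiple-bit case, here is a minimal sketch that generalises the mask-and-OR idea to the lowest `nbits` bits. The function name `embed_low_bits` and its parameters are made up for this example.

```python
def embed_low_bits(byte, payload, nbits=1):
    """Overwrite the `nbits` least significant bits of `byte` with `payload`.

    Hypothetical helper for illustration; `byte` is assumed to be in 0..255.
    """
    low_mask = (1 << nbits) - 1    # nbits=2 -> 0b00000011
    keep_mask = 0xFF ^ low_mask    # nbits=2 -> 0b11111100
    return (byte & keep_mask) | (payload & low_mask)


print(embed_low_bits(73, 0))        # 72, same as (73 & 254) | 0
print(embed_low_bits(73, 0b10, 2))  # 74, i.e. 01001001 -> 01001010
```

With `nbits=1` this reduces to the one-liner above, since `keep_mask` is then 254.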
You could also zero out the lsb with a right shift (>>) followed by a left shift (<<), but this takes two operations instead of one.
byte = ((byte >> 1) << 1) | my_bit
Or you could check whether the lsb and your bit are different and flip it with an XOR (^). However, this method uses branches and is the least efficient.
if (byte & 1) != my_bit:
    byte = byte ^ 1
# no need to do anything if they are the same
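As a quick sanity check, the three approaches can be wrapped in functions and compared over every byte value. The function names are only for this sketch.

```python
def set_lsb_mask(byte, bit):
    # zero out the lsb with a mask, then OR in the new bit
    return (byte & 254) | bit


def set_lsb_shift(byte, bit):
    # zero out the lsb with a right shift followed by a left shift
    return ((byte >> 1) << 1) | bit


def set_lsb_xor(byte, bit):
    # flip the lsb only when it differs from the new bit
    if (byte & 1) != bit:
        byte ^= 1
    return byte


# All three should agree for every byte value and both bit values.
for byte in range(256):
    for bit in (0, 1):
        assert set_lsb_mask(byte, bit) == set_lsb_shift(byte, bit) == set_lsb_xor(byte, bit)
```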
So, all you need to do is convert your bytes to an array of integers. You could use [ord(byte) for byte in frame], but there are more efficient built-in ways. With bytearray() and bytes():
>>> frame = '\x0f\x02\x0e\x02\xf7\x00\xf7\x00T\xffT\xff'
>>> frame_bytes = bytearray(frame)
>>> frame_bytes[0]
15
>>> frame_bytes[0] = 14 # modify
>>> bytes(frame_bytes) # convert back to bytes
'\x0e\x02\x0e\x02\xf7\x00\xf7\x00T\xffT\xff'
With array.array() (this seems to be a tiny wee bit slower for hundreds of thousands of bytes):
>>> import array
>>> frame = '\x0f\x02\x0e\x02\xf7\x00\xf7\x00T\xffT\xff'
>>> frame_bytes = array.array('B', frame)
>>> frame_bytes[0]
15
>>> frame_bytes[0] = 14 # modify
>>> frame_bytes.tostring() # convert back to bytes; in Python 3 use `tobytes()`
'\x0e\x02\x0e\x02\xf7\x00\xf7\x00T\xffT\xff'
Example of embedding and extracting.
frame = '\x0f\x02\x0e\x02\xf7\xf7T\xffT\xff'
bits = [0, 0, 1, 1, 0]
# Embedding
frame_bytes = bytearray(frame)
for i, bit in enumerate(bits):
    frame_bytes[i] = (frame_bytes[i] & 254) | bit
frame_modified = bytes(frame_bytes)
# Extraction
frame_bytes = bytearray(frame_modified)
extracted = [frame_bytes[i] & 1 for i in range(5)]
assert bits == extracted
If your secret is a string or a series of bytes, it's easy to convert it to a list of 1s and 0s.
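One possible way to do that conversion, as a sketch rather than the only approach: the helper names `bytes_to_bits` and `bits_to_bytes` are made up here, and emitting the most significant bit first is an arbitrary choice that the extractor simply has to match.

```python
def bytes_to_bits(data):
    """Return the bits of `data`, most significant bit of each byte first."""
    bits = []
    for byte in bytearray(data):          # bytearray yields integers
        for shift in range(7, -1, -1):    # bit 7 down to bit 0
            bits.append((byte >> shift) & 1)
    return bits


def bits_to_bytes(bits):
    """Inverse of bytes_to_bits; len(bits) is assumed to be a multiple of 8."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


secret = b'Hi'
bits = bytes_to_bits(secret)
print(bits)  # [0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
assert bits_to_bytes(bits) == secret
```

If the secret is a text string rather than raw bytes, encode it first (for example with `.encode('utf-8')`) before passing it in.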
Finally, make sure you don't modify any header data, as that may make the file unreadable.
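A simple way to respect that is to start embedding at an offset past the header. The sketch below assumes a fixed-size header; `HEADER_SIZE`, the 4-byte value, and the `b'HEAD'` sample data are all hypothetical, and the real offset depends on your file format.

```python
HEADER_SIZE = 4  # hypothetical; look up the actual header length for your format


def embed_after_header(data, bits, header_size=HEADER_SIZE):
    """Embed `bits` in the lsbs of `data`, leaving the first `header_size` bytes alone."""
    buf = bytearray(data)
    if header_size + len(bits) > len(buf):
        raise ValueError("not enough bytes after the header to hold all bits")
    for i, bit in enumerate(bits):
        pos = header_size + i
        buf[pos] = (buf[pos] & 254) | bit
    return bytes(buf)


data = b'HEAD\x0f\x02\x0e\x02\xf7'
stego = embed_after_header(data, [0, 0, 1, 1, 0])
assert stego[:HEADER_SIZE] == data[:HEADER_SIZE]   # header untouched
```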