Your approach won't work. All you're doing is converting the ones and zeros of the encrypted string into `\x01` and `\x00` bytes, neither of which is printable.
Anyway, this isn't really encryption; it's just a flawed binary encoding where different inputs can give you the same output:
encode('0p') == '1100001110000'
encode('a0') == '1100001110000'
encode('\x01!\x18\x00') == '1100001110000'
encode('password') == '11100001100001111001111100111110111110111111100101100100'
encode('p0<|>>?%$') == '11100001100001111001111100111110111110111111100101100100'
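The collisions above can be reproduced with a short sketch. The `encode` function here is an assumed reconstruction of the one in the question (each character's code point written in binary with no padding, then concatenated); adjust it if the original differs.

```python
def encode(text):
    # Assumed reconstruction: unpadded binary of each code point, concatenated.
    return ''.join(format(ord(ch), 'b') for ch in text)

# Three different inputs collapse to the same output because the
# chunk boundaries are lost once the bits are joined together.
for candidate in ['0p', 'a0', '\x01!\x18\x00']:
    print(repr(candidate), encode(candidate))
```

Because the chunks have variable width and no separator, nothing in the output records where one character ends and the next begins.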
To decode the output of this function, you need to make some assumptions about the original input.
To start with, it might be reasonable to assume that the message contains only printable ASCII characters, with code points in the range from 32 to 126. In that case, each input character will be encoded as a chunk of either six or seven binary digits in the output, and each chunk will start with `1` (because `format(ord(i), 'b')` won't begin with `0` unless `ord(i) == 0`).
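As a quick sanity check of that claim, the following sketch computes the chunk widths for every code point from 32 to 126. The variable names are only illustrative.

```python
# Binary representation (no padding) of every printable ASCII code point.
chunks = [format(code, 'b') for code in range(32, 127)]

# Distinct chunk lengths that can occur under the printable-ASCII assumption.
widths = {len(chunk) for chunk in chunks}

# Every chunk should begin with a 1 bit, since none of the values is zero.
all_start_with_one = all(chunk.startswith('1') for chunk in chunks)

print(sorted(widths))       # [6, 7]
print(all_start_with_one)   # True
```

Code points 32 to 63 produce six bits and 64 to 126 produce seven, which is what makes the two-way choice at each step possible.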
For example, suppose you want to decode the following string:
11010001100101110110011011001101111
The first chunk must be seven bits; otherwise the next chunk would begin with `0`, which is impossible.
1101000 1100101110110011011001101111
For the next chunk, it looks like we can consume 6 bits and leave a string that begins with `1`:
1101000 110010 1110110011011001101111
But if we do that, it becomes impossible to extract 6 or 7 further bits and leave a string that starts with `1`:
### invalid ###
1101000 110010 111011 0011011001101111
### invalid ###
1101000 110010 1110110 011011001101111
This tells us that an earlier choice was wrong: the second chunk must also be seven bits. Using a backtracking algorithm, you can identify all the valid partitionings of the encoded data.
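One way to write such a backtracking search is sketched below. It assumes the printable-ASCII range discussed earlier; the function name `decode_candidates` and its `low`/`high` parameters are made up for this illustration.

```python
def decode_candidates(bits, low=32, high=126):
    """Yield every string with code points in [low, high] whose
    unpadded binary encoding concatenates to `bits`."""
    def walk(pos, chars):
        if pos == len(bits):
            yield ''.join(chars)      # consumed everything: valid partitioning
            return
        if bits[pos] != '1':
            return                    # a chunk cannot start with 0: backtrack
        for width in (6, 7):          # try both possible chunk widths
            end = pos + width
            if end > len(bits):
                continue
            code = int(bits[pos:end], 2)
            if low <= code <= high:
                chars.append(chr(code))
                yield from walk(end, chars)
                chars.pop()           # undo the choice before trying the next

    yield from walk(0, [])


print(list(decode_candidates('11010001100101110110011011001101111')))
print(list(decode_candidates('1100001110000')))
```

For the 35-bit example above, only the split into five 7-bit chunks survives, giving `hello`, while the 13-bit string from the start of this answer yields both `0p` and `a0`. The search is recursive, so a very long input may need an iterative rewrite or a higher recursion limit.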
Since a long encoded string could have a large number of valid partitionings, you might also need to rank each possible input string based on heuristics such as the number of alphabetic characters minus the number of non-alphabetic characters. In the earlier example, `password` would score 8 points, but `p0<|>>?%$` would score −7. The correct input string is likely to be among the highest-scoring alternatives.
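The letter-count heuristic could look something like this. The `score` helper is a hypothetical name, and this is only one of many possible ranking functions.

```python
def score(candidate):
    # Alphabetic characters count +1, everything else counts -1.
    letters = sum(ch.isalpha() for ch in candidate)
    return letters - (len(candidate) - letters)


candidates = ['p0<|>>?%$', 'password']
ranked = sorted(candidates, key=score, reverse=True)

for text in ranked:
    print(score(text), text)
```

A more robust ranking might also weigh spaces, digits, or dictionary words, but even this crude score separates the two colliding inputs clearly.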