I just started with Python and want to find the binary code for each character in a text file. The problem is that when I print the binary, every value contains a "b".

file = open("textfile.txt","w")
file.write("Hello World ")
file.write("This our new text file")
file.write("and this is another line. ")
file.write("Why? Because we can.")
file.close()
with open("textfile.txt") as file:
    data=file.readline()
data_vector = list(data)
binary_data_vector = map(bin, bytearray(data_vector))
print(binary_data_vector)

This is the output I am currently getting:

['0b1001000', '0b1100101', '0b1101100', '0b1101100', '0b1101111', '0b100000', '0b1010111', '0b1101111', '0b1110010', '0b1101100', '0b1100100', '0b100000', '0b1010100', '0b1101000', '0b1101001', '0b1110011', '0b100000', '0b1101111', '0b1110101', '0b1110010', '0b100000', '0b1101110', '0b1100101', '0b1110111', '0b100000', '0b1110100', '0b1100101', '0b1111000', '0b1110100', '0b100000', '0b1100110', '0b1101001', '0b1101100', '0b1100101', '0b1100001', '0b1101110', '0b1100100', '0b100000', '0b1110100', '0b1101000', '0b1101001', '0b1110011', '0b100000', '0b1101001', '0b1110011', '0b100000', '0b1100001', '0b1101110', '0b1101111', '0b1110100', '0b1101000', '0b1100101', '0b1110010', '0b100000', '0b1101100', '0b1101001', '0b1101110', '0b1100101', '0b101110', '0b100000', '0b1010111', '0b1101000', '0b1111001', '0b111111', '0b100000', '0b1000010', '0b1100101', '0b1100011', '0b1100001', '0b1110101', '0b1110011', '0b1100101', '0b100000', '0b1110111', '0b1100101', '0b100000', '0b1100011', '0b1100001', '0b1101110', '0b101110']

So my question is: how can I get rid of the "b" so that it prints exactly 8 bits for each character? And if you know why this happens, please explain!

PM 2Ring
Kylian
  • `bin(..)` generates a **string** for an integer of the form `0b...` with the remainder bits. This is done so you can immediately inject the output back in the interpreter. – Willem Van Onsem Mar 23 '17 at 12:49
  • Some string slicing and list comprehension? `bits = [just_bits[2:] for just_bits in binary_data_vector]` – Avantol13 Mar 23 '17 at 12:50
  • http://stackoverflow.com/questions/41641091/concatenate-numbers-in-binary – Ma0 Mar 23 '17 at 12:59
  • I am using Python 2.7.13 – Kylian Mar 23 '17 at 13:00
  • BTW, you should not use `file` as a variable name in Python 2 because that shadows the built-in `file` type. Shadowing the built-in type names (like `list`, `str`, `set`, `file`, etc) can lead to mysterious bugs with cryptic error messages. – PM 2Ring Mar 23 '17 at 13:09

1 Answer

The bin function returns the binary representation of a number as a string, prefixed with 0b, which makes the result directly usable as an integer literal in Python source code.

It is not what we want most of the time, so one valid approach would be to strip the first 2 characters of each string. But then there is another issue with bin: it emits only as many digits as the number needs. ASCII characters therefore come out with at most 7 binary digits (a space is just 100000), when we usually want all 8 bits of the byte.
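
To make both problems concrete, here is a minimal sketch of the strip-and-pad approach. The helper name to_bits is made up for illustration, and the padding is done with str.zfill:

```python
def to_bits(number, width=8):
    """Return the binary digits of `number`, left-padded with zeros to `width`."""
    digits = bin(number)[2:]    # drop the leading '0b'
    return digits.zfill(width)  # pad on the left with zeros

print(bin(ord("H")))      # 0b1001000 (7 digits after the prefix)
print(bin(ord(" ")))      # 0b100000  (only 6 digits)
print(to_bits(ord("H")))  # 01001000
print(to_bits(ord(" ")))  # 00100000
```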

The alternative is to use string formatting to create the representation. The .format string method lets you ask for the binary (rather than decimal) representation of a number, and it also lets you specify how many digits you want. In the format spec 08b, b selects binary output, 8 is the minimum width, and the leading 0 asks for zero padding:

binary_data_vector = ["{:08b}".format(number) for number in bytearray(data_vector)]

(Take time to look at list comprehensions. They look strange at first, but they are often more readable than the map and filter functions. The line above, for example, would need a lambda function if it were written with map as you had it.)
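
For comparison, here is a sketch of the same conversion written both ways. It uses a short literal byte string as a stand-in for the file contents, and it wraps map in list because under Python 3 map returns a lazy iterator:

```python
data = b"Hi!"  # stand-in for the bytes read from the file

# List comprehension, as in the answer's line
with_comprehension = ["{:08b}".format(number) for number in bytearray(data)]

# Equivalent map call, which needs a lambda to carry the format string
with_map = list(map(lambda number: "{:08b}".format(number), bytearray(data)))

print(with_comprehension)              # ['01001000', '01101001', '00100001']
print(with_comprehension == with_map)  # True
```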

glibdud
jsbueno
  • Another thing: I'd suggest learning Python 3 instead of Python 2 at this point. I can infer you are using Python 2 because in Python 3 your "casting" to bytearray would not work as is (but it would not be needed either, if the file is read correctly). – jsbueno Mar 23 '17 at 13:08
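
One possible reading of "if the file is read correctly" in Python 3 is to open the file in binary mode, so that iterating over the data already yields integers. The sketch below assumes that reading; the temporary path and sample content are made up so that it runs on its own:

```python
import os
import tempfile

# Write a small sample file so the sketch is self-contained
path = os.path.join(tempfile.mkdtemp(), "textfile.txt")
with open(path, "w") as out:
    out.write("Hello World ")

# Binary mode returns a bytes object; iterating over it gives integers
with open(path, "rb") as infile:
    data = infile.readline()

binary_data_vector = [format(byte, "08b") for byte in data]
print(binary_data_vector[:2])  # ['01001000', '01100101']
```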