3

Any digital file can be represented as binary data.

I have a 1 MB text file.

I want to convert it to a binary string and view the output as a binary number, and vice versa.

In other words, if I have a binary number, I want to convert it back to a text file.

How can I do that in Python? Is there a standard way to do this?

There are some posts on this forum (1, 2, 3, 4) about this, but none of them properly answers my question.

Michael
  • https://stackoverflow.com/questions/18815820/convert-string-to-binary-in-python – Michael Mar 14 '20 at 22:05
  • Does this answer your question? [Python read text file as binary?](https://stackoverflow.com/questions/30563177/python-read-text-file-as-binary) –  Mar 14 '20 at 22:06
  • @NorthernSage I don't see how it does "if I have binary number, I want to convert it to a text file". – Michael Mar 14 '20 at 22:08
  • Well, both binary and text files contain a series of bits. The bits in text files represent characters, while the bits in binary files represent custom data. So in the end, everything is binary, and what matters is how you interpret that data in your scripts. Converting a "binary number to a text file" is a bit of a confusing statement. Do you want to save a binary representation of a number, in string format, in a text file? Please give some examples of what you want to convert and to what format. –  Mar 14 '20 at 22:45
  • Check this thread as well: https://stackoverflow.com/questions/8928240/convert-base-2-binary-number-string-to-int –  Mar 14 '20 at 22:51
  • The question seems valid; it just should be worded differently so that it is comprehensible to a technical mind who will answer it. – Zimba Oct 02 '20 at 12:42
  • @Zimba feel free to edit. – Michael Oct 02 '20 at 12:43
  • I can answer in other languages, but the question requires Python, so I'll have to come back with Python. – Zimba Oct 02 '20 at 12:52

2 Answers

1

The "text file" mentioned seems to refer to an ASCII file, where each character takes up 8 bits of space.

The second line, "convert it to a binary string", could mean taking the ASCII representation of the text file, which gives a sequence of bytes to be "output as a binary number" (similar to public-key cryptography, where text is converted to a number before encryption), e.g.:

text = 'ABC '
for x in text:
  print(format(ord(x), '08b'), end='')

would give binary (number) string: 01000001010000100100001100100000
which in decimal is: 1094861600
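
The same conversion can also be written without an explicit character loop by encoding the text to bytes first. This is only a minimal sketch, assuming ASCII text; the helper name `text_to_bits` is illustrative and not part of any library:

```python
def text_to_bits(text: str, encoding: str = "ascii") -> str:
    """Return the text as a string of 0s and 1s, 8 digits per byte."""
    data = text.encode(encoding)  # str -> bytes
    return "".join(format(byte, "08b") for byte in data)


bits = text_to_bits("ABC ")
print(bits)          # 01000001010000100100001100100000
print(int(bits, 2))  # 1094861600
```

Python integers have arbitrary precision, so `int(bits, 2)` also works for bit strings much longer than 32 digits.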

The third line would mean splitting a binary number into bytes and displaying the equivalent ASCII character for each 8-bit sequence, e.g. 0x41 is output as 'A'. The assumptions here are that each byte maps to a printable ASCII (i.e. text) character and that the given binary number has a multiple of 8 digits.

E.g., to reverse (convert a binary number to text):

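binary = "01000001010000100100001100100001"
# number of characters in the text (8 bits per character)
num = len(binary) // 8

for x in range(num):
  start = x * 8
  end = (x + 1) * 8
  # parse each 8-bit slice as a base-2 integer, then map it to its character
  print(chr(int(binary[start:end], 2)), end='')
print()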

would give the text: ABC!
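
A compact variant of the reverse direction, given here as a sketch under the same assumptions (ASCII text, bit count a multiple of 8). The name `bits_to_text` is hypothetical and chosen only for this example:

```python
def bits_to_text(bits: str, encoding: str = "ascii") -> str:
    """Decode a string of 0s and 1s (8 digits per byte) back into text."""
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    # Convert each 8-digit slice to one byte value, then decode all bytes at once.
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode(encoding)


print(bits_to_text("01000001010000100100001100100001"))  # ABC!
```

Decoding all bytes in one call, rather than calling `chr` per byte, is what would let the same function handle multi-byte encodings such as UTF-8 if a different `encoding` is passed.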

For a 1 MB text file, you'd split the text string into chunks your machine can handle, e.g. 32 bits, before converting.
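
One way the chunked approach might look for files is sketched below. It assumes Python 3.8+ (for the `:=` operator); the function names, the file names, and the 4096-byte chunk size are illustrative choices, not something prescribed by the answer:

```python
import os
import tempfile

CHUNK_SIZE = 4096  # bytes per read; an arbitrary value chosen for illustration


def file_to_bits(src_path: str, dst_path: str) -> None:
    """Write the contents of src_path to dst_path as '0'/'1' characters."""
    with open(src_path, "rb") as src, open(dst_path, "w", encoding="ascii") as dst:
        while chunk := src.read(CHUNK_SIZE):
            dst.write("".join(format(byte, "08b") for byte in chunk))


def bits_to_file(src_path: str, dst_path: str) -> None:
    """Rebuild the original bytes from a file of '0'/'1' characters."""
    with open(src_path, "r", encoding="ascii") as src, open(dst_path, "wb") as dst:
        # Read a multiple of 8 characters so every chunk holds whole bytes.
        while bits := src.read(CHUNK_SIZE * 8):
            dst.write(bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)))


# Round-trip demonstration using temporary files.
with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "input.txt")
    bits_path = os.path.join(tmp, "bits.txt")
    restored = os.path.join(tmp, "restored.txt")
    with open(original, "w", encoding="ascii") as f:
        f.write("ABC!")
    file_to_bits(original, bits_path)
    bits_to_file(bits_path, restored)
    with open(restored, encoding="ascii") as f:
        print(f.read())  # ABC!
```

Note that the file of '0'/'1' characters is eight times the size of the original, so a 1 MB input produces roughly 8 MB of digits.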

Tested in Python IDE

Zimba
0

See https://docs.python.org/3/library/codecs.html#standard-encodings for a list of standard string encodings, because the conversion depends on the encoding.

The following functions convert between bytes/ints and strings, defaulting to UTF-8.

The example provided uses the Hangul character "한" in UTF-8.


def bytes_to_string(byte_or_int_value, encoding='utf-8') -> str:
    # bytes: decode with the given encoding
    if isinstance(byte_or_int_value, bytes):
        return byte_or_int_value.decode(encoding)
    # int: treat the value as a Unicode code point; the encode/decode
    # round trip fails if the character cannot be represented in `encoding`
    if isinstance(byte_or_int_value, int):
        return chr(byte_or_int_value).encode(encoding).decode(encoding)
    raise ValueError('Error: Input must be a bytes or int type')

def string_to_bytes(string_value, encoding='utf-8') -> bytes:
    # str.encode already returns a bytes object
    if isinstance(string_value, str):
        return string_value.encode(encoding)
    raise ValueError('Error: Input must be a string type')

int_value = 54620               # code point U+D55C
bytes_value = b'\xED\x95\x9C'   # UTF-8 encoding of the same character
string_value = '한'

assert bytes_to_string(int_value) == string_value
assert bytes_to_string(bytes_value) == string_value
assert string_to_bytes(string_value) == bytes_value
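
To illustrate why the encoding matters for the question's binary output, here is a small sketch showing that the same character produces different bit strings under different encodings. The helper name `encoded_bits` is made up for this example:

```python
def encoded_bits(text: str, encoding: str) -> str:
    """Return the encoded bytes of `text` as space-separated 8-bit groups."""
    return " ".join(format(byte, "08b") for byte in text.encode(encoding))


print(encoded_bits("한", "utf-8"))      # 11101101 10010101 10011100
print(encoded_bits("한", "utf-16-be"))  # 11010101 01011100
```

Converting the bits back to text therefore needs the same encoding that produced them.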
ELinda