
I need to turn bytes from a file into a string of 1's and 0's:

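# Read the whole file as raw bytes; the file is closed automatically.
with open("Some_File.jpg", "rb") as file:
    data = file.read()
# some_function is a placeholder for the conversion being asked about.
binary = some_function(data)
print(binary)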
>>> 0100101000010101001...

I've managed to get something equivalent by converting the bytes into Base64 first, but this makes the resulting string very long. Other questions I have looked at are about turning binary strings into bytes, but I can't find any that go the opposite way.

This question was marked as a duplicate of another question, but that question is about turning a string into binary. If I wanted to do that, I would just convert it into Base64, but that makes it far too long. I need a way to turn 'bytes' directly into a string of 1's and 0's.

  • What do you mean by "makes it far too long"? Are you worried about the time it takes to do this, or the literal length of the resulting binary string? The latter will be the same by definition (`8 * number_of_bytes` characters), regardless of how you convert to binary. – Aleon Nov 11 '19 at 09:53
  • "however this makes the size of the string very long" How long are you *expecting* the result to be, and why? And why do you think that "converting bytes directly" would produce a shorter result than converting "a string"? And why are you expecting base64 to be helpful here at all? – Karl Knechtel Nov 11 '19 at 09:54
  • @Aleon Both, really, as the file I'm testing it on is about 4KB, and the result of the file is 49KB, and it takes about a second to go through it –  Nov 11 '19 at 09:56
  • @Karl Knechtel I was using Base64 as a way to turn the bytes into some sort of string, and then turning that string into binary –  Nov 11 '19 at 09:56
  • I'm using the binary string it produced in an algorithm I'm creating that turns binary data into a sort of 'Base 3.5' format that takes up half of the space of a normal string of 1's and 0's –  Nov 11 '19 at 09:59
  • I think you are confused about what data is and how it works. It will take eight 1s and/or 0s to represent a byte in that textual form, and it will take at least one byte to represent either a 1 or a 0 character in the output, so you should expect the output to expand by at least a factor of 8. The top answer on the linked question already shows you how to convert bytes directly (specifically from a `bytearray`, but a `bytes` object works the same way for our purposes). – Karl Knechtel Nov 11 '19 at 10:01
  • There is not really such a thing as "binary data". Data is just data. A sequence of `1` and `0` symbols is a *representation of* the data. – Karl Knechtel Nov 11 '19 at 10:02
  • Apologies, I misread the file sizes, the original image is 13KB, and the text file that contains the binary is 129KB –  Nov 11 '19 at 10:03
  • @KarlKnechtel So, how do I turn that data into a string of the 1's and 0's? –  Nov 11 '19 at 10:04
  • Just to be clear here. The original image is *13KB of data*, regardless of representation. If you convert this to a sequence of *character* 0-s and 1-s, you will get a file with *8 * 13KB size*. – Aleon Nov 11 '19 at 10:05
  • You turn it into a *string* using any of the methods shown in the linked question. (You cannot make a "string literal", because that means something that you typed yourself between quote marks that is part of your source code.) The result is *expected* to be larger than the original file, about eight times as large (or more, depending on the exact formatting you use). – Karl Knechtel Nov 11 '19 at 10:05
  • Yes, however, I'm trying to get the length of the binary string to be as short as possible, as I can still get 104KB minimum –  Nov 11 '19 at 10:06
  • Please show the *complete, exact* code you tried, and the first 100 or so characters of the output. Also indicate how those characters differ from the characters you expect to see. – Karl Knechtel Nov 11 '19 at 10:07
  • https://stackoverflow.com/questions/60579197/python-bytes-to-bit-string#61106380 – Cody Tookode Mar 01 '23 at 19:27
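The size arithmetic from the comments can be checked with a short sketch. The question does not show the code for the Base64 detour, so `via_base64_length` is a hypothetical reconstruction that assumes every Base64 character was expanded to eight bits. The helper names and the stand-in `sample` data are illustrative only.

```python
import base64


def direct_length(data: bytes) -> int:
    # Converting bytes directly yields eight '0'/'1' characters per byte.
    return 8 * len(data)


def via_base64_length(data: bytes) -> int:
    # Assumed detour: Base64 emits 4 characters for every 3 input bytes,
    # and each of those characters is then written out as 8 bits.
    return 8 * len(base64.b64encode(data))


sample = bytes(3000)  # 3000 zero bytes standing in for a real file
print(direct_length(sample))      # 24000
print(via_base64_length(sample))  # 32000
```

Under that assumption the detour costs about 10.7 characters per input byte instead of 8, which is in line with the roughly tenfold growth reported above, and 8 * 13KB matches the 104KB minimum.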

1 Answer


Based on the discussion in the comments, I am not convinced that what you ask about is what you actually want to do. However, assuming you want to convert a Python bytestring to a string of literal zeros and ones, here's one way to do it:

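import itertools


def bytes_to_bits(bytes_to_print: bytes) -> str:
    """Return the bits of every byte, most significant bit first, as '0'/'1' characters."""
    # One list of eight characters per byte; bit 7 (value 128) comes first.
    bits = [
        ["1" if byte & 2 ** i else "0" for i in range(7, -1, -1)]
        for byte in bytes_to_print
    ]
    # Flatten the per-byte lists and join once instead of concatenating repeatedly.
    return ''.join(itertools.chain.from_iterable(bits))


if __name__ == "__main__":
    print(bytes_to_bits(b"ABC"))  # 010000010100001001000011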
Aleon
  • This was exactly what I needed. If you wouldn't mind, could you tell me how I should have worded the question, if this is what I wanted? –  Nov 11 '19 at 10:19
  • Okay, and... how does the result of this code differ from the result of the code you actually tried? – Karl Knechtel Nov 11 '19 at 10:22
  • The binary string is shorter! –  Nov 11 '19 at 10:24
  • @Pomegranates-And-Python difficult to tell you, the question in general suggested that there is a misunderstanding somewhere. Your mention of Base64 for example, leads me to think that you are not entirely familiar with how computers handle data. Your Base64 string is longer, because Base64 "wastes" bits on each byte, just so that the given byte can be represented with printable characters. – Aleon Nov 11 '19 at 10:26
  • I know how computers handle data, I just don't look at python that way, I just think: 'Hey, I need to turn 'x' into 'y', what function does that...? Oh! Base64 makes 'bytes' into a string of letters, and then I turn it into binary!' –  Nov 11 '19 at 10:32
  • I'm not the most 'pythonic' when it comes to my programs, if I'm honest –  Nov 11 '19 at 10:33
  • Updated the answer with an implementation that doesn't use `+` and performs a lot better for long binaries. Python wizards feel free to contribute. – Aleon Nov 11 '19 at 10:45
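For comparison, here is a shorter sketch that is not part of the answer above. It relies on Python's built-in `format` with the `"08b"` specifier, which renders an integer as eight zero-padded binary digits. The names `bytes_to_bits_fmt` and `bits_to_bytes` are made up for this example, and the inverse function is included only to check the round trip.

```python
def bytes_to_bits_fmt(data: bytes) -> str:
    # Iterating over a bytes object yields ints in range(256);
    # "08b" formats each one as exactly eight binary digits.
    return "".join(format(byte, "08b") for byte in data)


def bits_to_bytes(bits: str) -> bytes:
    # Inverse direction: parse each group of eight characters as one byte.
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


bit_string = bytes_to_bits_fmt(b"ABC")
print(bit_string)                 # 010000010100001001000011
print(bits_to_bytes(bit_string))  # b'ABC'
```

The output is always exactly `8 * len(data)` characters long, so no textual encoding of the same bits into '0' and '1' characters can be shorter.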