
I have designed a compression algorithm that operates on the byte-level representation of a file.

I want to do the following procedure:

Read k bytes from the file

Run the algorithm on the byte array of size k, outputting compressed material

Write compressed material to a new file

Repeat until file is exhausted.

How do I read the first k bytes of a file?

Furthermore, I want these bytes to be in binary format. I noticed that Python automatically converts bytes into string characters when I use the open('filename', 'rb') method. I want to actually see the bytes as bits (e.g. 0101101), not as integers, strings, etc.

Then I want to directly write, in this binary format, to a new file.

Sidharth Ghoshal
  • How you "see" the bytes is not relevant to what data they contain. If you open a file in binary mode you can read bytes from it and do as you please with them. – BrenBarn Dec 28 '14 at 07:08
  • I think this post covers most of what you are asking: http://stackoverflow.com/questions/1035340/reading-binary-file-in-python – Ganesh Kamath - 'Code Frenzy' Dec 28 '14 at 07:08

1 Answer


I noticed that Python automatically converts bytes into string characters when I use the open('filename', 'rb') method

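That is incorrect. open('filename', 'rb').read(k) returns up to k bytes from the file (fewer only if the file ends first). The value is a bytes object: raw byte values with no decoding applied. What looks like string characters is just how Python displays a bytes object. You can print it in the binary ("01") format. See Convert Binary to ASCII and vice versa (Python).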
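As a rough illustration of reading the first k bytes and viewing them as bits, here is a minimal sketch. The helper names read_first_k and to_bits, the temporary demo file, and its contents are made up for this example; only open(), read(), and format() are standard Python.

```python
import os
import tempfile

def read_first_k(path, k):
    """Return up to k bytes from the start of the file at path."""
    with open(path, 'rb') as f:
        return f.read(k)

def to_bits(data):
    """Render a bytes object as 0s and 1s, 8 bits per byte, space-separated."""
    return ' '.join(format(byte, '08b') for byte in data)

# Hypothetical demo file so the sketch runs on its own.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'hi there')
    demo_path = tmp.name

first = read_first_k(demo_path, 2)
print(first)           # b'hi'
print(to_bits(first))  # 01101000 01101001
os.remove(demo_path)
```

Iterating over a bytes object yields integers in the range 0 to 255, so the "01" string is only a display format; the data read from the file is the same either way.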

You can write the read bytes object to a new file as is:

with open('output', 'wb') as f:
    f.write(bytes_object)

There is no conversion of any kind (no '\n' -> '\r\n', no decoding/encoding using a character encoding -- nothing).
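Putting the pieces together, the read / compress / write loop from the question could be sketched as below. This is only an illustration under assumptions: compress_chunk is a placeholder that uses zlib as a stand-in for the asker's own algorithm, and compress_file with its compress parameter is a hypothetical helper, not anything from the original post.

```python
import zlib

def compress_chunk(chunk):
    """Stand-in for the real algorithm (assumption): compress one chunk with zlib."""
    return zlib.compress(chunk)

def compress_file(src_path, dst_path, k, compress=compress_chunk):
    """Read src_path k bytes at a time, compress each chunk, append to dst_path.

    Returns the number of chunks processed.
    """
    chunks = 0
    with open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        while True:
            chunk = src.read(k)   # bytes object, at most k bytes long
            if not chunk:         # empty bytes means the file is exhausted
                break
            dst.write(compress(chunk))
            chunks += 1
    return chunks
```

The last chunk may be shorter than k bytes, so the algorithm has to accept short input. If each compressed chunk has a variable length, the output file will probably also need some framing (for example a length prefix per chunk) so that it can be split back into chunks for decompression; that part is left out here.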

jfs