Simple question. Is there a way to read a file in the form of bits (and not bytes or text)? If not, is there a way to convert the bytes that I get from the Python IO (binary mode) into bits?

To add a little context, I am not just trying to read the bits, but also modify them in specific cases and write the modified ones to a new file. I can work with the bits being either in the form of a string or an integer.

EDIT: The data in the original file will be in the form of simple text. What I am trying to do is get the bits that make up each character and modify them to suit my needs.

martineau
Thsise Faek
  • Can't you read them as bytes, then manipulate them with bitwise operations? – Carcigenicate Feb 26 '15 at 19:39
  • It would be much easier for me if they were just bits. I have to work with all bits in the file, not just pairs or groups of them – Thsise Faek Feb 26 '15 at 19:41
  • Sorry, never done that. – Carcigenicate Feb 26 '15 at 19:42
  • Can you please add some more context as to what those bits represent? Data rarely occurs just as an unstructured stream of bits, but is usually packed in some way. If that's the case, you may want to look at the [`struct`](https://docs.python.org/2/library/struct.html) module. – Lukas Graf Feb 26 '15 at 19:51
  • Read in bytes, convert bytes into bits, manipulate bits, convert bits back to bytes, write to file. – Gillespie Feb 26 '15 at 20:01
  • That is what I am trying to do. What I am trying to figure out is how to convert the bytes into bits. – Thsise Faek Feb 26 '15 at 20:06
  • Sounds like the [`bitarray`](https://pypi.python.org/pypi/bitarray/) module is what you want. But depending on what exactly you want to do, you can also just represent a byte in binary by doing something like `bin(ord('A'))`. See [this answer](http://stackoverflow.com/a/25611913/1599111) on how to iterate over a file byte by byte. – Lukas Graf Feb 26 '15 at 20:10
  • Related: [Python - How can I change bytes in a file](http://stackoverflow.com/q/28520922/4279). The [mmap code example shows how you could modify 6th bit (little-endian) in 4th byte in every 2nd line in a file](http://stackoverflow.com/a/28522826/4279). – jfs Feb 26 '15 at 21:21
  • Do you want to add/remove bits (so that the position of all the following bytes has to change)? Does "each character" mean "each byte" in your case? If you want to support Unicode, then a byte (usually `0x100` distinct values) is not large enough to represent all Unicode code points (`sys.maxunicode`). What kind of modifications do you need? Could you provide an example input character -> result? – jfs Feb 26 '15 at 21:28
  • You may be able to use this [bitio](http://rosettacode.org/wiki/Bitwise_IO#Python) module. – martineau Feb 26 '15 at 21:37
  • The `ord` function converts a character into an integer, and `chr` does the reverse. Once you have an integer representation the modifications you speak of are trivial. – Mark Ransom Feb 26 '15 at 21:53
  • @MarkRansom: If OP has bytes *"from the Python IO (binary mode)"* then enumerating them produces `int`s on Python 3: `list(b'abc') == [97, 98, 99]`. And usually you want to batch (vectorize) operations e.g., [`xor`ing each bit could be done one byte at a time](http://stackoverflow.com/a/20570990/4279) and if it is not enough [there are faster solutions -- note: the fastest solutions xor 32 or 64 or even 128 bit at once](http://stackoverflow.com/q/2119761/4279) – jfs Feb 26 '15 at 22:58
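
The suggestion running through these comments (read bytes, then use bitwise operations) can be sketched as follows. This is only an illustration, assuming Python 3 and data already read in binary mode; the helper name `flip_bit` is hypothetical and not taken from any comment above.

```python
def flip_bit(data: bytes, bit: int) -> bytes:
    """Return a copy of data with one bit flipped in every byte.

    bit counts from 0 (least significant) to 7 (most significant).
    """
    if not 0 <= bit <= 7:
        raise ValueError("bit must be in the range 0..7")
    mask = 1 << bit
    # Iterating over a bytes object yields ints on Python 3,
    # so XOR can be applied to each byte directly.
    return bytes(b ^ mask for b in data)


original = b"abc"
modified = flip_bit(original, 5)  # bit 5 (0x20) toggles ASCII letter case
print(modified)
```

Because XOR with the same mask undoes itself, calling `flip_bit` twice with the same `bit` returns the original data.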

1 Answer

There is a Python package called `bitstring`. The manual is here: https://pythonhosted.org/bitstring/ and that page links to the package download. I have used it, and it can read bits from a file.
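
If installing a package is not an option, the same round trip can be approximated with the standard library alone. The following is a minimal sketch, not the `bitstring` API: it assumes Python 3, a file small enough to read into memory, and a string of `'0'`/`'1'` characters as the bit representation. The names `bytes_to_bits`, `bits_to_bytes`, and `rewrite_bits` are hypothetical helpers introduced for illustration.

```python
def bytes_to_bits(data: bytes) -> str:
    """Render each byte as 8 binary digits, most significant bit first."""
    return "".join(format(b, "08b") for b in data)


def bits_to_bytes(bits: str) -> bytes:
    """Pack a string of '0'/'1' characters back into bytes."""
    if len(bits) % 8:
        raise ValueError("bit string length must be a multiple of 8")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def rewrite_bits(src_path, dst_path, transform):
    """Read src_path as bits, apply transform (str -> str), write dst_path."""
    with open(src_path, "rb") as src:
        bits = bytes_to_bits(src.read())
    with open(dst_path, "wb") as dst:
        dst.write(bits_to_bytes(transform(bits)))


bits = bytes_to_bits(b"A")
print(bits)
print(bits_to_bytes(bits))
```

For example, `rewrite_bits("in.txt", "out.txt", lambda s: s[::-1])` would write the input's bits in reverse order; both file names are placeholders.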

Marichyasana