
I'm working on a Python application that reads a byte from a binary file and uses it as an index into a list. This is essentially what I'm doing:

list = [x for x in range(0, 340)]

index = struct.unpack('<b', file.read(1))

print(list[index])

The problem is that in the file I'm reading from, the 8th bit is part of the value rather than a sign flag. For example, I want the byte 0b11111111 to read as 255, not as the negative number that it currently produces. I'm not familiar with the struct module, so I'm not sure whether I'm using it incorrectly or whether this is Python behavior that I need to work around.

  • Look at the [formatting chars](https://docs.python.org/2/library/struct.html#format-characters): `b` is signed, `B` is unsigned. Also, what's the point of having 340 elements in `list` when you only have 256 values to index into it with? – Jon Clements Jan 29 '15 at 02:53
  • Sorry, my example code's not great/full picture. Essentially there is a check earlier in the file, that if True read two bytes to get the index, rather than just 1. – Almost Surely Jan 29 '15 at 02:57
  • I'm also going to ask a noob question here: if I'm understanding you correctly, a signed char means that the 8th bit is a negative flag, whereas unsigned means it's not? – Almost Surely Jan 29 '15 at 02:58
  • @Almost: yes, basically. Signed chars are typically stored in two's complement, which gives a range of -128 to +127 (that is also the range of the struct module's `b` format). See [_Range of signed char_](http://stackoverflow.com/questions/3898688/range-of-signed-char). – martineau Jan 29 '15 at 03:02
  • Is the question in your post answered now? – martineau Jan 29 '15 at 03:06
  • This solved my issue. Definitely thanks. I'm essentially trying to read a file written with a C program, in Python, with no knowledge of C. So the help is definitely appreciated. Is there a way to mark this question as answered without actual answers? – Almost Surely Jan 29 '15 at 03:06
  • I'm voting to close this question as off-topic because OP's question was answered in comments. – martineau Jan 29 '15 at 03:09
  • JonClements or Chewie, do one of you want to put your comment as an answer so I can select it, thus giving someone credit? – Almost Surely Jan 29 '15 at 03:14
  • If you are doing this one byte at a time, it's simpler to just use `ord()`. – John La Rooy Jan 29 '15 at 05:06
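
To illustrate the comments above, here is a minimal sketch comparing the signed and unsigned readings of the same byte, along with single-byte alternatives to struct. It assumes Python 3, where indexing a bytes object returns an int; the variable names are only for illustration.

```python
import struct

raw = b"\xff"  # one byte with all eight bits set (0b11111111)

# 'b' interprets the byte as a signed char, 'B' as an unsigned char
signed = struct.unpack("<b", raw)[0]
unsigned = struct.unpack("<B", raw)[0]

# Single-byte alternatives that always give the unsigned value
by_index = raw[0]        # Python 3: indexing bytes yields an int
by_ord = ord(raw)        # ord() accepts a length-1 bytes object
by_from_bytes = int.from_bytes(raw, byteorder="little", signed=False)

print(signed, unsigned, by_index, by_ord, by_from_bytes)
```

For a single byte the byte order makes no difference, so the `<` prefix and the `byteorder` argument only matter once the index spans more than one byte.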

1 Answer


For an unsigned character, use the B format character rather than b (see the Format Characters section in the struct module documentation). Since unpack() always returns a tuple of values, even if the format string specifies only one, a [0] can be added to the end of the expression, as shown, to retrieve that first and only element.

>>> import struct
>>> index = struct.unpack('<B', bytearray([0b11111111,]))[0]
>>> print(index)
255
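
The OP mentioned in the comments that the index is sometimes stored in two bytes instead of one. A rough sketch of how both cases might be handled follows. The `read_index` helper and its `wide` flag are hypothetical names, and it assumes the two-byte form is an unsigned little-endian short (`<H`), which the question does not confirm. An `io.BytesIO` object stands in for the real file.

```python
import io
import struct


def read_index(stream, wide=False):
    """Read an unsigned index: two bytes if wide, otherwise one.

    Hypothetical helper; the two-byte layout (little-endian unsigned
    short) is an assumption about the file format.
    """
    fmt = "<H" if wide else "<B"
    size = struct.calcsize(fmt)
    data = stream.read(size)
    if len(data) != size:
        raise EOFError("unexpected end of file while reading index")
    return struct.unpack(fmt, data)[0]


table = list(range(340))

# Stand-in for the binary file: one 1-byte index, then one 2-byte index
stream = io.BytesIO(b"\xff\x53\x01")

first = read_index(stream)              # single byte 0xff
second = read_index(stream, wide=True)  # bytes 0x53 0x01, little-endian

print(table[first], table[second])
```

Checking the length of the data before unpacking avoids a less obvious struct.error if the file ends early.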
martineau
  • The output of `unpack` is a tuple. Rather than converting it to a list, you probably want to simply take the first element: `struct.unpack(...)[0]`. – Mark Ransom Jan 29 '15 at 04:27
  • @Mark: Good point, I thought I was just presenting the result in the same terms the OP code was doing, but now see they used `print([index])` which would have produced `[(255,)]` -- so I'll update my answer. – martineau Jan 29 '15 at 07:21