I read a binary file and get an array of characters. To convert two bytes to an integer I compute 256*ord(p1) + ord(p0). This works fine for positive integers, but not for negative numbers. I know it has something to do with the first bit of the most significant byte, but I have not been able to make it work.

I also understand there is a module called struct, and after reading about it I ended up with the following code:

import struct

p1 = chr(231)
p0 = chr(174)

a = struct.unpack('h',p0+p1)

print str(a)

a becomes -6226, and if I swap p0 and p1 I get -20761.

a should have been -2.

N.N.
user764640

6 Answers

-2 is not correct for the values you have specified, and byte order matters. struct uses > for big-endian (most-significant byte first) and < for little-endian (least-significant byte first):

>>> import struct
>>> struct.pack('>h',-2)
'\xff\xfe'
>>> struct.pack('<h',-2)
'\xfe\xff'
>>> p1=chr(254) # 0xFE
>>> p0=chr(255) # 0xFF
>>> struct.unpack('<h',p1+p0)[0]
-2
>>> struct.unpack('>h',p0+p1)[0]
-2
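
The same comparison can be sketched for Python 3, where binary data is a bytes object rather than a str. This is only an illustration using the byte values from the question (174 and 231); the variable names are made up for the example.

```python
import struct

# Bytes from the question: p0 = 174 (0xAE), p1 = 231 (0xE7).
data = bytes([174, 231])

# '<h' reads a signed 16-bit integer, least-significant byte first.
little = struct.unpack('<h', data)[0]

# '>h' reads the same two bytes most-significant byte first.
big = struct.unpack('>h', data)[0]

print(little, big)  # -6226 -20761
```

Neither byte order yields -2 for these two bytes, which is why the expected value in the question cannot be right.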
Mark Tolonen

Generally, when using struct, your format string should start with one of the byte-order specifiers (@, =, <, > or !). The default, native byte order differs from machine to machine.

Therefore, the correct result, reading the bytes in network (big-endian) order, is:

>>> struct.unpack('!h',p0+p1)[0]
-20761

The representation of -2 in big-endian byte order is:

1111 1111 1111 1110 # binary
  255        254    # decimal bytes
  f   f    f    e   # hexadecimal digits

You can easily verify that by adding two, which results in 0 once the carry out of the 16th bit is discarded.
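
That check can be sketched in a few lines of Python 3. The names WORD_MASK, pattern, wrapped_sum and signed_value are made up for this illustration; the only assumption is a 16-bit word.

```python
WORD_MASK = 0xFFFF  # keep only the low 16 bits

pattern = 0b1111111111111110  # 0xFFFE, the bit pattern shown above

# Adding two overflows the 16-bit word; masking drops the carry.
wrapped_sum = (pattern + 2) & WORD_MASK

# Reinterpret the same 16 bits as a signed big-endian value.
signed_value = int.from_bytes(pattern.to_bytes(2, 'big'), 'big', signed=True)

print(hex(pattern), wrapped_sum, signed_value)  # 0xfffe 0 -2
```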

Community
phihag

With the first method (256*ord(p1) + ord(p0)), you could check whether the sign bit is set with if ord(p1) & 0x80. If it is, use ord(p1) & 0x7f instead of ord(p1) and subtract 32768 from the end result.

Toomai

Your original equation will work fine if you use masking to take off the extra 1 bits in a negative number:

256*(ord(p1) & 0xff) + (ord(p0) & 0xff)

Edit: I think I might have misunderstood your question. You're trying to convert two positive byte values into a negative integer? This should work:

a = 256*ord(p1) + ord(p0)
if a > 32767: # 0x7fff
    a -= 65536 # 0x10000
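
As a rough sketch, the same correction can be wrapped in a small helper. The function name to_signed16 and its parameters are hypothetical, and it takes the byte values as integers (what ord returns) rather than characters.

```python
def to_signed16(high, low):
    """Combine two byte values (0-255) into a signed 16-bit integer."""
    value = 256 * high + low      # unsigned result, 0..65535
    if value > 0x7FFF:            # sign bit is set
        value -= 0x10000          # shift into the range -32768..-1
    return value

print(to_signed16(0xFF, 0xFE))  # -2
print(to_signed16(231, 174))    # -6226
print(to_signed16(0x12, 0x34))  # 4660
```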
Mark Ransom
  • Since those are already 8-bit quantities, masking off the lower 8-bits doesn't do anything. – kindall Oct 18 '11 at 18:58
  • To mask the sign bit, you would use `& 0x7F`, no? –  Oct 18 '11 at 19:12
  • @kindall, the original question speaks of negative numbers. Negative numbers are sign-extended so the upper bits contain `1` instead of `0`, thus masking them becomes necessary. I'm still not quite sure how `ord` returns a negative number though. – Mark Ransom Oct 18 '11 at 19:13
  • @Colt45, if you use `& 0x7f` you will lose the 8th bit. You don't want to remove the sign bit, you want to remove its extensions to the left. – Mark Ransom Oct 18 '11 at 19:14
  • 'ord()' doesn't return a negative number -- the two-byte numbers he's trying to decode are negative. – kindall Oct 18 '11 at 19:20

For the record, you can do it without struct. Your original equation can be used, but if the result is greater than 32767, subtract 65536. (Or if the high-order byte is greater than 127, which is the same thing.) Look up two's complement, which is how all modern computers represent negative integers.

p1 = chr(231)
p0 = chr(174)

a = 256 * ord(p1) + ord(p0) - (65536 if ord(p1) > 127 else 0)

This gets you the correct answer of -6226. (The correct answer is not -2.)
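
In Python 3 there is another way to do this without struct: the built-in int.from_bytes can interpret the bytes as a signed value directly. A minimal sketch, assuming the data is a bytes object with the low byte first as in the question:

```python
# p0 = 174 and p1 = 231 from the question, low byte first.
data = bytes([174, 231])

# signed=True applies the two's complement interpretation.
value = int.from_bytes(data, byteorder='little', signed=True)
print(value)  # -6226

# The bytes that actually encode -2 in little-endian order:
minus_two = int.from_bytes(bytes([0xFE, 0xFF]), byteorder='little', signed=True)
print(minus_two)  # -2
```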

kindall

If you are converting values from a large file, use the array module.

For a file, it is the endianness of the file format that matters, not the endianness of the machine that either wrote it or is reading it.
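
A minimal Python 3 sketch of that approach follows. The sample bytes and the FILE_BYTE_ORDER constant are assumptions for illustration; in practice the byte order comes from the file format's specification and the bytes from the file itself.

```python
import array
import sys

FILE_BYTE_ORDER = 'little'  # assumption: defined by the file format

# Stand-in for data read from a file: three 16-bit values, low byte first.
raw = bytes([0xAE, 0xE7, 0xFE, 0xFF, 0x34, 0x12])

values = array.array('h')  # 'h' is a signed short
if values.itemsize != 2:
    raise RuntimeError("this sketch assumes a 2-byte signed short")

values.frombytes(raw)  # interprets the bytes in the machine's native order

# Swap only when the machine's order differs from the file's order.
if sys.byteorder != FILE_BYTE_ORDER:
    values.byteswap()

print(values.tolist())  # [-6226, -2, 4660]
```

Converting the whole buffer in one call avoids a Python-level loop over every pair of bytes.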

Alex Martelli, of course, has the definitive answer.

Community