2

I need to calculate a checksum for a hex serial word string using XOR. To my (limited) knowledge this has to be performed using the bitwise operator ^. Also, the data has to be converted to binary integer form. Below is my rudimentary code - but the checksum it calculates is 1000831. It should be 01001110 or 47hex. I think the error may be due to missing the leading zeros. All the formatting I've tried to add the leading zeros turns the binary integers back into strings. I appreciate any suggestions.

    word = ('010900004f')

    #divide word into 5 separate bytes
    wd1 = word[0:2] 
    wd2 = word[2:4]
    wd3 = word[4:6]
    wd4 = word[6:8]
    wd5 = word[8:10]

    #this converts a hex string to a binary string
    wd1bs = bin(int(wd1, 16))[2:] 
    wd2bs = bin(int(wd2, 16))[2:]
    wd3bs = bin(int(wd3, 16))[2:]
    wd4bs = bin(int(wd4, 16))[2:]
    wd5bs = bin(int(wd5, 16))[2:]

    #this converts binary string to binary integer
    wd1i = int(wd1bs)
    wd2i = int(wd2bs)
    wd3i = int(wd3bs)
    wd4i = int(wd4bs)
    wd5i = int(wd5bs)

    #now that I have binary integers, I can use the XOR bitwise operator to cal cksum
    checksum = (wd1i ^ wd2i ^ wd3i ^ wd4i ^ wd5i)

    #I should get 47 hex as the checksum
    print (checksum, type(checksum))
user3284986
  • I think this has been addressed before [see this question](http://stackoverflow.com/questions/16926130/python-convert-to-binary-and-keep-leading-zeros) – PyNEwbie Mar 30 '14 at 01:55
  • 0x47 != 0b1001110. Very few odd numbers end in `0` in their binary representation. – Hyperboreus Mar 30 '14 at 02:01
  • @PyNEwbie That is true, but here we are facing an XY-problem par excellence. – Hyperboreus Mar 30 '14 at 02:07
  • What is a "binary integer"? Why are you interpreting digit strings in base 2 as if they were in base 10? I think you're getting the numbers and their representations mixed up, which is why you're going on this binary detour. – DSM Mar 30 '14 at 02:11
  • Sorry for the typo - indeed I expect 01000111. – user3284986 Mar 30 '14 at 02:28
  • @DSM - a binary integer is an integer expressed in base 2. The native data arrives as a hex string. You can't use a boolean XOR to calculate this kind of checksum. Hyperboreus had to use the bitwise operator as well in his very refined solution. – user3284986 Mar 30 '14 at 02:53
  • @PYNEwbie - Thanks - I did read that question prior and I tried using them, but being new to Python, it didn't appear to answer my question as my understanding of the format spec has me believing it only achieves string outputs. – user3284986 Mar 30 '14 at 03:07
  • @user3284986 I always find it practical to distinguish between the "representation" of a number and its "value". `0x2a`, `0b101010` and `42` all have the same value. But the value 42 can be represented as `0x2a`, `0b101010` or `42`. An integer is not binary, or decimal, or hexadecimal, ternary, unary or gray-coded: an integer is an integer, i.e. an element of Z. Its representation can be binary, decimal, etc, pp. – Hyperboreus Mar 30 '14 at 03:29
  • @user3284986: an integer isn't binary; it's just a number. In your line `wd1i = int(wd1bs)`, you interpret a binary *representation* of a number as if it were a decimal representation of a number: `int('10')` gives 10 (base-10), not 2, which isn't what you want. And this mistake was entirely unnecessary, because `^` acting on two integers will give you the xor you're looking for without any of these detours. – DSM Mar 30 '14 at 03:55
  • Thanks for all the help and advice. I will refrain from using the term "binary integer". I'm obviously new to Python - so in python and programming in general, what do you call a number that is an integer whose value is being represented in binary format within your program? – user3284986 Mar 30 '14 at 05:53

3 Answers

5

Why use all these conversions and the costly string functions?

(I will answer the X part of your XY-Problem, not the Y part.)

def checksum (s):
    v = int (s, 16)
    checksum = 0
    while v:
        checksum ^= v & 0xff
        v >>= 8
    return checksum

cs = checksum ('010900004f')
print (cs, bin (cs), hex (cs) )

The result is 0x47, as expected. By the way, 0x47 is 0b1000111 and not, as stated, 0b1001110.
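On the leading-zeros worry from the question: the zeros belong to the printed representation only, not to the integer, so they can be added at output time. A minimal sketch, assuming Python 3 and reusing the checksum value 0x47 from above (the variable names are just for illustration):

```python
cs = 0x47  # the checksum value computed above

# Formatting produces strings; the integer itself is unchanged.
as_bits = format(cs, '08b')   # zero-padded to 8 binary digits
as_hex = format(cs, '02x')    # zero-padded to 2 hex digits

print(as_bits, as_hex)

# Parsing the strings back with an explicit base recovers the same integer.
assert int(as_bits, 2) == cs == int(as_hex, 16)
```

So the XOR can be done on plain integers, and the padding only matters when the result is displayed or sent back as text.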

Hyperboreus
1

Just modify it like this.

before:

wd1i = int(wd1bs)
wd2i = int(wd2bs)
wd3i = int(wd3bs)
wd4i = int(wd4bs)
wd5i = int(wd5bs)

after:

wd1i = int(wd1bs, 2)
wd2i = int(wd2bs, 2)
wd3i = int(wd3bs, 2)
wd4i = int(wd4bs, 2)
wd5i = int(wd5bs, 2)

Why doesn't your code work?

Because int(wd1bs) does not behave the way you expect. See the documentation for int. By default, Python's int function interprets its string argument as base 10, but you expect it to treat the argument as base 2. So you need to write int(wd1bs, 2).
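To see the difference concretely, here is a small sketch (Python 3 assumed; `bits`, `as_decimal` and `as_binary` are just illustrative names) using the last byte of the word, 0x4f:

```python
bits = bin(int('4f', 16))[2:]   # a string of binary digits for 0x4f

as_decimal = int(bits)      # digits read as base 10 (the default)
as_binary = int(bits, 2)    # digits read as base 2

print(bits, as_decimal, as_binary)
```

Only the second call gives back the value of the original byte; the first one treats the digit string as a seven-digit decimal number.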


Alternatively, you can rewrite the whole thing like this, so that you don't need the bin function at all. This code is basically the same as @Hyperboreus's answer. :)

w = int('010900004f', 16)
w1 = (0xff00000000 & w) >> 4*8
w2 = (0x00ff000000 & w) >> 3*8
w3 = (0x0000ff0000 & w) >> 2*8
w4 = (0x000000ff00 & w) >> 1*8
w5 = (0x00000000ff & w)

checksum = w1 ^ w2 ^ w3 ^ w4 ^ w5

print(hex(checksum))
# 0x47

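And this is a shorter one.

import binascii
from functools import reduce  # reduce is no longer a builtin in Python 3

word = '010900004f'
# bytearray yields integers on both Python 2 and Python 3, so ord() is not needed
print(hex(reduce(lambda a, b: a ^ b, bytearray(binascii.unhexlify(word)))))
# 0x47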
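If only Python 3 needs to be supported, the same idea might be written without binascii. This is a sketch under that assumption, not part of the original answer; it relies on `bytes.fromhex` yielding one integer per byte and on `operator.xor` being the function form of `^`:

```python
from functools import reduce
from operator import xor

word = '010900004f'

# bytes.fromhex turns each pair of hex digits into one byte;
# iterating over a bytes object yields integers in Python 3.
checksum = reduce(xor, bytes.fromhex(word), 0)

print(hex(checksum))
```

The initial value 0 makes the reduction well defined even for an empty string.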
Kei Minagawa
  • All your lines `wx = (0x....` can be written as `wX = (w >> Y*8) & 0xff`. Just shift first and mask after, then it is always `0xff`. – Hyperboreus Mar 30 '14 at 03:22
  • @Hyperboreus: Ah, this is a smarter way, thanks. My code is always very verbose... :) – Kei Minagawa Mar 30 '14 at 03:37
  • @user3284986: I added an explanation of what's wrong in your code. – Kei Minagawa Mar 30 '14 at 04:06
  • @user2931409 - Many thanks to you. You are spot on about my incorrect assumption regarding the base 10 default. I don't mind verbosity if it aids readability - which is what a python newbie like myself needs . . . . But eventually, we all want to type as little as possible :-) – user3284986 Mar 31 '14 at 23:54
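The shift-then-mask suggestion from the comments could look like the following sketch. It assumes Python 3 and a word of five bytes, as in the question; `xor_checksum` and `n_bytes` are hypothetical names, not something from the original posts.

```python
def xor_checksum(hex_word, n_bytes=5):
    """XOR together the n_bytes low-order bytes of a hex string."""
    w = int(hex_word, 16)
    result = 0
    for i in range(n_bytes):
        # Shift the wanted byte down to the low position, then mask it.
        result ^= (w >> (8 * i)) & 0xff
    return result

print(hex(xor_checksum('010900004f')))
```

Because the shift happens first, the mask is always 0xff and no per-byte mask constants are needed.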
1
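from functools import reduce  # needed on Python 3

s = '010900004f'
b = int(s, 16)
# XOR the len(s) // 2 bytes of b, taking each one by shifting and masking
print(reduce(lambda x, y: x ^ y, ((b >> 8*i) & 0xff for i in range(len(s) // 2)), 0))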
Marat