3

The goal is to be able to do:

struct.pack('!H', x)

But if x is greater than 65535 then this will fail for obvious reasons.
I'm not a wizard with bit manipulation but I've understood that << operations will loose any shifted bits. But I have no idea how to extract one or more carrying bits from the binary/bytes string and add the carry value to the set of two bytes trailing the carrying bytes.

And I need to traverse the bytes string by 2 bytes each time and sum them up against the previous two bytes. Occasionally this will generate a value greater than 65535 - this is where I need to extract the carry bit from the result - and perform a + operation on the result with the carry bit itself (see picture below).

This is what I'm trying to accomplish: (In this case the carrying bit were only one, and the 2 trailing bytes would get +1 as a result.) enter image description here

And this is what I've got so far:

from struct import unpack, pack
from binascii import *

even_bytes_string = b'\x45\x00\x00\x3c\x1c\x46\x40\x00\x40\x06\xac\x10\x0a\x63\x‌​ac\x10\x0a\x0c'
result = None
for i in range(0, len(even_bytes_string)-1,2):
    if not result:
        result = unpack('!H', even_bytes_string[i:i+2])[0]
        continue

    result += unpack('!H', even_bytes_string[i:i+2])[0]
    if result > 65535:
        # Got a carry bit.
        pass

    print(result)
    print(pack('!H', result))

I literally have no idea how to do this quite simple task, without converting the result of a add operation into actual binary representation in string format (11001...). And then do string manipulation s = s[-16:]+s[:-16] (oversimplified), finally converting it back to a set of 2 bytes. It doesn't seam practical, fast or very "correct".

I'm hoping one of you know your way around bit manipulation in Python that could tell me what the proper way of doing this is. There has to be some.

A slightly more confusing picture of what I'm trying to accomplish (of keeping result at 2 bytes, cutting away any carrying bits and adding them to result as a "separate" value.):enter image description here

Torxed
  • 22,866
  • 14
  • 82
  • 131
  • what do you expect if the last bit is 1 and not 0? – Tryph May 05 '17 at 16:23
  • @Tryph I just realized I'm simply supposed to add the carrying bits (might be multiple) to the two sets of bytes. So my question needs an update. – Torxed May 05 '17 at 16:29
  • Maybe I could use `carry = struct.pack('!I', result)[:2]`? Again, seams kind of impractical but would be better than converting to a binary representation and doing string manipulation! – Torxed May 05 '17 at 16:35
  • Something in your code reminds me of UTF-8 encoding algorithm. – GIZ May 05 '17 at 16:37
  • Am finding it very hard to follow the description of the desired operation or how it is motivated by the very simple goal of performing `struct.pack('!H',x)`. Perhaps give multi-bytes/multi-word example inputs and outputs? – jez May 05 '17 at 16:41
  • In Python the `<<` operator does /not/ loose any shifted bits, as numbers in Python are not of fixed size. – Torkel Bjørnson-Langen May 05 '17 at 16:42
  • @jez, it's for calculating a network header. http://www.thegeekstuff.com/2012/05/ip-header-checksum/ - check the "Now lets add these binary values one by one" section. – Torxed May 05 '17 at 16:42
  • @TorkelBjørnson-Langen Did i missunderstand this SO post then: http://stackoverflow.com/questions/141525/what-are-bitwise-shift-bit-shift-operators-and-how-do-they-work `Please note that these are not circular shifts. The digit that gets shifted "off the end" is lost. It does not wrap around.` – Torxed May 05 '17 at 16:46
  • 1
    @Torxed First add the two numbers in Python (integer precision will expand to fit as necessary). Then, on the page you cite it looks like the examples are doing no more than `(x & 0xFF) + (x >> 16)`. – jez May 05 '17 at 16:53
  • @jez The string in the cited link would be `b'\x45\x00\x00\x3c\x1c\x46\x40\x00\x40\x06\xac\x10\x0a\x63\xac\x10\x0a\x0c'` where in your example `(x & 0xFF)` I'm assuming would be the value without the carrying bit? Which results in `153` when the first carry bit occurs in the string above, and `(result >> 16)` gets the carrying bit by the looks of it since it results in `1`, that works. But the expected value when a carry bit occurs should be `36248` I *think* (note: on the first occurrence of a carry bit.). – Torxed May 05 '17 at 17:05
  • @Torxed sorry I meant `0xFFFF` (see answer) – jez May 05 '17 at 17:26
  • @Torxed Yes, probably. That post does not mention Python. – Torkel Bjørnson-Langen May 05 '17 at 19:48

2 Answers2

2

The process is just (x & 0xFFFF) + (x >> 16). The << operation does not need to be involved. Here's an implementation of the worked example you cite:

def addwrap16( a, b ):
    c = a + b
    w = ( c & 0xFFFF ) + ( c >> 16 )
    print(' {:04x} ->  {:016b}'.format(a, a))
    print(' {:04x} ->  {:016b}'.format(b, b))
    print('{:05x} -> {:017b}'.format(c, c))
    print(' {:04x} ->  {:016b}'.format(w, w))
    print('')
    return w

import struct, functools
even_bytes_string = b'\x45\x00\x00\x3c\x1c\x46\x40\x00\x40\x06\xac\x10\x0a\x63\xac\x10\x0a\x0c'
vals = struct.unpack( '!' + 'H' * ( len( even_bytes_string ) // 2 ), even_bytes_string )
result = functools.reduce(addwrap16, vals)

which spits out the following:

 4500 ->  0100010100000000
 003c ->  0000000000111100
0453c -> 00100010100111100
 453c ->  0100010100111100

 453c ->  0100010100111100
 1c46 ->  0001110001000110
06182 -> 00110000110000010
 6182 ->  0110000110000010

 6182 ->  0110000110000010
 4000 ->  0100000000000000
0a182 -> 01010000110000010
 a182 ->  1010000110000010

 a182 ->  1010000110000010
 4006 ->  0100000000000110
0e188 -> 01110000110001000
 e188 ->  1110000110001000

 e188 ->  1110000110001000
 ac10 ->  1010110000010000
18d98 -> 11000110110011000
 8d99 ->  1000110110011001

 8d99 ->  1000110110011001
 0a63 ->  0000101001100011
097fc -> 01001011111111100
 97fc ->  1001011111111100

 97fc ->  1001011111111100
 ac10 ->  1010110000010000
1440c -> 10100010000001100
 440d ->  0100010000001101

 440d ->  0100010000001101
 0a0c ->  0000101000001100
04e19 -> 00100111000011001
 4e19 ->  0100111000011001
jez
  • 14,867
  • 5
  • 37
  • 64
  • The `<<` was just an example of bitshifting I was hoping could come in handy, but quickly found out it didn't. Neither would `>>` or `|` operations. This appears to do the trick! :) – Torxed May 05 '17 at 17:35
  • Manually following the values i expected by doing math in my head, using my solution and yours - they all come to the same conclusion. And it appears to be a working network checksum header when looking in tcpdump, so I'm more than pleased with this more efficient solution! Big thanks for the answer, I even see the logic in why this is working :) – Torxed May 05 '17 at 18:10
0

Ok so this is not the most elegant solution, and not sure it's actually accurate but it doesn't break struct.pack() at least.

Based on a simple question by @Tryph, this is what I came up with as a work-around:

from struct import unpack, pack
from binascii import *

even_bytes_string = b'\x45\x00\x00\x3c\x1c\x46\x40\x00\x40\x06\xac\x10\x0a\x63\x‌​ac\x10\x0a\x0c'
result = None
for i in range(0, len(even_bytes_string)-1,2):
    if not result:
        result = unpack('!H', even_bytes_string[i:i+2])[0]
        continue

    result += unpack('!H', even_bytes_string[i:i+2])[0]
    if result > 65535:
        tmp = pack('!I', result)
        carry = unpack('!H', tmp[:2])[0]
        result = unpack('!H', tmp[2:])[0]+carry

    print(result)
    print(pack('!H', result))

Simply treats the bigger number as a Int instead of Short, this enables me to get the two prepending bytes as carry, and add those ontop of the trailing two bytes. It's not elegant, but probably works?

Torxed
  • 22,866
  • 14
  • 82
  • 131