16

I was trying to port a function from C to Python and to make it easy to debug, I'd prefer it performed the same CPU word-size limited operations so I could compare the intermediate results. In other words, I'd like something like:

a = UnsignedBoundedInt(32, 399999)
b = UnsignedBoundedInt(32, 399999)
print(a*b) # prints 1085410049 (159999200001 % 2**32)

What's the best way to achieve this so that all operations (including bitwise shifts) would work as in C?

d33tah
  • 10,999
  • 13
  • 68
  • 158

3 Answers3

24

You can try using ctypes.uint_32 to bound the results for you:

>>> import ctypes
>>> print ctypes.c_uint32(399999 * 399999).value
1085410049

Alternatively you can use numpy's data types:

>>> import numpy as np
>>> a = np.uint32(399999)
>>> b = np.uint32(399999)
>>> a * b
__main__:1: RuntimeWarning: overflow encountered in uint_scalars
1085410049
Claudiu
  • 224,032
  • 165
  • 485
  • 680
  • v1 -= ((v0 << 4 ^ v0 >> 5) + v0) ^ (sum + k[sum>>11 & 3]) TypeError: unsupported operand type(s) for <<: 'c_uint' and 'int' – d33tah Oct 07 '13 at 19:47
  • And: >>> type(numpy.uint32(3) << 4) – d33tah Oct 07 '13 at 19:54
  • TypeError: unsupported operand type(s) for <<: 'c_uint' and 'c_uint' – d33tah Oct 07 '13 at 19:57
  • So, looks like it's not exactly the best solution. – d33tah Oct 07 '13 at 19:57
  • @d33tah You forgot about the `.value` in the `ctypes` case. Besides, you should put this "clipping operation" one level towards the outside. That is, if you have `clip_u32 = lambda val: ctypes.c_uint32(val).value`, you could do `v1 -= clip_u32((v0 << 4 ^ v0 >> 5) + v0) ^ clip_u32(sum + k[sum>>11 & 3])`. – glglgl Oct 07 '13 at 20:36
  • @d33tah: With the `ctypes` approach you have to always convert to py ints and convert back for each op. Not the nicest approach, true. `numpy` does more what you want. Alternatively you can subclass int yourself and override all the ops to do the ctypes conversion & back – Claudiu Oct 07 '13 at 22:35
3

Here's an interesting solution, though it only works under Python 2:

class U32:
    """Emulates 32-bit unsigned int known from C programming language."""

    def __init__(self, num=0, base=None):
        """Creates the U32 object.

        Args:
            num: the integer/string to use as the initial state
            base: the base of the integer use if the num given was a string
        """
        if base is None:
            self.int_ = int(num) % 2**32
        else:
            self.int_ = int(num, base) % 2**32

    def __coerce__(self, ignored):
        return None

    def __str__(self):
        return "<U32 instance at 0x%x, int=%d>" % (id(self), self.int_)

    def __getattr__(self, attribute_name):
        print("getattr called, attribute_name=%s" % attribute_name)
        # you might want to take a look here:
        # https://stackoverflow.com/q/19611001/1091116
        r = getattr(self.int_, attribute_name)
        if callable(r):  # return a wrapper if integer's function was requested
            def f(*args, **kwargs):
                if args and isinstance(args[0], U32):
                    args = (args[0].int_, ) + args[1:]
                ret = r(*args, **kwargs)
                if ret is NotImplemented:
                    return ret
                if attribute_name in ['__str__', '__repr__', '__index__']:
                    return ret
                ret %= 2**32
                return U32(ret)
            return f
        return r

print(U32(4) / 2)
print(4 / U32(2))
print(U32(4) / U32(2))

For Python 3 compatibility, have a look here.

Community
  • 1
  • 1
d33tah
  • 10,999
  • 13
  • 68
  • 158
1

Acheiving this using the arbitrary precision integers in Python

Certain math operations are safe to chain where you can delay computing the modulus until later. Examples of this is +, -, *, << (right shift).

((a%c) + (b%c)) % c == (a + b) % c
((a%c) - (b%c)) % c == (a - b) % c
((a%c) * (b%c)) % c == (a * b) % c
((a%c) << (b%c)) % c == (a << b) % c

Others are not, like /, //, % (modulo) and >> (left shift).

Also note that when our modulo is a power of 2, we can replace the operation with a bitwise AND operation - which essentially zeros out the binary digits you don't want.

x % (2**n) == x & (2**n - 1)

So for 32 bit operations, that becomes:

x % (2**32) == x & 0xFFFFFFFF

That means in order to match the result of u32-operations in C/numpy, you could perform x & 0xFFFFFFFF for every intermediate result of your computation. You can temporarily skip this for certain operations as highlighted above.

This doesn't require any dependencies, and is less verbose than using ctypes.c_uint32(...).value for every intermediate result.


When p == 2**n, certain bitwise operations also make it safe to delay the modulo operation.

((a%p) & (b%p)) % p == (a & b) % p  # BITWISE AND
((a%p) | (b%p)) % p == (a | b) % p  # BITWISE OR
((a%p) ^ (b%p)) % p == (a ^ b) % p  # BITWISE XOR
Tobias Bergkvist
  • 1,751
  • 16
  • 20