Python: Faster Universal Hashing function with built in libs

Question

I am trying to implement the universal hashing function with only base libs:

I am having issues because I am unable to run this in an effective time. I know % is slow so I have tried the following:

((a * x + b) % P) % n

divmod(divmod(a * x + b, P)[1], n)[1]

subeq = pow(a * x + b, 1, P)
hash = pow(subeq, 1, self.n)

All of these function are too slow for what I am trying to do. Is there a faster way to do mod division only using the base libs that I am unaware of?

Edit To elaborate, I will be running this function about 200000 times (or more) and I need for all 200000 runs to complete in under 4 seconds. None of these methods are even in that ball park (taking minutes)

How about using numpy? You could create a vector of x (and maybe a or b if necessary) and run a few 10000 (or more) calculations as vector arithmethic at once. — Michael Butscher, Sep 29 '17 at 03:13
How big are your numbers? Do you need arbitrary precision, or will they all fit in 32 bit or 64 bit machine integers? FWIW, Python function calls are relatively slow - they have more overhead than simple C function calls. Many standard functions are coded in C, so they're faster than calls to code implemented in Python, but even so, code that calls functions is generally slower than equivalent code that just uses operators. IOW, `pow` and `divmod` aren't faster than using `**` or `%`. — PM 2Ring, Sep 29 '17 at 03:30
Millions is fine. Signed 32 bit integers can handle `-2147483648 <= x <= 2147483647`. It's a shame you can't use Numpy. I haven't done any tests yet, but I suspect it will give you the needed speed if you can process all your x values in one go using a Numpy array. — PM 2Ring, Sep 29 '17 at 03:36
I don't understand why your code is taking minutes. With `a, b = 1103515245, 12345; p = (1<<31) - 1; m = 1000000` I can hash 200000 random numbers in `range(2147483647)` in around 0.4 seconds, on my old 2GHz 32 bit machine. That's using `((a*x + b) % p) % m`; if I put that code into a function, it's around 0.1 seconds slower. — PM 2Ring, Sep 29 '17 at 03:54
FWIW, using Numpy I can hash an array of 200000 numbers in that range in under 0.05 seconds (using signed 64 bit arithmetic). — PM 2Ring, Sep 29 '17 at 04:08
I am implementing a filter. For each item in the filter, I will add up to j hash values, then store them in an array. then, to search the filter, I will recalculate those hash values and see if they are contained within the filter array. So really, this is being run 2(200000*k) times where k can be around 10 - 20 — Soatl, Sep 29 '17 at 04:54
How many different values are used for p and m? If there aren't many, maybe some sort of lookup-table (as `list` or `array.array`) might help, but this is not sure, sometimes such tables are even slower. — Michael Butscher, Sep 29 '17 at 11:23
Ok, that sounds like a [Bloom filter](https://en.wikipedia.org/wiki/Bloom_filter). Is there some reason you can just use a set for this? It's bound to be faster than a Bloom filter implemented in pure Python. But anyway, I can't get the speed you want with 20 hash functions (which gives a false positive probability of around `10**-6`) and 200k data items (in `range(2147483647)`, as in my previous tests). — PM 2Ring, Sep 29 '17 at 14:40
Using an `array.array` of unsigned bytes for the filter, it takes around 22 seconds to add the items to the filter array, and 16 seconds to test that all those items are present. Testing 10000000 randomly generated items that aren't in the original data set takes around 200 seconds (that includes the time needed to test and reject items that are in the original data set using standard Python set operations). I can post my code as an answer, if you're interested. — PM 2Ring, Sep 29 '17 at 14:41
The issue was a silly mistake in how I declare P and n. Seems good now that I do not find the primes over and over again even though they never change — Soatl, Sep 29 '17 at 20:14

score 2 · Accepted Answer · answered Sep 29 '17 at 03:09

You're not going to do better than ((a * x + b) % P) % m in pure Python code; the overhead of the Python interpreter is going to bottleneck you more than anything else; yes, if you ensure the m is a power of two, you can precompute mm1 = m - 1 and change the computation to ((a * x + b) % P) & mm1, replacing a more expensive remaindering operation with a cheaper bitmasking operation, but unless P is huge (hundreds of bits minimum), the interpreter overhead will likely outweigh the differences between remainder and bitmasking.

If you really need the performance, and the types you're working with will fit in C level primitive type, you may benefit from writing a Python C extension that converts all the values to size_t, Py_hash_t, uint64_t, or whatever suits your problem and performs the math as a set of bulk conversions to C types, C level math, then a single conversion back to Python int, saving a bunch of byte code and intermediate values (that are promptly tossed).

If the values are too large to fit in C primitives, GMP types are an option (look at mpz_import and mpz_export for efficient conversions from PyLong to mpz_t and back), but the odds of seeing big savings go down; GMP does math faster in general, and can mutate numbers in place rather than creating and destroying lots of temporaries, but even with mpz_import and mpz_export, the cost of converting between Python and GMP types would likely eat most of the savings.

score 2 · Answer 2 · answered Oct 07 '20 at 21:12

from math import ceil, log2
from primesieve import nth_prime #will get nth prime number [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
from random import randint

class UniversalHashing:
  """ N = #bins
      p = prime number st: p >= N 
    nth_prime(1, 1 << max(32, ceil(log2(N))))
    nth_prime(1,1<<max(32),ceil(log2(2)))))
    nth_prime(1,2**32)
    nth_prime(1,4294967296)
    =4294967311   
    assert:- Returns Error if condition not satisfied 
    << operatior:-  multiply with 2the power like 2<<2 =2*2'2=8 or 7*2'3=56 and ceil will give the exact value or next vlue ceil(1)=1 , ceil(1.1)=2
    randint:- Return a random integer N such that a <= N <= b. Alias for randrange(a, b+1). """
  def __init__(self, N, p = None):
    self.N = N
    if p is None:
      p = nth_prime(1, 1 << max(32, ceil(log2(N))))  
    assert p >= N, 'Prime number p should be at least N!'
    self.p = p

  def draw(self):
    a = randint(1, self.p - 1)
    b = randint(0, self.p - 1)
    return lambda x: ((a * x + b) % self.p) % self.N

if __name__ == '__main__':
  N = 50       #bins
  n = 100000   #elements
  H = UniversalHashing(N)
  h = H.draw()

  T = [0] * N
  for _ in range(n):
    x = randint(0, n * 10)
    T[h(x)] += 1

  for i in range(len(T)):
    print(T[i] / n)    # This should be approximately equal

Python: Faster Universal Hashing function with built in libs

2 Answers2

Linked