4

I need to count the number of trailing and leading zeros in a numpy uint64 variable, so right now I'm doing it like this:

# n > 0
n = np.uint64(100)
s = np.binary_repr(n)
trail_zeros = len(s) - len(s.rstrip('0'))
lead_zeros = 64 - len(s)

Is there a better way of doing this, without using strings? The priority is speed. Thank you!

Fernando
  • 7,785
  • 6
  • 49
  • 81

3 Answers3

2

For numbers in [0,2**63), we can use some arithmetic operations to get the leading and trailing zeros in their binary formats and hence skip the string manipulations -

def get_leading_trailing_zeros(n):
    a = (2**np.arange(64) & n)
    lead_zeros = 64-a.argmax()-1
    if n==0:
        trail_zeros = 1
    else:
        trail_zeros = (a==0).argmin()
    return lead_zeros,trail_zeros
Divakar
  • 218,885
  • 19
  • 262
  • 358
1

I am not sure about the speed of the following code. But you can certainly do it like this, without using strings.

n = np.uint64(100)
i=1
while((n>>np.uint64(i))%2==0):
    i+=1

trail_zeros=i

You right shift the value n until you get an odd number. Number of right shifts done is equal to trail_zeros.

Achintha Ihalage
  • 2,310
  • 4
  • 20
  • 33
  • 1
    I think this answer is easier to vectorize and has lower memory foot print and better performance than the accepted answer of @Divakar. – gdlmx Mar 06 '19 at 13:58
1
lead_zeros = int(64-np.ceil(np.log2(n)))

Because len(s) is equal to ceil(log2(n)). This is pure arithmetic operation so it can be perfectly vectorized by numpy and much faster than writing your own loop.

Performance

performance_lead_zeros

gdlmx
  • 6,479
  • 1
  • 21
  • 39