lru_cache works wonders in your case. Make sure maxsize is a power of 2; you may need to fiddle a bit with that size for your application, and cache_info() will help with that.
Also use // instead of / for integer division.
    from functools import lru_cache

    @lru_cache(maxsize=512, typed=False)
    def fusc(n):
        if n <= 1:
            return n
        while n > 2 and n % 2 == 0:
            n //= 2
        return fusc((n - 1) // 2) + fusc((n + 1) // 2)

    print(fusc(1000000000078093254329870980000043298))
    print(fusc.cache_info())
And yes, this is just memoization, as proposed by Filip Malczak.
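To get a feeling for which maxsize suits an input, one option is to build the same function with a few different cache sizes and compare what cache_info() reports. This is only a sketch: the helper make_fusc, the candidate sizes and the input 3**60 are made up for illustration and are not taken from the question.

```python
from functools import lru_cache


def make_fusc(maxsize):
    """Build a fresh fusc with its own lru_cache of the given size."""
    @lru_cache(maxsize=maxsize)
    def fusc(n):
        if n <= 1:
            return n
        while n > 2 and n % 2 == 0:
            n //= 2
        return fusc((n - 1) // 2) + fusc((n + 1) // 2)
    return fusc


n = 3 ** 60  # illustrative input only (about 96 bits)
results = {}
for maxsize in (64, 512, None):  # None means an unbounded cache
    f = make_fusc(maxsize)
    results[maxsize] = f(n)
    info = f.cache_info()
    # hits/misses show how often the cache helped, currsize how full it got
    print(maxsize, info.hits, info.misses, info.currsize)
```

If the number of misses stops dropping when maxsize grows, a bigger cache is probably not buying anything for that input.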
You might gain an additional tiny speedup by using bit operations in the while loop:
    while not n & 1:  # as long as the lowest bit is not 1
        n >>= 1       # shift n right by one
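A quick check that the bit-operation loop does the same job as the % and // version, at least for positive n (for n == 0 both loops would never terminate, which is why the functions above return early for n <= 1). The two helper names are made up for this sketch.

```python
def strip_twos_div(n):
    """Halve n with % and // until it is odd (n must be positive)."""
    while n % 2 == 0:
        n //= 2
    return n


def strip_twos_bits(n):
    """Same reduction with & and >> (n must be positive)."""
    while not n & 1:
        n >>= 1
    return n


for n in (12, 40, 96, 1 << 50, 3 * (1 << 20)):
    assert strip_twos_div(n) == strip_twos_bits(n)
    print(n, bin(n), "->", strip_twos_bits(n))
```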
UPDATE:
Here is a simple way of doing memoization 'by hand':
    def fusc(n, _mem={}):  # _mem will be the cache of the values
                           # that have been calculated before
        if n in _mem:      # if we know that one: just return the value
            return _mem[n]
        if n <= 1:
            return n
        while not n & 1:
            n >>= 1
        if n == 1:
            return 1
        ret = fusc((n - 1) // 2) + fusc((n + 1) // 2)
        _mem[n] = ret      # store the value for next time
        return ret
UPDATE:
After reading a short article by Dijkstra himself, a minor update.
The article states that f(n) = f(m) if the first and last bits of m are the same as those of n and the bits in between are inverted. The idea is to get n as small as possible.
That is what the bitmask (1<<n.bit_length()-1)-2 is for: its first and last bits are 0 and those in the middle are 1, so XORing n with it gives m as described above.
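A small worked example of that bitmask may help. Note that - binds tighter than << in Python, so the mask is (1 << (n.bit_length() - 1)) - 2. The sketch below assumes n is odd and at least 3, which is the situation after the trailing zero bits have been shifted away; fusc_ref and invert_middle_bits are names made up here, and fusc_ref is a plain uncached recursion meant only for small inputs.

```python
def fusc_ref(n):
    """Plain recursive reference implementation, only meant for small n."""
    if n <= 1:
        return n
    if n % 2 == 0:
        return fusc_ref(n // 2)
    return fusc_ref(n // 2) + fusc_ref(n // 2 + 1)


def invert_middle_bits(n):
    """Flip every bit of n except the highest and the lowest one."""
    mask = (1 << (n.bit_length() - 1)) - 2
    return n ^ mask


n = 0b1011101          # 93
m = invert_middle_bits(n)
print(bin(n), bin(m))  # 0b1011101 0b1100011
print(fusc_ref(n), fusc_ref(m))

# brute-force check of the property for small odd n
for k in range(3, 2000, 2):
    assert fusc_ref(k) == fusc_ref(invert_middle_bits(k))
```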
I was only able to do small benchmarks; I'm interested in whether this is any help at all for the magnitude of your input. This will reduce the memory for the cache and hopefully bring some speedup.
    def fusc_ed(n, _mem={}):
        if n <= 1:
            return n
        while not n & 1:
            n >>= 1
        if n == 1:
            return 1
        # https://www.cs.utexas.edu/users/EWD/transcriptions/EWD05xx/EWD578.html
        # bit invert the middle bits and check if this is smaller than n
        m = n ^ ((1 << (n.bit_length() - 1)) - 2)
        n = m if m < n else n
        if n in _mem:
            return _mem[n]
        # note: the sub-results come from fusc (defined above), not from fusc_ed
        ret = fusc(n >> 1) + fusc((n >> 1) + 1)
        _mem[n] = ret
        return ret
I had to increase the recursion limit:

    import sys
    sys.setrecursionlimit(10000)  # default limit was 1000
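If raising the recursion limit is not an option, a different technique sidesteps it entirely: an iterative formulation that walks the bits of n once while carrying two running values. This is not the memoized approach used above, just a sketch of an alternative (fusc_iter is a made-up name), and I have not benchmarked it against the versions in this answer.

```python
def fusc_iter(n):
    """Iterative fusc: one pass over the bits of n, no recursion, no cache."""
    a, b = 1, 0
    while n:
        if n & 1:    # odd: fold a into b
            b += a
        else:        # even: fold b into a
            a += b
        n >>= 1      # drop the lowest bit
    return b


print([fusc_iter(i) for i in range(11)])  # [0, 1, 1, 2, 1, 3, 2, 3, 1, 4, 3]
# works for inputs far beyond the default recursion depth
print(fusc_iter((1 << 7000) + 12345).bit_length())
```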
Benchmarking gave strange results. Using the code below, and making sure that I always started a fresh interpreter (with an empty _mem), I sometimes got significantly better runtimes; on other occasions the new code was slower.
Benchmarking code:
    from timeit import timeit

    # n is the test input, defined at module level beforehand
    print(n.bit_length())
    ti = timeit('fusc(n)', setup='from __main__ import fusc, n', number=1)
    print(ti)
    ti = timeit('fusc_ed(n)', setup='from __main__ import fusc_ed, n', number=1)
    print(ti)
And these are three random results I got (each run prints the bit length of n, then the time for fusc and the time for fusc_ed in seconds):

    6959
    24.117448464001427
    0.013900151001507766

    6989
    23.92404893300045
    0.013844672999766772

    7038
    24.33894686200074
    24.685758719999285
That is where I stopped...
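One thing that might make such timings easier to compare is emptying the caches explicitly between measurements instead of relying on a fresh interpreter. Note that fusc_ed as posted calls fusc for its sub-results, so when both are timed in one process the second call can reuse whatever the first one left in fusc's _mem. The sketch below only shows the clearing idea on the hand-memoized fusc; clear_cache, the fixed seed and the roughly 200-bit input are assumptions made for illustration, not the setup used for the numbers above.

```python
import random
from timeit import timeit


def fusc(n, _mem={}):
    if n in _mem:
        return _mem[n]
    if n <= 1:
        return n
    while not n & 1:
        n >>= 1
    if n == 1:
        return 1
    ret = fusc(n >> 1) + fusc((n >> 1) + 1)
    _mem[n] = ret
    return ret


def clear_cache(func):
    """Empty the dict used as the default value of func's _mem parameter."""
    func.__defaults__[0].clear()


random.seed(0)                    # fixed seed so repeated runs use the same n
n = random.getrandbits(200) | 1   # hypothetical odd test input, up to 200 bits

clear_cache(fusc)
cold = timeit(lambda: fusc(n), number=1)  # measured with an empty cache
warm = timeit(lambda: fusc(n), number=1)  # measured with the cache filled
print(n.bit_length(), cold, warm)
```

Calling clear_cache before every measurement should give each candidate the same starting conditions.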