1

Stern's Diatomic Sequence can be read about in more detail elsewhere; however, for my purposes I will define it now.


Definition of Stern's Diatomic Sequence

Let n be the input to the fusc function, denoted fusc(n).

If n is 0 then the returned value is 0.
If n is 1 then the returned value is 1.

If n is even then the returned value is fusc(n / 2).
If n is odd then the returned value is fusc((n - 1) / 2) + fusc((n + 1) / 2).


Currently, my Python code brute-forces most of the computation; the only shortcut is repeatedly halving even n, since that never changes the value.

def fusc (n):
    if n <= 1:
        return n

    while n > 2 and n % 2 == 0:
        n /= 2

    return fusc((n - 1) / 2) + fusc((n + 1) / 2)

However, my code must be able to handle numbers millions of bits in magnitude, and recursing through the function millions of times does not seem very efficient or practical.

Is there any way I could algorithmically improve my code such that massive numbers can be passed through without having to recursively call the function so many times?

Ziyad Edher
  • Read about memoization: https://en.wikipedia.org/wiki/Memoization It can be wonderfully done with Python, but right now I am not sure whether it is a good solution here, as you won't calculate the same thing too many times... – Filip Malczak Dec 19 '16 at 18:57
  • Memoization might help slightly once descending into smaller numbers, but I need to be able to compute large numbers off the bat incredibly quickly, although implementing something like this would definitely improve long-time performance, it is not really what I am looking for. – Ziyad Edher Dec 19 '16 at 18:59
  • That's why it became a comment instead of an answer. Good luck. – Filip Malczak Dec 19 '16 at 19:01

2 Answers

9

With memoization, for a million-bit input the recursion stack would be extremely large. We can first look at a number large enough to be interesting but small enough to work through by hand, fusc(71) in this case:

fusc(71) = fusc(35) + fusc(36)

fusc(35) = fusc(17) + fusc(18)
fusc(36) = fusc(18)

fusc(71) = 1 * fusc(17) + 2 * fusc(18)

fusc(17) = fusc(8) + fusc(9)
fusc(18) = fusc(9)

fusc(71) = 1 * fusc(8) + 3 * fusc(9)

fusc(8) = fusc(4)
fusc(9) = fusc(4) + fusc(5)

fusc(71) = 4 * fusc(4) + 3 * fusc(5)

fusc(4) = fusc(2)
fusc(5) = fusc(2) + fusc(3)

fusc(71) = 7 * fusc(2) + 3 * fusc(3)

fusc(2) = fusc(1)
fusc(3) = fusc(1) + fusc(2)

fusc(71) = 10 * fusc(1) + 3 * fusc(2)

fusc(2) = fusc(1)

fusc(71) = 13 * fusc(1) = 13

We realize that we can avoid recursion completely in this case as we can always express fusc(n) in the form a * fusc(m) + b * fusc(m+1) while reducing the value of m to 0. From the example above, you may find the following pattern:

if m is odd:
a * fusc(m) + b * fusc(m+1) = a * fusc((m-1)/2) + (b+a) * fusc((m+1)/2)
if m is even:
a * fusc(m) + b * fusc(m+1) = (a+b) * fusc(m/2) + b * fusc((m/2)+1)
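These two rules can be traced programmatically. A small sketch (the function name fusc_trace is mine) that prints each reduction step for n = 71, reproducing the derivation above:

```python
def fusc_trace(n):
    # Invariant maintained throughout: fusc(n) == a * fusc(m) + b * fusc(m + 1)
    a, b, m = 1, 0, n
    while m > 0:
        if m % 2:                       # m odd
            b, m = a + b, (m - 1) // 2
        else:                           # m even
            a, m = a + b, m // 2
        print("fusc(%d) = %d * fusc(%d) + %d * fusc(%d)" % (n, a, m, b, m + 1))
    return b                            # m == 0: a * fusc(0) + b * fusc(1) == b

print(fusc_trace(71))  # prints 13
```

The printed coefficients (1, 2), (1, 3), (4, 3), (7, 3), (10, 3), (10, 13) match the hand reduction line by line.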

Therefore, you may use a simple loop to solve the problem in O(log n) time:

def fusc(n):
  # // (floor division) keeps everything an int on Python 3 as well
  if n == 0: return 0
  a = 1
  b = 0
  while n > 0:
    if n%2:
      b = b + a
      n = (n-1)//2
    else:
      a = a + b
      n = n//2
  return b
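To gain confidence in the loop, it can be checked against a naive recursive transcription of the definition for small inputs (a self-contained sketch; the names fusc_iter and fusc_naive are mine):

```python
def fusc_iter(n):
    # The O(log n) loop from above.
    if n == 0:
        return 0
    a, b = 1, 0
    while n > 0:
        if n % 2:
            b = a + b
            n = (n - 1) // 2
        else:
            a = a + b
            n = n // 2
    return b

def fusc_naive(n):
    # Direct transcription of the recurrence, for checking only.
    if n <= 1:
        return n
    if n % 2 == 0:
        return fusc_naive(n // 2)
    return fusc_naive((n - 1) // 2) + fusc_naive((n + 1) // 2)

assert all(fusc_iter(n) == fusc_naive(n) for n in range(1, 1000))
print(fusc_iter(71))  # prints 13
```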
lhhong
4

lru_cache works wonders in your case. make sure maxsize is a power of 2. may need to fiddle a bit with that size for your application. cache_info() will help with that.

also use // instead of / for integer division.
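a quick illustration of why this matters on python 3 (a small sketch of mine): `/` is float division there, so for big integers it loses precision or overflows outright:

```python
n = 10 ** 30 + 1

print(n // 2)  # exact integer: 500000000000000000000000000000
print(n / 2)   # 5e+29 -- a float; the + 1 is silently lost

# for truly huge integers, / does not even produce a float:
try:
    2 ** 2000 / 2
except OverflowError as e:
    print(e)  # int too large to convert to float
```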

from functools import lru_cache


@lru_cache(maxsize=512, typed=False)
def fusc(n):

    if n <= 1:
        return n

    while n > 2 and n % 2 == 0:
        n //= 2

    return fusc((n - 1) // 2) + fusc((n + 1) // 2)

print(fusc(1000000000078093254329870980000043298))
print(fusc.cache_info())

and yes, this is just memoization as proposed by Filip Malczak.

you might gain an additional tiny speedup using bit-operations in the while loop:

while not n & 1:  # as long as the lowest bit is not 1
    n >>= 1       # shift n right by one
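just to illustrate (a tiny sketch of mine): this loop strips all trailing zero bits, which is the same as repeatedly dividing out factors of two:

```python
n = 0b1011000          # 88: three trailing zero bits
while not n & 1:       # while the lowest bit is 0
    n >>= 1            # drop it
print(bin(n))          # 0b1011 (11), same as 88 // 8
```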

UPDATE:

here is a simple way of doing memoization 'by hand':

def fusc(n, _mem={}):  # _mem will be the cache of the values 
                       # that have been calculated before
    if n in _mem:      # if we know that one: just return the value
        return _mem[n]

    if n <= 1:
        return n

    while not n & 1:
        n >>= 1
    if n == 1:
        return 1    

    ret = fusc((n - 1) // 2) + fusc((n + 1) // 2)
    _mem[n] = ret  # store the value for next time
    return ret
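a quick sanity check of this memoized version against the first terms of stern's sequence (the function is repeated so the snippet is self-contained):

```python
def fusc(n, _mem={}):
    # same hand-memoized version as above
    if n in _mem:
        return _mem[n]
    if n <= 1:
        return n
    while not n & 1:
        n >>= 1
    if n == 1:
        return 1
    ret = fusc((n - 1) // 2) + fusc((n + 1) // 2)
    _mem[n] = ret
    return ret

# first terms of the sequence, straight from the recurrence
print([fusc(n) for n in range(13)])  # [0, 1, 1, 2, 1, 3, 2, 3, 1, 4, 3, 5, 2]
```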

UPDATE

after reading a short article by dijkstra himself a minor update.

the article states that f(n) = f(m) if the first and last bit of m are the same as those of n and the bits in between are inverted. the idea is to get n as small as possible.

that is what the bitmask (1<<n.bit_length()-1)-2 is for (first and last bits are 0; those in the middle 1; xoring n with that gives m as described above).
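the identity is easy to check for small odd n (a self-contained sketch; fusc_ref is a plain recursive reference implementation of mine):

```python
def fusc_ref(n):
    # plain recursive reference, just for checking the identity
    if n <= 1:
        return n
    if n % 2 == 0:
        return fusc_ref(n // 2)
    return fusc_ref((n - 1) // 2) + fusc_ref((n + 1) // 2)

for n in range(3, 1000, 2):                # odd n: first and last bits are 1
    m = n ^ (1 << n.bit_length() - 1) - 2  # invert the bits in the middle
    assert fusc_ref(n) == fusc_ref(m)

# e.g. n = 71 = 0b1000111 gives m = 0b1111001 = 121,
# and indeed fusc(71) == fusc(121) == 13
```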

i was only able to do small benchmarks; i'm interested whether this is any help at all for the magnitude of your input... this will reduce the memory for the cache and hopefully bring some speedup.

def fusc_ed(n, _mem={}):

    if n <= 1:
        return n

    while not n & 1:
        n >>= 1
    if n == 1:
        return 1

    # https://www.cs.utexas.edu/users/EWD/transcriptions/EWD05xx/EWD578.html
    # bit invert the middle bits and check if this is smaller than n
    m = n ^ (1<<n.bit_length()-1)-2
    n = m if m < n else n

    if n in _mem:
        return _mem[n]

    ret = fusc(n >> 1) + fusc((n >> 1) + 1)
    _mem[n] = ret
    return ret

i had to increase the recursion limit:

import sys
sys.setrecursionlimit(10000)  # default limit was 1000

benchmarking gave strange results; using the code below and making sure that i always started a fresh interpreter (having an empty _mem) i sometimes got significantly better runtimes; on other occasions the new code was slower...

benchmarking code:

print(n.bit_length())

ti = timeit('fusc(n)', setup='from __main__ import fusc, n', number=1)
print(ti)

ti = timeit('fusc_ed(n)', setup='from __main__ import fusc_ed, n', number=1)
print(ti)

and these are three random results i got:

6959
24.117448464001427
0.013900151001507766

6989
23.92404893300045
0.013844672999766772

7038
24.33894686200074
24.685758719999285

that is where i stopped...

hiro protagonist
  • Unfortunately, I am using Python2.7, for which `lru_cache` is not available. I should have added Python2.7 in the tags. – Ziyad Edher Dec 19 '16 at 19:15
  • http://stackoverflow.com/a/11861795/4954037 might help there... or you could roll your own. – hiro protagonist Dec 19 '16 at 19:18
  • Cannot use any external libraries either. I think the solution needs to be nearly completely algorithmic. – Ziyad Edher Dec 19 '16 at 19:22
  • added a meomized version without using `lru_cache`. please try it before discarding the idea of caching values. the speedup is enormous! – hiro protagonist Dec 19 '16 at 19:24
  • Your implementation greatly increased the speed of my application; however, I incorrectly reported that my numbers will be in the thousands of bits, while in fact they turned out to be in the millions. Processing just one test case takes well over 12 seconds, which is the time limit. – Ziyad Edher Dec 19 '16 at 19:31
  • @Wintro: a small update... but don't get your hopes to high... sorry. – hiro protagonist Dec 20 '16 at 12:25
  • Wow, thank you! I noticed that article but felt like it was beyond the scope of what I was trying to achieve. It _dramatically_ improved the speed when calculating massive numbers. For a million-bit number, it reduced run time from over 30 seconds to just 4 seconds. – Ziyad Edher Dec 20 '16 at 13:56
  • 1
    I think https://stackoverflow.com/a/44557597/1400793 should be the accepted answer. The code in the other answer contains no recursion, is clearly O(log n), and greatly simpler. – Paul Hankin Apr 24 '18 at 20:39
  • @PaulHankin i agree! this is a great answer! (did not see it before as it was months after mine...). – hiro protagonist Apr 25 '18 at 06:23