I have derived the Tetranacci doubling formulas from the corresponding matrix as described in the other answers. The formulas are:
T(2n) = T(n+1)*(2*T(n+2) - T(n+1)) + T(n)*(2*T(n+3) - 2*T(n+2) - 2*T(n+1) - T(n))
T(2n+1) = T(n)^2 + T(n+2)^2 + T(n+1)*(2*T(n+3) - 2*T(n+2) - T(n+1))
T(2n+2) = T(n+1)*(2*T(n) + T(n+1)) + T(n+2)*(2*T(n+3) - T(n+2))
T(2n+3) = T(n+1)^2 + T(n+3)^2 + T(n+2)*(2*T(n) + 2*T(n+1) + T(n+2))
With these, we can implement the "fast doubling" method. Here's one such implementation in Python, whose native support for arbitrary-sized integers is very convenient:
def tetranacci_by_doubling(n):
if n >= 0:
a, b, c, d = 0, 0, 0, 1 # T(0), T(1), T(2), T(3)
else: # n < 0
a, b, c, d = 1, 0, 0, 0 # T(-1), T(0), T(1), T(2)
# unroll the last iteration to avoid computing unnecessary values.
for i in reversed(range(1, abs(n).bit_length())):
w = b*(2*c - b) + a*(2*(d - c - b) - a)
x = a*a + c*c + b*(2*(d - c) - b)
y = b*(2*a + b) + c*(2*d - c)
z = b*b + d*d + c*(2*(a + b) + c)
a, b, c, d = w, x, y, z
if (n >> i) & 1 == 1:
a, b, c, d = b, c, d, a + b + c + d
if n & 1 == 0:
return b*(2*c - b) + a*(2*(d - c - b) - a) # w
else: # n & 1 == 1
return a*a + c*c + b*(2*(d - c) - b) # x
def tetranacci(n):
a, b, c, d = 0, 0, 0, 1 # T(0), T(1), T(2), T(3)
# offset by 3 to reduce excess computation for large positive `n`
n -= 3
if n >= 0:
for _ in range(+n):
a, b, c, d = b, c, d, a + b + c + d
else: # n < 0
for _ in range(-n):
a, b, c, d = d - c - b - a, a, b, c
return d
# sanity check
print(all(tetranacci_by_doubling(n) == tetranacci(n) for n in range(-1000, 1001)))
I would've liked to adjust the doubling formulas to be T(2n-3),T(2n-2),T(2n-1),T(2n)
in terms of T(n-3),T(n-2),T(n-1),T(n)
to slightly reduce excess computation for large n
, but simplifying the shifted formulas is tedious.
Update
- Swapped to an iterative version since I figured out how to make it cleanly handle negative
n
with minimal duplication. Originally, this was the sole advantage of the recursive version.
- Incorporated a technique that's described in several papers about computing Fibonacci & Lucas numbers--which is to perform the final doubling step manually after the loop to avoid computing extra unneeded values. This results in about ~40%-50% speed-up for large
n
(>= 10^6)
! This optimization could also be applied to the recursive version, as well.
The speed-up due to the unrolling of the last iteration is pretty interesting. It suggests that nearly half of the computational work is done in the final step. This kind of makes sense, since the number of digits in T(n)
(and therefore the cost of arithmetic) approximately doubles when n
doubles, and we know that 2^n ~= 2^0 + 2^1 + ... + 2^(n-1)
. Applying the optimization to similar Fibonacci/Lucas doubling algorithms produces a similar speed-up of ~40%--although, if you're computing Fibonacci/etc. modulo some 64-bit M
, I suspect this optimization isn't as valuable.