4

The question is to find number of valid sequences (A1,A2....AN) with given constraints:

  1. Value of Ai lies between 1 and 6 (1<= Ai <= 6)
  2. Two adjacent numbers should be coprime. For e.g. 2,4,.. sequence is not allowed
  3. If the value repeats in the sequence, then difference in their index should be greater than 2. For e.g. If K=4, (1,2,3,1) is a valid sequence while (1,3,4,3) is not a valid sequence
    Note: N is given in the problem statement.

I could only think of a backtracking solution wherein each recursive call would try all numbers from 1 to 6 for the given Ai position.
This look like more of brute force way. Could you please help with an optimal solution.

Tarun
  • 3,162
  • 3
  • 29
  • 45
  • 1
    `difference in their index should be greater than or equal to 2` according to the example, `or equal` is not correct – MBo Apr 07 '22 at 06:35
  • @MBo Thanks for pointing it out. Yes it should be greater than 2. – Tarun Apr 07 '22 at 06:48

4 Answers4

3

What you have here is a graph search problem which can be solved using backtracking search and can be made to run even faster using memoization.

Defining the states

Following the rules in the question, the states are tuple where the first value in the tuple are the current number a_n and the previous number in the series a_{n-1} and the second value in the tuple is the length of the sequence, since the occurance of a number can be in an interval of 2 or more.

In addition, the forbidden states are states where the two numbers are not co-prime, this means that every permutation of the

number_set = [1, 2, 3, 4, 5, 6]

with the length of 2 is a valid state except the forbidden set which is:

forbidden_tuples = {(2,4), (2,6), (4,6), (3,6), (4,2), (6,2), (6,4), (6,3)}

Example: ((2,3), 5) is in state (2,3) with a needed sequence length of 5. In this case the following number can not be 2,3 (to keep distance of at least 2) or 6 (to keep adjacent numbers co-prime) so a total of thee subsequent states:

  1. ((3,1), 4)
  2. ((3,4), 4)
  3. ((3,5), 4)

Building the solution

The solution I will offer is a recursive DFE of the graph with memoization for time optimization. the solution is as follows:

import itertools

def count_seq(N, start_state=None, memoization=None):
    """
    :param N:
    :param start_state
    :param memoization
    :return: Rules:
    1. a_i in {1,2,3,4,5,6}
    2. Two adjacent numbers should be coprime. For e.g. 2,4,.. sequence is not allowed ( for the set {1,2,3,4,5,6} this means that (2,6),(2,4),(3,6),(4,6) are forbidden
    3. repetitions with a distance >= 2

    State are ordered as a 2 item tuple to remember the history: (a_{n-1} , a_n)
    """
    if N == 1:
        return 6
    if memoization is None:
        memoization = {}
    count = 0
    state_set = set()  # Pending states which have not been explored yet
    number_set = [1, 2, 3, 4, 5, 6]
    forbidden_tuples = {(2,4), (2,6), (4,6), (3,6), (4,2), (6,2), (6,4), (6,3)}
    if start_state is None: # Getting initial states
        for subset in itertools.permutations(number_set, 2):
            if subset in forbidden_tuples:
                pass
            else:
                state_set.add((subset, N))
    else:  # checking all possible next states and appending to queue with 1 less length
        for ii in number_set:
            if ii in start_state or (ii, start_state[-1]) in forbidden_tuples:
                pass
            else:
                state_set.add(((start_state[1:] + tuple([ii])), N-1))
    # for each possible next state
    for state in state_set:
        if state[1] <= 2:
            count += 1
        elif state in memoization:  # checking if we computed this already
            count += memoization[state]
        else:  #we did not compute this, doing the computation
            memoization[state] = count_seq(state[1], state[0], memoization)
            count +=  memoization[state]

    return count

Explenation

If the wanted length is 1, reutn 6 since either of the numbers can be in the 1st location.

  1. We see if the start state is None we assume that this is the initial calling, and so we create all the possible none forbidden permutations of the numbers with length 2. otherwise, we create a set of all the possible next states.

  2. For each possible next state, we check:

    2.1. if the length required is 2: we increase the count by 1 since this is a valid state

    2.2. If the length is more than 2, we check is we already computed this state before in the memoization dictionary. If we did we pull the count value from the dictionary and use that, otherwise we call the function recursively with the starting state and the wanted sequence length.

Timing

When running this function with memoization disabled we get for N=200:

%timeit count_seq_slow(200)
199 ms ± 9.63 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit count_seq(200)
13.6 ms ± 833 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

When increasing to N=2000 we get:

%timeit count_seq_slow(2000)
3.54 s ± 374 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit count_seq(2000)
147 ms ± 7.02 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
Tomer Geva
  • 1,764
  • 4
  • 16
  • What is the time/space complexity of this? How many recursive calls are being made? – Abhinav Mathur Apr 07 '22 at 07:50
  • 1
    With the memoization, only `(6*6 - 8)*n` recursion calls are being done. Without the memoization it is obviously much higher, so with memoization it runs in O(n) – Tomer Geva Apr 07 '22 at 08:20
3

We can use the fact that only 6 possible digits can be used to construct numbers.

  1. Consider a lookup table dp[i][j][k], which is basically count of i-digit numbers with the i-th digit as j, (i-1)th digit as k.
  2. Create a mapping of all valid co-primes for each digit. Something like {2: [1,3,5], 3: [1,2,4,5], 6: [1,5]}...
  3. Base cases: dp[0] = 0 for all j,k, dp[1] = 0 for all j,k, dp[2][j][k] = 1 if gcd(j,k)==1 and j!=k
  4. Filling up the table should be relatively straighforward now.
for i in range 2 to n:
for j in range 1 to 6:
for k in range 1 to 6
if j and k are not coprime, dp[i][j][k] = 0 
else, dp[i][j][k] = summation(dp[i-1][k][l]) where l is in range [1,6] AND l!=j and j!=k
# this ensures a min gap of 3 and maintains co-primality of adjacent numbers
  1. Final answer = sum(dp[N][1..6])

This has a time and space complexity of O(N*6*6) ~ O(N).

Edit: @KellyBundy was kind enough to add a Python implementation; corrected it (and a minor flaw in my answer) and added here:

def count_seq(N):
  A = range(1, 7)
  dp = [[[None] * 7 for _ in range(7)] for _ in range(N+1)]
  for j in A:
    for k in A:
      dp[0][j][k] = 0
      dp[1][j][k] = 0
      dp[2][j][k] = int(gcd(j,k)==1 and j!=k)
  for i in range(3, N+1):
    for j in A:
      for k in A:
        if gcd(j, k) != 1:
          dp[i][j][k] = 0 
        else:
          dp[i][j][k] = sum(dp[i-1][k][l] for l in A if (l!=j and j!=k))
  return sum(dp[N][j][k] for j in A for k in A)

N = 100
print(count_seq(N))
Abhinav Mathur
  • 7,791
  • 3
  • 10
  • 24
  • Excellent answer. Next obvious question is how to become so good at DP :D. Do you have any youtube channel or some blogs. – Tarun Apr 07 '22 at 07:59
  • 1
    @Tarun identifying if overlapping subproblems require memoization, then figuring out the states by tracking the properties we need to store/maintain across states. CodeForces is my recommendation for trying out good DP problems – Abhinav Mathur Apr 07 '22 at 10:52
  • Your complexities are both incorrect, you're ignoring that the numbers grow exponentially. I think time complexity is O(N^2), not sure about space. – Kelly Bundy Apr 11 '22 at 11:31
  • @KellyBundy care to explain the `O(N^2)` time? – Abhinav Mathur Apr 11 '22 at 11:34
  • For example the solution for N=1000 has 1499 bits and the solution for N=2000 has 2997 bits ([demo using my answer](https://tio.run/##ZZDNboMwEITvfortzVZRBO0hVSWfeu8LRBEyxoAl/NNlqdSnpzYIkjZ7sjzfeGYdf2gI/vUt4rJ0GBzoMI5Gkw1@AutiQIKPMHsyyFbdKRp2odctY63pkikR9WS@@Kd4Z5BGE4LcnfzCzwWcxVWsWhcQarAeUPneHJY8ZFy884lDyB6uCmhEsaVlfwo52WSZ@N0TO61vCVUO/0vksak4PElQoHy7HZv1mPbiTcoRICVUj8a96SVDV3iWW6WD25bPxHqFhmb0MM2O58bfapxNqiwYi2g98dvvVWVZilNjqR6N72nI0H/m5ZFZll8)). So looks like the number sizes are proportional to where they are (in your case, `i`). And adding numbers takes time proportional to their lengths. – Kelly Bundy Apr 11 '22 at 11:39
  • Also, you're mixing up `n` and `N`, and my Python translation of this gives [wrong results](https://tio.run/##dVJBboMwELzzis3NbmgF7SFVJA5R73zAQhEQJzEBQ42J1NfTXQjgFNUnMjuzMx6n@bHXWn98Nqbvz6auIK/LUuZW1boFVTW1sfBVd9pK4w3zKrXXaXDJT553kmcUIePYym8W870HeHJrIJqUTLCdDzue8GF2rg0cQWkwqb7IWULHyqpxdHwekIalPmTcH91IjyZvCiUtc1ZM7HxxCMn8mUFHYXDYRJBCqk/jZzZ84r1Yhj4cogjCtXBKKoiUwDYaI8288fLEGCAjbWc0tF3FKPE9LTuJkfnY3SG7Kp3eHzUcUOiERuREhQgh4lrLBF5g97c/7HVV6TbkiTf2UBB82M/N35zftF0EiSgScUvQJljQcEHDxya1GLz7QB7L1uJp69pnaJt6LXy4rXtFQ@XEgHkky1b@T6RKCXkNCRDlWEQ5OJNhuYkK6tB5AaTH04IluJMYnyWmSweB1xilLXP/3PyBLY/G@/4X). – Kelly Bundy Apr 11 '22 at 11:55
  • @KellyBundy good catch, minor flaws were there but overall logic was corrected. Fixed both things – Abhinav Mathur Apr 11 '22 at 12:31
  • 1
    Yes, looks like it gives correct results for N >= 2 now. The code crashes for N=0 and N=1, though, and I think for N=0 and N=1 also your description is wrong (your results are 0, correct are 1 and 6). And still mixing up `n` and `N` and the incorrect complexity statement. – Kelly Bundy Apr 11 '22 at 12:44
3

Let K[n, d1, d2] be the number of valid sequences of length n such that the sequence remains valid if d1, d2 appear just before. Or equivalently, the number of valid sequences of length n+2 that start with d1, d2.

There's a recurrence relation that K satisfies:

K[n, d1, d2] = 0 if d1=d2 or d1 is not coprime to d2.
K[0, d1, d2] = 1
K[n, d1, d2] = sum(K[n-1, d2, d] for d=1..6 - {d1, d2})

This K can be calculated using a bottom-up dynamic program, or memoization.

The original problem can be solved for n>=2, using this:

S[n] = sum(K[n-2, d1, d2] for d1=1..6, d2=1..6)

For n<=2, we have S[0] = 1 and S[1] = 6.

One way to write this code, using O(1) space and O(n) time is this:

from math import gcd

def S(n):
    if n == 0: return 1
    if n == 1: return 6
    K = [d1!=d2 and gcd(d1+1, d2+1)==1 for d1 in range(6) for d2 in range(6)]
    for _ in range(n-2):
        K = [0 if d1==d2 or gcd(d1+1, d2+1)!=1 else sum(K[d2*6+d] for d in range(6) if d!= d1 and d!=d2) for d1 in range(6) for d2 in range(6)]
    return sum(K)

This iteratively computes K[n,_,_] from K[n-1,_,_].

Paul Hankin
  • 54,811
  • 11
  • 92
  • 118
  • Excellent solution. How did you spot out its a DP problem. I just could not figured out. – Tarun Apr 07 '22 at 08:30
  • It looks like a DP problem, and then it's just a matter of figuring out how to parameterize it – Paul Hankin Apr 07 '22 at 09:09
  • Seems to be [not quite correct](https://tio.run/##bVLRasIwFH3vV9y9JawO62CKUJjscVD2LiIxTbWQJiVNBzL27e4maWN1BoT03HNyzr3X9mxPWr2uWnO5VEY3wLWUgttaqw7qptXGwofulRUm8fWG2dNYOPIygFWvuNVaRok0/Z4zfhJJUooKPoWUZ1LQdQJ4uDWQj6@SLVmmsKQ76muVNrCHWoFh6iiixB0rmnaio7HgNISlcKApxseq06PJS42SjkyeGNn86pA581uGO3WFpKccGDBVhuvBX7FnckAfCnkO2X/hmHTrSDt4zkOkyAvNO4aHjLC9UdD1DXGJv5nsBUamYW5frJdEDfEwknKm8/Woym7wLOJvHt@gTycsmfQZZvYet0MKrUQA/ZqISqFEZrmYjATfd00PuJvEXduD63yqGJI@4mXJHeCaR@uZN8Af9Tsq3Y42MIOf4PxLkweaRQwcRFlQ@fvC33GU7qu4rnw1NNeaWllSpPHvmYaBFziny@UP) (or maybe I made a mistake in my translation to Python?). – Kelly Bundy Apr 11 '22 at 12:26
  • @KellyBundy yes, you're right. "d1 is not coprime to d2" should be "d1=d2 or d1 is not coprime to d2". 1 is coprime to 1, but K[n, 1, 1] should be 0 because duplicates aren't allowed. Duplicates are avoided in the 3rd rule for K, but aren't avoided when computing S[n]. I've edited the answer and with that change your and my code agree on the results. – Paul Hankin Apr 11 '22 at 13:37
  • Yes, [looks good now](https://tio.run/##bVLRasIwFH3vV9y9JSyOtoM5hMJkj0LxXURqmmohTUqaDmTs291N0nbVGRDSe8/JOfdc24s9a/X63prrtTK6Aa6lFNzWWnVQN602Fj51r6wwke83hT2PjRMvQ7HqFbday4kiTX/gBT@LKCpFBRsh5YXkdBUBHm4NZOOrZEeWDJZ0T32v0gYOUCswhTqJieKOFU0749Gp4TikYHCkDO1j1/FR5KVGSkdmT4xo/qeQOPFbhDt1haCnDAooVBmuR3/FmckRdShkGST/iaPTnQPt4TkLliZcGN4hfMkI2xsFXd8Q5/irkL1AyzTkti16SdRgDy0pJxqvRlZyU0@m@puvr1GnE5bM5gyZfUzbIblWIhT9mohiUCKyTGeR4Ptl4gTKFDA7F8CAcancRTA4iOfswfUjXBLdFVwQaGPhBfBH/b5Kt681LOA7KP/Q6AEnncwHUhJY/p76O8bqvvLZ@uN4GHXDYIuJjX9VFsLPg1JramVJzmDjZtkyj6bR9foL). – Kelly Bundy Apr 11 '22 at 13:45
1

What number can come next only depends on the previous two numbers. So here's an iterative solution that only keeps O(1) numbers around, it's about twice as fast as Tomer's for N=2000 (even without any optimizations, I even call gcd all the time):

from collections import Counter
from math import gcd

def count_seq(N):
    ctr = Counter([(7, 7)])
    for _ in range(N):
        temp = Counter()
        for (a, b), count in ctr.items():
            for c in range(1, 7):
                if c != a and c != b and gcd(b, c) == 1:
                    temp[b, c] += count
        ctr = temp
    return sum(ctr.values())

My ctr is a hash map whose keys are pairs representing the last two numbers, and the values tell how many valid sequences end in those two numbers. Initialized with (7, 7) because that allows all extensions.

For fun and max performance, hardcoding all states and transitions, this is about 14 times faster than my above solution for N=2000 and solves N=100,000 in about 10 seconds (the result is a number with 45,077 digits):

def count_seq(N):
    if N < 2:
        return 6 if N else 1
    x12 = x13 = x14 = x15 = x16 = x21 = x23 = x25 = x31 = x32 = x34 = x35 = x41 = x43 = x45 = x51 = x52 = x53 = x54 = x56 = x61 = x65 = 1
    for _ in range(N - 2):
        x12, x13, x14, x15, x16, x21, x23, x25, x31, x32, x34, x35, x41, x43, x45, x51, x52, x53, x54, x56, x61, x65, = x31+x41+x51+x61, x21+x41+x51+x61, x21+x31+x51+x61, x21+x31+x41+x61, x21+x31+x41+x51, x32+x52, x12+x52, x12+x32, x23+x43+x53, x13+x43+x53, x13+x23+x53, x13+x23+x43, x34+x54, x14+x54, x14+x34, x25+x35+x45+x65, x15+x35+x45+x65, x15+x25+x45+x65, x15+x25+x35+x65, x15+x25+x35+x45, x56, x16
    return x12 + x13 + x14 + x15 + x16 + x21 + x23 + x25 + x31 + x32 + x34 + x35 + x41 + x43 + x45 + x51 + x52 + x53 + x54 + x56 + x61 + x65

Here, for example x23 is the number of valid sequences ending in the numbers 2 and 3. There's room for further microoptimizations (using additional y variables to alternate between x and y instead of building+unpacking tuples, or exploiting common subsums like x21+x41, which is used three times), but I'll leave it at this.

Oh, actually... as one might've seen with Fibonacci numbers, we can represent the transitions as a 22×22 matrix and then quickly compute the Nth (or N-2th) power of that matrix with exponentiation by squaring. That should be even much faster.

Meh... implemented the matrix power method now, and sadly it's slower. I guess it really only helps if you only need some kind of truncated values, like only the first or last 100 digits and the number of digits, or the number of sequences modulo some number. Otherwise, while the matrix power method does need fewer operations, they include multiplications of large numbers instead of just additions, which is slower. Anyway, here's my implementation (Try it online!):

from math import gcd
import numpy as np

def count_seq(N):
    if N < 2:
        return 6 if N else 1
    states = [(a, b) for a in range(1, 7) for b in range(1, 7) if a != b and gcd(a, b) == 1]
    A = [[1 if b2 == b and a != c else 0
          for b2, c in states]
         for a, b in states]
    return (np.matrix(A, dtype=object) ** (N-2)).sum()

As demonstration, here's a modification that computes the last three digits. For N=100,000 it takes about 0.26 seconds and for N=1,000,000,000,000,000 it takes about one second.

def count_seq(N):
    if N < 2:
        return 6 if N else 1
    modulo = 1000
    class IntModulo:
        def __init__(self, value):
            self.value = value
        def __add__(self, other):
            return IntModulo((self.value + other.value) % modulo)
        def __mul__(self, other):
            return IntModulo((self.value * other.value) % modulo)
        def __int__(self):
            return self.value
    states = [(a, b) for a in range(1, 7) for b in range(1, 7) if a != b and gcd(a, b) == 1]
    A = [[IntModulo(1 if b2 == b and a != c else 0)
          for b2, c in states]
         for a, b in states]
    return int((np.matrix(A, dtype=object) ** (N-2)).sum())
Kelly Bundy
  • 23,480
  • 7
  • 29
  • 65