4

(UPDATED)

We need to find the number of ways a given string can be formed from a matrix of characters.

We can start forming the word from any position(i, j) in the matrix and can go in any unvisited direction from the 8 directions available across every cell(i, j) of the matrix, i.e

(i + 1, j)
(i + 1, j + 1)
(i + 1, j - 1)
(i - 1, j)
(i - 1, j + 1)
(i - 1, j - 1)
(i, j + 1)
(i, j - 1)

Sample test cases:

(1) input:

N = 3 (length of string) 
string = "fit"
matrix:   fitptoke   
          orliguek   
          ifefunef 
          tforitis

output: 7

(2) input:

N = 5 (length of string) 
string = "pifit"
matrix:   qiq
          tpf
          pip
          rpr

output: 5

Explanation:

num of ways to make 'fit' are as given below:

(0,0)(0,1)(0,2)
(2,1)(2,0)(3,0)
(2,3)(1,3)(0,4)
(3,1)(2,0)(3,0)
(2,3)(3,4)(3,5)
(2,7)(3,6)(3,5) 
(2,3)(1,3)(0,2)

I approach the solution as a naive way, go to every possible position (i,j) in the matrix and start forming the string from that cell (i, j) by performing DFS search on the matrix and add the number of ways to form the given string from that pos (i, j) to total_num_ways variable.

pseudocode:

W = 0
for i : 0 - n:
   for j: 0 - m:
        visited[n][m] = {false}
        W += DFS(i, j, 0, str, matrix, visited);

But it turns out that this solution would be exponential in time complexity as we are going to every possible n * m position and then traversing to every possible k(length of the string) length path to form the string.

How can we improve the solution efficiency?

גלעד ברקן
  • 23,602
  • 3
  • 25
  • 61
tusharRawat
  • 719
  • 10
  • 24
  • At first sight it seems to me like problem of coverage rather than problem of eliminating exponential complexity. For (one-dimensional) matrix `abababa` and string `aba`, the algorithm must be certain to find two boundary occurences, not the only one in the middle. – Tomáš Záluský Dec 30 '19 at 13:31
  • 1
    Please post the link to the actual problem – Raj Dec 30 '19 at 13:32
  • 2
    See *Boggle* [tag](https://stackoverflow.com/tags/boggle/info) for ideas. – Guy Coder Dec 30 '19 at 15:14
  • 2
    Why is `(2,7)(3,6)(3,5)` not a valid solution? And what about `(2,3)(1,3)(0,2)`? – Jim Mischel Dec 31 '19 at 05:13
  • Thanks, @GuyCoder, I'll check that out for sure – tusharRawat Dec 31 '19 at 14:14
  • @GuyCoder, reviewing this problem after almost 3 months. I came to know that, The idea of DP given in the blog http://exceptional-code.blogspot.com/2012/02/solving-boggle-game-recursion-prefix.html, will only work if every character in the string we want to construct would be unique. Thus for a string like "fff" to be found on the boggle_board : gft fyu aff DP Solution will give 10 ways though there are only 4 possible ways. So the blog doesn't really answer my question. – tusharRawat Mar 23 '20 at 18:32
  • @tusharRawat How did you come to the conclusion that the DP solution gives 10 as the answer? I think the solution in the blog post is correct, although it shouldn't have used a boolean `dp` array, but should rather use an int array: `f[t][x][y]` stores the number of ways to match the first `t` characters of the string according to Boggle rules and ending at `(x,y)` on the board. – Zecong Hu Mar 26 '20 at 02:45
  • @ZecongHu, Because I coded out that blog's DP solution and ran it on the above input and it gave 10 as a solution though only 4 possible ways are there. If you have an implementation to that DP solution which is giving the right result please post as an answer. – tusharRawat Mar 28 '20 at 09:11
  • @tusharRawat Ah, I see the problem now. The DP solution does not keep track of which positions have been visited, so it counts invalid strings that visit the same position multiple times. Personally I don't think there exists an algorithm than runs in polynomial time, but I believe there are ways to make the exponential search algorithm faster, such as pruning and meet-in-the-middle techniques. – Zecong Hu Mar 28 '20 at 19:44

2 Answers2

0

Suggestion - 1: Preprocessing the matrix and the input string

We are only concerned about a cell of the matrix if the character in the cell appears anywhere in the input string. So, we aren't concerned about a cell containing the alphabet 'z' if our input string is 'fit'.

Using that, following is a suggestion.

  1. Taking the input string, first put its characters in a set S. It is an O(k) step, where k is the length of the string;
  2. Next we iterate over the matrix (a O(m*n) step) and:
    1. If the character in the cell does not appear in the S, we continue to the next one;
    2. If the character in the cell appears, we add an entry of cell position in a map of > called M.
  3. Now, iterating over the input (not the matrix), for each position where current char c appears, get the unvisited positions of the right, left, above and below of the current cell;
  4. If any of these positions are present in the list of cells in M where the next character is present in the matrix, then:
    1. Recursively go to the next character of the input string, until you have exhausted all the characters.

What is better in this solution? We are getting the next cell we need to explore in O(1) because it is already present in the map. As a result, the complexity is not exponential anymore, but it is actually O(c) where c is the total occurrences of the input string in the matrix.


Suggestion - 2: Dynamic Programming

DP helps in case where there is Optimal Substructure and Overlapping Subproblems. So, in situations where the same substring is a part of multiple solutions, using DP could help.

Ex: If we found 'fit' somewhere then if there is an 'f' in an adjacent cell, it could use the substring 'it' from the first 'fit' we found. This way we would prevent recursing down the rest of the string the moment we encounter a substring that was previously explored.

displayName
  • 13,888
  • 8
  • 60
  • 75
  • In step 5, how do you prevent visiting a node multiple times? How does this answer differ from mine in terms of complexity, considering that O(8n) = O(n)? – גלעד ברקן Mar 30 '20 at 11:11
  • (If it does differ, could you please present code or pseudocode that could allow us to compare performance? The current answer leaves too much to my imagination.) – גלעד ברקן Mar 30 '20 at 11:17
  • @גלעדברקן: Your comment prompted me to look in another direction. I think, especially in cases where substrings overlap, a DP solution may reduce the overall time complexity. (1/2) – displayName Mar 30 '20 at 14:29
  • Other than that, the preprocessing may not be reducing the Big-O but it does reduce the constant since you know already which cells you need to explore next. If you don't do the pre-processing, you go to each adjacent cell which may or may not be a productive search. (2/2) – displayName Mar 30 '20 at 14:29
-1
# Checking if the given (x,y) coordinates are within the boundaries
# of the matrix
def in_bounds(x, y, rows, cols):
    return x >= 0 and x < rows and y >= 0 and y < cols

# Finding all possible moves from the current (x,y) position
def possible_moves(position, path_set, rows, cols):
    moves = []
    move_range = [-1,0,1]
    for i in range(len(move_range)):
        for j in range(len(move_range)):
            x = position[0] + move_range[i]
            y = position[1] + move_range[j]
            if in_bounds(x,y,rows,cols):
                if x in path_set:
                    if y in path_set[x]:
                        continue
                moves.append((x,y))
    return moves

# Deterimine which of the possible moves lead to the next letter 
# of the goal string
def check_moves(goal_letter, candidates, search_space):
    moves = []
    for x, y in candidates:
        if search_space[x][y] == goal_letter:
            moves.append((x,y))
    return moves

# Recursively expanding the paths of each starting coordinate
def search(goal, path, search_space, path_set, rows, cols):
    # Base Case
    if goal == '':
        return [path]
    x = path[-1][0]
    y = path[-1][1]
    if x in path_set:
        path_set[x].add(y)
    else:
        path_set.update([(x,set([y]))])

    results = []
    moves = possible_moves(path[-1],path_set,rows,cols)
    moves = check_moves(goal[0],moves,search_space)

    for move in moves:
        result = search(goal[1:], path + [move], search_space, path_set, rows, cols)
        if result is not None:
            results += result
    return results

# Finding the coordinates in the matrix where the first letter from the goal
# string appears which is where all potential paths will begin from.
def find_paths(goal, search_space):
    results = []
    rows, cols = len(search_space), len(search_space[0])
    # Finding starting coordinates for candidate paths
    for i in range(len(search_space)):
        for j in range(len(search_space[i])):
            if search_space[i][j] == goal[0]:
                # Expanding path from root letter
                results += search(goal[1:],[(i,j)],search_space,dict(),rows,cols)
    return results

goal = "fit"
matrix = [
        'fitptoke',
        'orliguek',
        'ifefunef',
        'tforitis'
        ]

paths = find_paths(goal, matrix)
for path in paths:
    print(path)
print('# of paths:',len(paths))

Instead of expanding the paths from every coordinate of the matrix, the matrix can first be iterated over to find all the (i,j) coordinates that have the same letter as the first letter from the goal string. This takes O(n^2) time.

Then, for each (i,j) coordinate found which contained the first letter from the goal string, expand the paths from there by searching for the second letter from the goal string and expand only the paths that match the second letter. This action is repeated for each letter in the goal string to recursively find all valid paths from the starting coordinates.

xChesster
  • 1
  • 3
  • Actually, the code you posted is just an elaborated version of the OP's code. The `search` function is still exponential w.r.t the length of the string to find. – Zecong Hu Mar 26 '20 at 17:10
  • That's almost the same approach I did, so just saving the indexes (i, j) for first char of string is not going to improve the solution efficiency. – tusharRawat Mar 29 '20 at 08:10