0

So my problem states:

I have a List of N lower case alphabet String each having length M. Lets call it S

eg: for m=3 {aab,aac,aba,abc,bab,bac,bba}

Then I have Q query string which also have length m and contains lower case characters and '?'.

For each query string return count of the string that matches query string.

NOTE: '?' is equal to any character.

Sample:

for query string 'a?b' answer=1, {aab}

for query string 'a??' answer=4, {aab,aac,aba,abc} etc..

What I have done so far

BRUTE FORCE

for each query q:
   count=0
   iterate over the list of String in s:S
      if q.charAt(i) != '?' || s.charAt(i) != q.charAt(i)
        flag=false;
        break; 
   if flag==true
     count+=1
   print count 

Time: O( Q * N * M )

I came up with another approach using binary search:

  • Sort the List of Strings.
  • Consider the List of String as a Grid of N * M characters
  • For each query string instead of matching each string in the list match character by character recursively.
  • when character matches consider the set of string which matches only. eg: for character at position pos if Grid[l][pos] to Grid[r][pos] matches then pass this subset of the Grid to recursive call.
  • If the character at pos is '?' then assign values from [a...z] to character and solve for each value at this position.

Pseudo Code:

function: (l,r) binarySearch(List, ch, start, end);

function recursion(List, query, pos, start, end)
  result=0
  if query.charAt(pos) != '?'
    (l,r) binarySearch(List, query.charAt(pos), start, end)
    if pos = query.length:
       return length(l,r)
    if (l,r) not empty:
     result += recursion(List, query, pos+1, l, r)
  else
   if pos = query.length
      return length(start,end)
   for ch in [a...z]
     (l,r) binarySearch(List, ch, start, end)
     if (l,r) not empty:
       result+=recursion(List, query, pos+1, l, r)
  return result

function doSomething(List, Query)
  List : sort the list alphabetically
  for each query in Query:
    ans = recursion(List, query, 0, 0, N)

Worst case:

If all the query string are [???, ???, ??? ...] then it comes out to be O( Q * N * M ). Still this looks like a better solution than Brute Force.

Still I feel like if we can Pre-Process the List before hand and answer Q queries in less time. I can't figure out the logic to pre-process so that queries will take less time. Any help is appreciated.

Mohd Waseem
  • 1,244
  • 2
  • 15
  • 36
  • I didn't read the entire thing but this post might help you: https://stackoverflow.com/questions/7067161/wildcard-string-search-algorithm – Amongalen May 18 '20 at 08:51

1 Answers1

0

To pre-process the given N strings, you can build a Trie and then while querying you can go down level by level (each character as one level). If you have '?' in any of the level (in query string), you need to visit all neighbouring nodes otherwise just depth wise traversal is enough. This will reduce your search time in liner. Depth ==> max length of given search string.

gymk
  • 1
  • 2