1

I'm trying to implement trie search in flutter. And here's the entire trie.dart file.

The idea is something like this, say we have I have a list of recipe names:

Burger
French Fries
Ice Cream
Garlic Parmesan Butter

Now I need to search using prefix so if the user searches for bur it'll show Burger. But if someone write Garlic Butter I need to return Garlic Parmesan Butter. So, basically if the search query has multiple words I need to show the correct name.

Here's the part where I get all words with prefix:

  List<String> getAllWordsWithPrefix(String prefix) {
    StringBuffer fullPrefix = new StringBuffer();
    return _getAllWordsWithPrefixHelper(prefix, _head, fullPrefix);
  }

  List<String> _getAllWordsWithPrefixHelper(
      String prefix, _TrieNode node, StringBuffer fullPrefix) {
    if (prefix.length == 0) {
      String pre = fullPrefix.toString();
      return _collect(
          new StringBuffer(pre.substring(0, max(pre.length - 1, 0))), node, []);
    }

    for (_TrieNode child in node.children) {
      if ((child.char == prefix.substring(0, 1)) ||
          (!_isCaseSensitive &&
              child.char.substring(0, child.char.length).toLowerCase() ==
                  prefix.substring(0, 1).toLowerCase())) {
        fullPrefix.write(child.char);
        return _getAllWordsWithPrefixHelper(
            prefix.substring(1), child, fullPrefix);
      }
    }

    return [];
  }

And finally I'm using the trie search in the following way(thought this might help someway):

class Search {
  final Map<String, RecipeModel> _map = Map.fromIterable(
    Store.instance.getAllRecipes(),
    // ignore: non_constant_identifier_names
    // key: (recipe) => RecipeModel().recipeName!,
    key: (recipe) => recipe.recipeName!,
  );

  late final Trie trie;
  Search() {
    // This will be O[n]
    trie = Trie.list(_map.keys.toList());
  }

  late RecipeModel recipe;

  RecipeModel returnRecipe(String? suggestion) {
    if (suggestion == null) return recipe;
    // This will be O(1) instead of O(n) [better]
    final RecipeModel? found = _map[suggestion];
    return found ?? recipe;
  }

  List<String> returnSuggestions(String prefix) {
    //will return O[W*L] ad-hoc search was O[n^2]
    return trie.getAllWordsWithPrefix(prefix);
  }
}

1 Answers1

0

First, your TrieNode is using a List, which means you have to do a for-loop O(N) search for each char. It would be faster to use a Map.

(You could also use a 26D array, instead of a map, if you know that you'll only have English letters: [0] = 'A', [1] = 'B', ... [25] = 'Z')

Part I: garlic butter

Now I need to search using prefix so if the user searches for bur it'll show Burger. But if someone write Garlic Butter I need to Garlic Parmesan Butter. So, basically if the search query has multiple words I need to show the correct name.

bur is easy to do, as that's what a Trie data structure is for, but the garlic butter one is harder. You probably want to look into how to implement a fuzzy search or "did you mean...?" type algorithm:

However, maybe this is more complex than you want.

Here are some alternatives I thought of:

Option #1

Change your TrieNode to store a String variable stating the "standard" word. So the final TrieNode of garlic parmesan butter and garlic butter would both have an instance variable that stores garlic butter (the main standard/common word you want to use).

Your TrieNode would be like this:

class _TrieNode {
  //...other vars...
  String _standardWord;
}

Then your add method would look like this:

String standardWord = 'garlic butter';
trie.addWord('garlic butter', standardWord);
trie.addWord('garlic parmesan butter', standardWord);

Internally, your add method would set the standardWord of the last char's TrieNode to the 2nd param of addWord.

Internally, your find method would return the standardWord, which was set in the last char's TrieNode.

Then you only have to test for garlic butter and don't have to worry about parmesan.

Does that make sense?

Of course, you can also flip this around so that it always returns garlic parmesan butter instead:

String standardWord = 'garlic parmesan butter';
trie.addWord('garlic butter', standardWord);
trie.addWord('garlic parmesan butter', standardWord);

This requires that you know all phrases that it could be in advance, and then you add them to your Trie, with all common phrases pointing to the same standard word.

Option #2

Split your phrase/sentence into words, based on the spaces between words.

So your Trie would look like this:

trie.addWord('garlic');
trie.addWord('parmesan');
trie.addWord('butter');

When you have all of the matching words, you then need to write an algorithm (logic) to piece them together in a meaningful way.

Here's an example:

Set<String> wordsFromTrie = {};
List<String> words = "garlic parmesan butter".split(RegExp(r'\s+'));

words.forEach((word) => wordsFromTrie.add(trie.findWord(word)));

if(wordsFromTrie.contains("garlic") && wordsFromTrie.contains("butter")) {
  print("Garlic parmesan butter!");
}

Option #3

Use some type of Regex matching instead of a Trie, so for example, your Regex would be RegExp(r"garlic.*butter",caseSensitive: false) and this would match all garlic butter regardless of what's in the middle.

This will be a bit slower, as you'll have a bunch of if-statements with Regexes. You could make it faster by first testing someString.startsWith('garlic') (after stripping spaces and lower-casing).

Option #4

Use a combination of #1 and #3.

This would mean you'd have a special RegexTrieNode that you would add.

trie.addWord('garlic',RegExp(r'[^b]+'),'butter');

When you hit RegexTrieNode it would continue to match each char until it stops matching. When it stops matching, you would need to go to the next child, butter, for the rest of the matching.

It's quite complicated, and it won't work for all Regexes, but it's doable.

Part II: bur => burger

Basically, once you reach the end of bur, you need to keep going down (or up?) the Trie for TrieNodes that only have 1 child.

Why only 1 child? Well, if you have burrito and burger, which does bur match? It's ambiguous.

Here's some example code, roughly based on your code. It uses null-safety. If you don't have that enabled, then remove all ? from the code.

import 'dart:math';

void main() {
  var trie = Trie();

  trie.addWord("burger");

  // All results will be "burger".
  print(trie.findWord("b"));
  print(trie.findWord("bur"));
  print(trie.findWord("burger"));

  // This will be null, but fixable
  // if want to allow longer strings.
  print(trie.findWord("burgerme"));
}

class Trie {
  TrieNode head = TrieNode();

  void addWord(String? word) {
    if(word == null || word.isEmpty) {
      return;
    }

    var currentNode = head;

    // Rune is a unicode codepoint (unicode char).
    word.runes.forEach((rune) {
      var childNode = currentNode.children[rune];

      if(childNode == null) {
        childNode = TrieNode();
        currentNode.children[rune] = childNode; // Add to parent.
      }

      currentNode = childNode; // Next node.
    });

    // Last node is the last char.
    currentNode.endOfWord = true;
  }

  String? findWord(String partial) {
    var word = StringBuffer();
    var currentNode = head;

    partial.runes.forEach((rune) {
      var childNode = currentNode.children[rune];

      if(childNode == null) {
        return null; // Not found.
      }

      word.writeCharCode(rune);
      currentNode = childNode; // Next node.
    });

    // Prevent "burgerme" from matching to "burger".
    // Uncomment this if-block if want to allow it.
    if(currentNode.endOfWord && partial.length > word.length) {
      return null;
    }

    // This logic allows "bur" to match to "burger".
    while(!currentNode.endOfWord) {
      // Ambiguous: "bur" could match either "burger" or "burrito".
      if(currentNode.children.length != 1) {
        return null; // Don't know.
      }

      var onlyChild = currentNode.children.entries.first;

      word.writeCharCode(onlyChild.key);
      currentNode = onlyChild.value; // Next node.
    }

    return word.toString();
  }
}

class TrieNode {
  Map<int,TrieNode> children = {};
  bool endOfWord = false;
}
esotericpig
  • 282
  • 1
  • 3
  • 14
  • Hi, thanks for the answer. I have already created a trie out of the "standard word". It's coming from the JSON file. So why would I need to create a trie out of the query word? –  Jun 04 '21 at 20:44
  • But if you call `findWord('garlic butter')` and `findWord('garlic parmesan butter')` they will produce 2 different results, right? I was just thinking a standard, shared word so that only have to test/check one string. Either way is fine. It's up to you on what you need and how you want to design it. – esotericpig Jun 05 '21 at 09:06
  • well that's what confuses me. If you see my search class I'm building the trie using the data I'm getting from the json file. That's one thing. Since I'm using a predictive search I'll need to build a trie when the user types each letter right?. If it's not a big trouble can you write a small pseudo code on top of my provided `trie`? I've already provided the llink. I think I'm not getting something –  Jun 05 '21 at 12:15
  • I'm sorry, but I don't have the time. If you try to make some changes yourself to match mine, then I can help you fix any problems you have. Else, hopefully someone else will come along and leave another answer. – esotericpig Jun 07 '21 at 07:55