0

I made this trie

class Trie:
    def __init__(self):
        self.root = {}
        self.end = '\0'

    def add(self,text):
        node = self.root 
        for letter in text:
            if letter in node:
                node = node[letter]
            else:
                node[letter]={}
                node = node[letter]
        node[self.end] = True
        
        
    def addAll(self,lst):
        for word in lst:
            self.add(word)

And made this function to print all the words in the trie

# takes in trie.root
def printWords(root,word=""):
    if root==True:
        print(word[:-1])
        return

    for letter in root:
        printWords(root[letter],word+letter)

How can I find the big O of this function printWords.

carlosdafield
  • 1,479
  • 5
  • 16
  • fixed it I called it func before but changed it to printWords so it would make more sense here – carlosdafield Nov 27 '21 at 21:11
  • `printWords` misses following words in cases like `the` and `there`, just a head up. Other than that it's quadratic in the number of letters of the longest word (this only works for one word but let's assume you add that to the function). The reason is that you concatenate the word letter by letter which copies and creates new strings. – user1984 Nov 27 '21 at 21:23
  • I ran the code and `the` and `there` still printed out. The loop doesn't stop at the first time `printWords()` gets called, it keeps going – carlosdafield Nov 27 '21 at 21:47
  • right, I misread how the trie is built, but the TC is O(n^2), as far as I can tell. – user1984 Nov 27 '21 at 21:51
  • What is `n` in this context? Number of nodes? Total length of input? – amit Nov 27 '21 at 21:54
  • `n` is the length of the longest word. number of letters. the reason for the quadratic time is the repeated string concatenation. @amit – user1984 Nov 27 '21 at 22:26
  • Does this answer your question? [Big O, how do you calculate/approximate it?](https://stackoverflow.com/questions/3255/big-o-how-do-you-calculate-approximate-it) – Mark Rotteveel Nov 30 '21 at 13:36

2 Answers2

1

The worst time complexity is O(²) where is the number of words, and the size of the longest word.

The evaluation of word+letter has a time complexity of O(len(word)). And for a given word that occurs in the trie, the evaluation of word+letter happens O() times, each time extending the current prefix with one character, until the word is found. This gives a time complexity of O(²) per word.

As there are words in the tree, and in the worst case they don't share any prefix, this amounts to a total complexity of O(²)

trincot
  • 317,000
  • 35
  • 244
  • 286
1

Big O should be the upper bound, in other words, the worst case scenario.

If the tree stores n words with lengths m_0, m_1, ... , m_(n-1) then in the worst case, they define n disjoint paths along the tree with the aforementioned lengths (actually each path is two nodes longer than the word it represents, taking into account the root node and the end node, but that's immaterial for the computation of big O). So in the worst case you make m_0 + m_1 + ... + m_(n-1) calls to PrintWords.

However, in case the tree stores many words, the estimation above is excessive, because the number of children a node can have is bounded (at most 27 - one for each letter + the end sign). So if we denote the tree's height by h (where h = max(m_0, m_1, ... , m_(n-1)), then there can be at most min(m_0 + m_1 + ... + m_(n-1), 27×h)calls to PrintWords.

Since we are computing an asymptotic bound, we can assume that n is very large, much larger than h, hence the worst case becomes 27×h calls, which is O(h), where h is the length of the longest word.

You can go one step further and assume that the length of each word is bounded (that's the case for words in natural languages). Assuming that, you get that the asymptotic bound is O(1). That doesn't mean, of course, that the operation is very quick, it could take a while. But since, under the last assumption, the tree's size is bounded (both height and width), you perform an operation over a constant size (possibly very big) data structure, hence it does not become asymptotically more complex. Your tree becomes like a word lexicon - even if it contains all the words of a certain language, it must have a finite size, thus going through all the words can be done in a finite amount of time.

Nir H.
  • 550
  • 2
  • 9