string with no space breaking using trie

Question

I still struggle to understand all the steps in an algorithm way on how I should do to have all possible partitions of an input string with no space based on a dictionary.

I read that we must use a TRIE structure use some Dynamic Programming with Back tracking ( Depth-first search ) but I don't understand conceptually how it works.

First what I will like to have:

input 
string = `theology`

    dictionnary 
    dict = [the , theology]

output i am expecting

[the, o, l, o, g, y ] ,[theology]

On my "bad understanding here is the step i see" ( I use R language but i more need to undestand the logic than have a specific language answer)

0 - Transform my dictionnary into a trie 

```
library("triebeard")
trie <- trie(keys = c("the", "theology","toto","what"),
             values = c("the", "theology",
             "toto","what"))
```

1 - calculate the number of character on the string 
    - On this case I have 8.

2 - Take character by character n+1 and look inside the trie

2.1 I take the first character is t so i look  at t and do the longest match 
    It return NA so i go to the next character
    ```
    longest_match(trie, "t")
    ```
    2.2 Next char `"th"` returns NA 

2.3 Next char "the" returns `"the"` so i add it the a list 

3a - Now i don't know what to do should I continue to the 4 position take one character
look at the longest match see NA and so on so forth 

3b - Or should go and look at  `"theo"`

I guess there is something to do with the prefix match instead of the longest match.

If you can guide me on how to get the expected output using trie

[Update]

To help better understand my problem here is a possible solution in python using dictionary that i found / modified thanks to the answer of interjay on this : https://stackoverflow.com/a/2174156/5348895

wordList = ['the','theology'] words = set( s.lower() for s in wordList ) def splitString(s): found = [] def rec(stringLeft, wordsSoFar): if not stringLeft: found.append(wordsSoFar) for pos in xrange(1, len(stringLeft)+1): if stringLeft[:pos] in words: rec(stringLeft[pos:], wordsSoFar + [stringLeft[:pos]]) elif (pos==len(stringLeft)) and stringLeft[:pos] not in wordList and len(stringLeft)<len(s): rec(stringLeft[1:], wordsSoFar + [stringLeft[:1]]) rec(s.lower(), []) return found output = splitString('theology') for x in output: print ','.join(x)

Thanks

R and Python at the same time? is that really appropriate? I don't really know that much about Python, but there doesn't seem to be that much overlap with R. There are definitely similarities. I am not a programmer though, so there might be programming concepts that make talking about them simultaneously worthwhile, but the implementation could be different. Also, is the "i" in your expected output from "the, theology" (no "i" in there) a typo or an actual expected output? — shea, Jun 26 '17 at 13:47
yes thanks it's a typo error for the i , let me remove it. I put R because i am from a R background but i guess python is much more suited for string manipulation. I am more looking at the steps to solve it more than the resolution itself. In any case let me remove R to don't get people confuse — patrick, Jun 26 '17 at 14:35
@patrick can you add a link to your question so i can understand it clearly. Like i think that [t, h, e, o, l, o, g, y] should also be there in answer and like if you have "olog" in dictionary than is [t, h, e, olog, y] allowed or we can only use dictionary words at start. — monster, Jun 27 '17 at 14:53
The answer should contain first what is on the dictionary , since ['the'] is on the dictionnary [t,h,e,...] cannot be one the solution. I have edited my main post with a python answer of input output using a dictionary instead of a trie, so maybe my question will be a little more clear. — patrick, Jun 28 '17 at 00:43

string with no space breaking using trie

0 Answers0