I want to tokenize a sentence in which some adjacent words are run together without spaces, for example:
"This is a samplestring that Iwanttotokenize."
In the above example, there are two cases, "samplestring" and "Iwanttotokenize", where adjacent words are joined together. Any idea how to split these into separate tokens?
For this sentence, the ideal output would be (one token per line):
This
is
a
sample
string
that
I
want
to
tokenize
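
To make the desired behavior concrete, here is a minimal sketch of one possible approach: dictionary-based segmentation, where each whitespace-separated word is split into the fewest pieces that all appear in a known word list. The vocabulary `VOCAB` and the helper names `split_joined` and `tokenize` are hypothetical and only for illustration; they do not come from any existing library, and a real solution would need a much larger word list (or word frequencies) to pick sensible splits.

```python
import re

# Hypothetical toy vocabulary (an assumption for this example only);
# in practice this would be loaded from a large word list or corpus.
VOCAB = {"this", "is", "a", "sample", "string", "that",
         "i", "want", "to", "tokenize"}


def split_joined(word, vocab=VOCAB):
    """Split `word` into the fewest vocabulary words.

    Returns [word] unchanged if no full segmentation exists.
    """
    n = len(word)
    lower = word.lower()  # match case-insensitively, keep original casing
    # best[i] is the shortest list of pieces covering word[:i], or None.
    best = [None] * (n + 1)
    best[0] = []
    for i in range(1, n + 1):
        for j in range(i):
            if best[j] is not None and lower[j:i] in vocab:
                candidate = best[j] + [word[j:i]]
                if best[i] is None or len(candidate) < len(best[i]):
                    best[i] = candidate
    return best[n] if best[n] is not None else [word]


def tokenize(sentence, vocab=VOCAB):
    """Extract alphabetic runs, then split any run-together words."""
    tokens = []
    for word in re.findall(r"[A-Za-z]+", sentence):
        tokens.extend(split_joined(word, vocab))
    return tokens


for token in tokenize("This is a samplestring that Iwanttotokenize."):
    print(token)
```

Preferring the fewest pieces is only a simple heuristic; with a large vocabulary, ambiguous strings can have several valid splits, so some kind of scoring (for example by word frequency) would probably be needed.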