I'm trying to split a sentence into words, but this seems to be very hard to do in JavaScript. I can't simply split on whitespace, because some languages (Thai, Chinese, Japanese, etc.) don't use whitespace to separate words. A dictionary-based algorithm therefore seems like the way to go. However, the dictionaries are large, and I need to do the segmentation on the client.
Java has a BreakIterator class that lets you iterate over the words in a sentence. That's exactly what I need, but JavaScript has no equivalent. Chrome has Intl.v8BreakIterator, but I'm looking for a solution that works in all major browsers.
There is a proposal, Intl.Segmenter, that would solve the issue. It's basically BreakIterator for JavaScript, but it hasn't been released yet.
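To make the goal concrete, here is a rough sketch of the behaviour I'm after. It assumes Intl.Segmenter ends up available with the shape described in the proposal (a `granularity: "word"` option and an `isWordLike` flag on each segment). The helper name `segmentWords` is my own invention, and the whitespace fallback is only a placeholder that is known to be wrong for Thai, Chinese, and Japanese.

```javascript
// Hypothetical helper: `segmentWords` is an illustrative name, not part of any API.
function segmentWords(text, locale) {
  // Assumption: Intl.Segmenter exists and behaves as described in the proposal.
  if (typeof Intl !== "undefined" && typeof Intl.Segmenter === "function") {
    const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
    const words = [];
    for (const part of segmenter.segment(text)) {
      // Skip whitespace and punctuation segments; keep only word-like ones.
      if (part.isWordLike) {
        words.push(part.segment);
      }
    }
    return words;
  }

  // Placeholder fallback: splits on whitespace only, so it does NOT work for
  // languages that don't separate words with spaces.
  return text.split(/\s+/).filter(Boolean);
}
```

What I'm missing is something to put in that fallback branch that works across browsers without shipping a large dictionary.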
If there is a way, can you please point me in the right direction?