I am trying to split a sentence or phrase into words using Regex.
var phrase = "This isn't a test.";
var words = Regex.Split(phrase, @"\W+").ToList();
words contains "This", "isn", "t", "a", "test" (plus a trailing empty string, because the final period is also matched as a separator).
Obviously it is treating the apostrophe as a non-word character and splitting on it. Can I change this behavior? The solution also needs to support a variety of languages (Spanish, French, Russian, Korean, etc.).
I need to pass the words into a spellchecker, specifically NHunspell, and return the ones it flags as misspelled:
return (from word in words let correct = _engine[langId].Spell(word) where !correct select word).ToList();
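One direction I have been considering, as a minimal sketch rather than a definitive solution, is to match the words directly instead of splitting on `\W+`. The pattern below uses the Unicode categories `\p{L}` (letters), `\p{M}` (combining marks) and `\p{Nd}` (decimal digits), and allows an apostrophe (`'` or `’`) only between two such runs. `WordSplitter` and `SplitWords` are hypothetical names I made up for illustration, and treating the apostrophe as word-internal is my own assumption; it may not suit every language.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of NHunspell or the .NET framework.
public static class WordSplitter
{
    // One or more letters/marks/digits, optionally followed by
    // apostrophe-joined runs of the same (e.g. "isn't", "l'eau").
    // \u2019 is the typographic apostrophe; including it is an assumption.
    private static readonly Regex WordPattern =
        new Regex(@"[\p{L}\p{M}\p{Nd}]+(?:['\u2019][\p{L}\p{M}\p{Nd}]+)*");

    public static List<string> SplitWords(string phrase)
    {
        // Matches() returns only the words, so no empty entries are
        // produced by leading or trailing punctuation.
        return WordPattern.Matches(phrase)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToList();
    }
}
```

With this, `WordSplitter.SplitWords("This isn't a test.")` should give "This", "isn't", "a", "test". Leading or trailing apostrophes (as in quoted text) are dropped because the apostrophe is only accepted between two letter runs.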