If I use the PHP function str_word_count() on Russian text, it returns an incorrect result. A workaround is to use something like:
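function my_word_count($str) {
    // Split on runs of anything that is not a Unicode letter, digit, or apostrophe.
    // PREG_SPLIT_NO_EMPTY drops the empty pieces left by leading/trailing separators.
    return count(preg_split('~[^\p{L}\p{N}\']+~u', $str, -1, PREG_SPLIT_NO_EMPTY));
}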
However, this function may not work with text in other languages, which complicates the task considerably: you would probably have to write a separate str_word_count() for every language out there. So, given that the input may be ASCII or UTF-8, does a generic multi-language function exist that counts words in any language?
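
To make the limitation concrete, here is a minimal sketch of a slightly broader variant. It counts matches instead of splitting, and it also accepts combining marks (\p{M}), so scripts such as Devanagari, where vowel signs are marks rather than letters, are not cut in the middle of a word. The function name count_words_unicode is hypothetical, and adding \p{M} is my own assumption about what "word" should mean here. It still relies on separators between words, so it will not segment text written without spaces (Chinese, Japanese, Thai); that would presumably need a dictionary-based approach such as the intl extension's IntlBreakIterator.

```php
<?php
// Hypothetical helper (not a PHP built-in): counts maximal runs of
// letters, combining marks, digits, and apostrophes in a UTF-8 string.
function count_words_unicode(string $str): int
{
    // \p{L} = letters, \p{M} = combining marks, \p{N} = numbers.
    // The "u" modifier makes PCRE treat pattern and subject as UTF-8.
    $n = preg_match_all('~[\p{L}\p{M}\p{N}\']+~u', $str);

    // preg_match_all() returns false on failure (e.g. invalid UTF-8 input);
    // this sketch treats that case as zero words.
    return $n === false ? 0 : $n;
}

echo count_words_unicode("Привет, мир!"), "\n"; // prints 2
```

Plain ASCII input goes through the same path, since ASCII is a subset of UTF-8. Note that a stray apostrophe surrounded by spaces would still be counted as a word, just as in the split-based version.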