Please look at this JSFiddle: https://jsfiddle.net/nu69kxyq/
This JS takes an input text file with one word per line. The function finds the longest and second-longest COMPOUND words (words made up only of other words in the file) and the total number of such compound words.
This is the exact input I'm using:
cat
cats
catsdogcats
dog
dogcatsdog
hippopotamuses
rat
ratcatdogcat
The output should be:
ratcatdogcat <-- longest compound word (12 characters)
catsdogcats <-- second longest compound word (11 characters)
3 <-- total number of compound words in a file
The total number of compound words is 3 because the compound words are catsdogcats, dogcatsdog, and ratcatdogcat.
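To make the expected result concrete, here is a minimal sketch of the compound-word definition applied to the sample input. It is separate from the code in the fiddle; isCompound and canBuild are hypothetical names, and the sketch assumes a compound word must be built from at least two other words in the list.

```javascript
// Hypothetical helper: true if `word` can be built from two or more dictionary words.
function isCompound(word, dict) {
  // canBuild[i] is true when word.slice(0, i) splits entirely into dictionary words
  var canBuild = [true];
  for (var end = 1; end <= word.length; end++) {
    canBuild[end] = false;
    for (var start = 0; start < end; start++) {
      // Skip the split that would use the whole word as a single piece
      if (start === 0 && end === word.length) continue;
      if (canBuild[start] && dict[word.slice(start, end)] === true) {
        canBuild[end] = true;
        break;
      }
    }
  }
  return canBuild[word.length];
}

var words = ["cat", "cats", "catsdogcats", "dog", "dogcatsdog",
             "hippopotamuses", "rat", "ratcatdogcat"];
var dict = {};
words.forEach(function (w) { dict[w] = true; });

var compounds = words
  .filter(function (w) { return isCompound(w, dict); })
  .sort(function (a, b) { return b.length - a.length; }); // longest first

console.log(compounds[0]);     // ratcatdogcat
console.log(compounds[1]);     // catsdogcats
console.log(compounds.length); // 3
```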
I first take in all the words and sort them into wordsList. Then I create a hash table of the words for reference (to check for compound words) in wordDict:
var wordsList = list.sort(function(a, b) {
    return a.length - b.length; // sort words by length ascending
});
var wordDict = list.reduce(function(words, value, index) { // hash table from text file data
    words[value] = true;
    return words;
}, {});
var isConcat = function(word) {
    for (var i = word.length; i > -1; i--) {
        var prefix = word.slice(0, i);
        var suffix = word.slice(i, word.length);
        if (wordDict[prefix] === true) { // ????? THIS IS ALWAYS FALSE EVEN WHEN THE KEY'S VALUE IS TRUE!!!
            if (suffix.length === 0) {
                return true; // all suffixes are prefixes; word is concatenated
            }
            return isConcat(suffix); // continue breaking the remaining suffix into prefixes if possible
        }
    }
    return false;
};
The problem is in isConcat: when I check wordDict[prefix] to see whether the key's value is true, the check always fails, even when it should pass. I stepped through the code, and even when the value stored under the 'word' key is true, the body of the if (wordDict[prefix] === true) statement does not execute because the condition evaluates to false. Using data.split("\n"); I'm able to read the text file and put it into an array, with no issues there. So what is going wrong?
Note: I've tried using var list = data.match(/\w+/g); instead of var list = data.split("\n"); to match runs of alphanumeric characters as words instead of splitting on newlines, and the function worked (wordDict[prefix] behaved as expected). BUT this regex skips some words when I pass in a text file of over 150,000 words, so I think I need to use data.split("\n"). What is going wrong here?
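One possible explanation for the difference between the two approaches, assuming the text file uses Windows-style CRLF ("\r\n") line endings, is that splitting on "\n" alone leaves an invisible "\r" at the end of each word, so the keys stored in the hash table never equal the sliced prefixes. The sketch below is only a diagnostic under that assumption; the sample data string and the toDict helper are hypothetical.

```javascript
// Hypothetical sample data, ASSUMING the file was saved with CRLF line endings.
var data = "cat\r\ncats\r\ndog\r\n";

// Build the lookup table the same way wordDict is built (hypothetical helper).
function toDict(list) {
  return list.reduce(function (words, value) {
    words[value] = true;
    return words;
  }, {});
}

// Splitting on "\n" alone keeps the trailing "\r" on each word.
var bySplit = data.split("\n");
var dictFromSplit = toDict(bySplit);
console.log(JSON.stringify(bySplit[0]));    // "cat\r" (JSON.stringify reveals the hidden character)
console.log(dictFromSplit["cat"] === true); // false, because the stored key is "cat\r"

// Splitting on an optional "\r" before "\n", then dropping empty lines.
var cleaned = data.split(/\r?\n/).filter(function (w) {
  return w.length > 0;
});
var dictFromCleaned = toDict(cleaned);
console.log(dictFromCleaned["cat"] === true); // true
```

If the first console.log shows a trailing \r for the real file, splitting with /\r?\n/ (or trimming each line) keeps one word per line without relying on \w+, which breaks up any word containing characters outside letters, digits, and underscore.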