I have a numpy array, matrixValue, and three lists containing the following:
matrixValue: (type: ndarray) number of occurrences of a word in wordList in descending order
[.. 62 62 ..]
[.. 23 21 ..]
[.. 14 13 ..]
valueList: (type: list) number of occurrences of a word in wordList in descending order
[... 74, 71, 63, 62, 62, 50, 40, 23, 21, 14, 13, 11, 11...]
userGivenWord: (type: list) user-specified words
[... water, animal, flower...]
wordList: (type: list) contains a list of English dictionary words
[.. water, ocean, lake, green, blue, sea...]
Given a user-specified word, I need to retrieve the words from wordList that have some "semantic similarity" to it. My problem is that for any repeated occurrence counts in valueList (e.g. 62, 11), only the first English word, 'lake', is printed and not 'sea' (assume that lake and sea occur 62 times each).
Output
water = lake, lake # wrong output
water = lake, sea # correct output
Here is the part of the code that I'm very sure is causing the problem:
for i in range(0, 3):  # printing top 3 words
    value = matrixValue[i]  # returns the first two numbers, 62 & 62
    iValue = valueList.index(value)  # returns the indexes in valueList for the above value
    tagword = str(tag_list[iValue])  # retrieves the word based on the iValue
    res = userTags[x] + " = " + tagword
Again, both lake and sea occur 62 times. I believe the error happens in the second line of the loop body (the valueList.index call). When I stepped through it in the debugger, I noticed that the word 'lake' is added to the result list twice, and 'sea' never is. I'm not sure I've explained this clearly, so please feel free to ask if anything needs clarification.
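To make the symptom concrete, here is a minimal sketch with made-up data. The sample values and the helper name top_words are hypothetical, and I'm assuming that tag_list[i] lines up with valueList[i]. It reproduces the duplicate (list.index only ever returns the position of the first match) and shows one possible way around it, which is to pick positions with numpy's argsort instead of looking each value up again:

```python
import numpy as np

# Hypothetical sample data: tag_list[i] occurs valueList[i] times.
valueList = [62, 62, 23, 21, 14]
tag_list = ["lake", "sea", "green", "blue", "ocean"]
matrixValue = np.array([62, 62, 23])  # the top 3 counts

# Reproduces the problem: list.index returns the FIRST matching position,
# so both 62s map to index 0 and 'lake' is picked twice.
buggy = [tag_list[valueList.index(v)] for v in matrixValue.tolist()]


def top_words(values, words, n):
    """Return the n words with the highest counts, keeping tied words distinct.

    Works on positions rather than values, so equal counts are not collapsed.
    """
    # Negate to sort in descending order; a stable sort keeps ties in input order.
    order = np.argsort(-np.asarray(values), kind="stable")[:n]
    return [words[i] for i in order]


fixed = top_words(valueList, tag_list, 3)

print("water = " + ", ".join(buggy[:2]))  # the wrong output from the question
print("water = " + ", ".join(fixed[:2]))  # the output I actually want
```

The idea is just to carry indexes through the loop rather than values, since a value alone cannot tell two tied words apart.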