I have a short function that checks whether a word is a real word by looking it up in the WordNet corpus from the Natural Language Toolkit (NLTK). I call this function from a thread that validates .txt files. When I run my code, the first call to the function throws an AttributeError with the message
"'WordNetCorpusReader' object has no attribute '_LazyCorpusLoader__args'"
When I pause execution, the same line of code does not throw an error, so I assume the corpus is not yet loaded on my first call, and that this is what causes the error.
I have tried calling wn.ensure_loaded() (that is, nltk.corpus.wordnet.ensure_loaded()) to force the corpus to load, but I still get the same error.
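To make my suspicion concrete, here is a toy sketch of how I imagine a lazy loader could produce this kind of error. It is not NLTK's actual code: `Reader`, `LazyReader`, and `simulate_interleaving` are hypothetical names, and I am assuming the lazy object replaces its own `__dict__` and `__class__` once the real reader is built. The interleaving of two threads is simulated by hand so the result is repeatable.

```python
class Reader(object):
    """Hypothetical stand-in for a fully loaded corpus reader."""
    def __init__(self, words):
        self.words = set(words)

    def lemmas(self, word):
        return [word] if word in self.words else []


class LazyReader(object):
    """Toy lazy proxy that turns itself into a Reader on first use."""
    def __init__(self, *args):
        self._args = args

    def _load(self):
        # Build the real object from the stored constructor arguments.
        reader = Reader(*self._args)
        # Assumption: the proxy then "becomes" the real object, which
        # throws away the proxy's own attributes (including _args).
        self.__dict__ = reader.__dict__
        self.__class__ = reader.__class__

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        self._load()
        return getattr(self, name)


def simulate_interleaving():
    lazy = LazyReader(["dog"])
    # "Thread B" has decided to load but has not run the load yet.
    pending_load = lazy._load
    # "Thread A" triggers and completes the load.
    result = lazy.lemmas("dog")
    # "Thread B" resumes, but the proxy's attributes are already gone.
    try:
        pending_load()
        error = None
    except AttributeError as e:
        error = str(e)
    return result, error
```

If this model is anywhere near right, calling ensure_loaded() inside each thread would not help, because every thread that arrives before the first load finishes still goes through the same unguarded loading step.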
Here's my function:
from nltk.corpus import wordnet as wn
from nltk.corpus import stopwords
from nltk.corpus.reader.wordnet import WordNetError
import sys

# Load the stop word list once at import time
cachedStopWords = stopwords.words("english")

def is_good_word(word):
    word = word.strip()
    # Reject very short words and stop words
    if len(word) <= 2:
        return 0
    if word in cachedStopWords:
        return 0
    try:
        wn.ensure_loaded()
        # Reject words that WordNet has no lemmas for
        if len(wn.lemmas(str(word), lang='en')) == 0:
            return 0
    except WordNetError as e:
        print("WordNetError on concept {}".format(word))
    except AttributeError as e:
        print("Attribute error on concept {}: {}".format(word, e))
    except:
        print("Unexpected error on concept {}: {}".format(word, sys.exc_info()[0]))
    else:
        return 1
    # Reached only after an exception was caught and reported
    return 1

print(is_good_word('dog'))  # Does NOT throw error
When I call the function from a print statement at global scope in the same file, it does not throw the error. When I call it from my thread, it does. The minimal example further down reproduces the error. I've tested it, and on my machine it gives this output:
Attribute error on concept dog: 'WordNetCorpusReader' object has no attribute '_LazyCorpusLoader__args'
Attribute error on concept dog: 'WordNetCorpusReader' object has no attribute '_LazyCorpusLoader__args'
Attribute error on concept dog: 'WordNetCorpusReader' object has no attribute '_LazyCorpusLoader__args'
Attribute error on concept dog: 'WordNetCorpusReader' object has no attribute '_LazyCorpusLoader__args'
Attribute error on concept dog: 'WordNetCorpusReader' object has no attribute '_LazyCorpusLoader__args'
Attribute error on concept dog: 'WordNetCorpusReader' object has no attribute '_LazyCorpusLoader__args'
Attribute error on concept dog: 'WordNetCorpusReader' object has no attribute '_LazyCorpusLoader__args'
Attribute error on concept dog: 'WordNetCorpusReader' object has no attribute '_LazyCorpusLoader__args'
Attribute error on concept dog: 'WordNetCorpusReader' object has no attribute '_LazyCorpusLoader__args'
Minimal example:
import time
import threading
from filter_tag import is_good_word

class ProcessMetaThread(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)

    def run(self):
        is_good_word('dog')  # Throws error

def process_meta(numberOfThreads):
    threadsList = []
    for i in range(numberOfThreads):
        t = ProcessMetaThread()
        t.setDaemon(True)
        t.start()
        threadsList.append(t)

    numComplete = 0
    while numComplete < numberOfThreads:
        # Iterate over the active processes
        for processNum in range(0, numberOfThreads):
            # If a process actually exists
            if threadsList != None:
                # If the process is finished
                if not threadsList[processNum] == None:
                    if not threadsList[processNum].is_alive():
                        numComplete += 1
                        threadsList[processNum] = None
        time.sleep(5)
    print('Processes Finished')

if __name__ == '__main__':
    process_meta(10)
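One variation I could try, assuming the problem really is several threads racing to load the corpus, is to load the shared resource once in the main thread before any worker starts, and to wait for the workers with join() instead of polling. The sketch below uses only the standard library. `SharedResource` and `run_workers` are hypothetical stand-ins, not NLTK APIs; with NLTK the analogous step would presumably be calling wn.ensure_loaded() before starting the threads, which I have not verified.

```python
import threading


class SharedResource(object):
    """Hypothetical stand-in for a lazily loaded corpus."""
    def __init__(self):
        self.loaded = False
        self.load_count = 0

    def ensure_loaded(self):
        # Load only if a previous call has not already done so.
        if not self.loaded:
            self.load_count += 1
            self.loaded = True

    def lookup(self, word):
        if not self.loaded:
            raise RuntimeError("resource used before it was loaded")
        # Placeholder check standing in for a real corpus lookup.
        return len(word) > 2


def run_workers(resource, words, number_of_threads):
    # Load once in the calling thread, before any worker exists.
    resource.ensure_loaded()

    results = []
    results_lock = threading.Lock()

    def worker():
        for word in words:
            ok = resource.lookup(word)
            # Guard the shared list while appending.
            with results_lock:
                results.append((word, ok))

    threads = [threading.Thread(target=worker)
               for _ in range(number_of_threads)]
    for t in threads:
        t.daemon = True
        t.start()
    # join() blocks until each thread finishes, so no polling loop is needed.
    for t in threads:
        t.join()
    return results
```

Here the workers only ever see a resource that is already loaded, so none of them can trigger the loading step at all.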