I am working on a topic modelling task and have the unknown topics in the following form:
topic = '0.2*"firstword" + 0.2*"secondword" + 0.2*"thirdword" + 0.2*"fourthword" + 0.2*"fifthword"'
I want regex.findall() to return a list containing only the words, e.g.:
['firstword', 'secondword', 'thirdword', 'fourthword', 'fifthword']
I have tried the following calls:
regex.findall(r'\w+', topic) and
regex.findall(r'\D\w+', topic)
but neither of them eliminates the numbers properly. Can someone help me find out what I am doing wrong?
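
For reference, here is a minimal sketch that reproduces the problem and shows one pattern that might do the job. It assumes the topic is a plain string (as a topic model would print it) and uses the standard-library re module in place of regex; the names attempt_1, attempt_2 and words are only illustrative.

```python
import re

# Assumption: the topic is a plain string, not an arithmetic expression.
topic = ('0.2*"firstword" + 0.2*"secondword" + 0.2*"thirdword" '
         '+ 0.2*"fourthword" + 0.2*"fifthword"')

# \w matches digits as well as letters, so the digits of each weight
# are returned alongside the words.
attempt_1 = re.findall(r'\w+', topic)

# \D consumes a single non-digit character (a dot, quote, or space),
# and the following \w+ is still free to match digits.
attempt_2 = re.findall(r'\D\w+', topic)

# One possible approach: capture only the text between double quotes.
words = re.findall(r'"([^"]+)"', topic)
print(words)
```

The capture group makes findall return only the quoted text, so the weights and operators are skipped. If the words were not quoted, a letters-only pattern such as r'[A-Za-z]+' might be an alternative.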