I have Gensim bigram output that I need to append to a list. The actual output is:
output = ['stai', 'fantast', 'intern', 'hotel', 'la', 'vega', 'hotel', 'collect', 'la', 'vega', 'la', 'vega', 'asp',
['stai', 'fantast', 'intern_hotel', 'la_vega', 'hotel', 'collect', 'la_vega', 'la_vega', 'asp']]
I am trying to pull the bigrams out of the last element so that my output looks like this:
##output should look like output = ['stai', 'fantast', 'intern', 'hotel', 'la',
##'vega', 'hotel', 'collect', 'la', 'vega', 'la', 'vega', 'asp', 'intern_hotel',
##'la_vega', 'la_vega', 'la_vega']
Filtering on the underscore seems to be giving me a hard time:
substring = "_"
for item in output:
    if substring not in item:
        output.remove(item)
output
##returns ['fantast', 'intern_hotel', 'la_vega', 'collect', 'la_vega', 'la_vega'] instead of only the bigrams
What I am trying to do is return only the bigrams that contain an underscore. Why are 'fantast' and 'collect' still in the list?
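
A minimal sketch of one likely explanation and a workaround, assuming the loop was actually run on the nested list (the last element of output). Calling remove() on a list while iterating over it shifts the remaining items one position to the left, so the item right after each removed one is never checked; that would explain why 'fantast' and 'collect' survive. Building a new list instead of mutating the one being iterated avoids the problem. The names tokens, bigram_tokens, bigrams and result are illustrative only.

```python
# Data copied from the question; the nested bigram list is the last element.
output = ['stai', 'fantast', 'intern', 'hotel', 'la', 'vega', 'hotel', 'collect',
          'la', 'vega', 'la', 'vega', 'asp',
          ['stai', 'fantast', 'intern_hotel', 'la_vega', 'hotel', 'collect',
           'la_vega', 'la_vega', 'asp']]

substring = "_"

# Assumption: everything before the last element is a plain token,
# and the last element is the list that contains the bigrams.
tokens, bigram_tokens = output[:-1], output[-1]

# Build a new list rather than removing items during iteration,
# so no element is skipped.
bigrams = [tok for tok in bigram_tokens if substring in tok]

# Append the bigrams to the flat token list.
result = tokens + bigrams
print(result)
```

If the list has to be modified in place, iterating over a copy (for item in bigram_tokens[:]) while removing from the original should also work, since the copy is not affected by the removals.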