I have a Gensim bigram output that I need to append to a list. The actual output is:

output = ['stai', 'fantast', 'intern', 'hotel', 'la', 'vega', 'hotel', 'collect', 'la', 'vega', 'la', 'vega', 'asp', 
 ['stai',  'fantast',  'intern_hotel',  'la_vega',  'hotel',  'collect',  'la_vega',  'la_vega',  'asp']]

I am trying to pull out the bigrams from the last element so that my output looks like this:

##output should look like output = ['stai', 'fantast', 'intern', 'hotel', 'la',
##'vega', 'hotel', 'collect', 'la', 'vega', 'la', 'vega', 'asp', 'intern_hotel',
##'la_vega', 'la_vega', 'la_vega']

The underscore seems to be giving me a hard time:

    substring = "_"

    for item in output:
        if substring not in item:
            output.remove(item)
    output
##returns ['fantast', 'intern_hotel', 'la_vega', 'collect', 'la_vega', 'la_vega'] instead of only the items containing an underscore

What I am trying to do is return only the bigrams with underscores. Why are 'fantast' and 'collect' still in the list?

  • Don't delete items from the list while iterating over it. Instead, filter it, for example with `[e for e in output if substring in e]` – dawg Mar 27 '22 at 23:18
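To make the comment's suggestion concrete, here is a minimal sketch using the data from the question. Calling `remove()` inside the loop shifts the remaining items one position to the left, so the loop skips whatever followed each removed item. That is most likely why 'fantast' and 'collect' survive. Building a new list avoids the problem. The names `tokens`, `bigram_tokens`, `bigrams` and `result` are made up for illustration, and the sketch assumes the nested bigram list is always the last element of `output`.

```python
# Data copied from the question: a flat token list whose last element
# is the nested list of Gensim bigram tokens.
output = ['stai', 'fantast', 'intern', 'hotel', 'la', 'vega', 'hotel',
          'collect', 'la', 'vega', 'la', 'vega', 'asp',
          ['stai', 'fantast', 'intern_hotel', 'la_vega', 'hotel',
           'collect', 'la_vega', 'la_vega', 'asp']]

substring = "_"

# Separate the flat tokens from the nested list
# (assumption: the nested list is the last element).
*tokens, bigram_tokens = output

# Build a new list rather than removing items while iterating.
bigrams = [e for e in bigram_tokens if substring in e]

# Append the bigrams to the flat tokens.
result = tokens + bigrams
print(result)
```

If the original list object has to be modified in place, slice assignment such as `output[:] = tokens + bigrams` should work as well, since the new contents are built before anything is replaced.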
