Concatenate words separated by a token in a list

Question

I have a list of words in which some words are split into two parts separated by an asterisk. I want to join those two parts back into one word. The code I am trying is:

import nltk
from nltk.tokenize import word_tokenize
import re

words = ['les','engage', '*', 'ment', 'de','la']

with open ('Fr-dictionary.txt') as fr:
    dic = word_tokenize(fr.read().lower())

l=[ ]
errors=[ ]

for n,word in enumerate (words):
    l.append(word)
    if word == "*":
        print(words[n-1], words[n+1])
        exp = words[n-1] + words[n+1]
        if exp in dic:  
            l.append(exp)
            errors.append(words[n-1])
            errors.append("*")
            errors.append(words[n+1])
        else:
            continue

print(l)
print(errors)


l=frozenset(l)
errors=frozenset(errors)

c=l.difference(errors)

print(list(c))

My output is:

['la', 'les', 'de', 'engagement']

But my desired output has to keep the order of the original list:

['les','engagement', 'de','la']

Is there any other way to get the desired output?

Tried the [join()](https://www.geeksforgeeks.org/join-function-python/) function? It can take delimiters. You can do without all the loops. — Irfanuddin, Mar 16 '19 at 18:12
Sets are not ordered, so you can't expect any specific order by using them. — Thierry Lathuille, Mar 16 '19 at 18:13
@IrfanuddinShafi, After appending the concatenated words, I want to remove the first part of the word, the asterisk and the second part of the word: "engage", "*", "ment" — Nadia Santos, Mar 16 '19 at 18:18
@IrfanuddinShafi, how to convert sets to collections? Can you show? — Nadia Santos, Mar 16 '19 at 18:20
Read the answer given [here](https://stackoverflow.com/questions/9792664/converting-a-list-to-a-set-changes-element-order) — Irfanuddin, Mar 16 '19 at 18:22

score 0 · Answer 1 · answered Mar 16 '19 at 19:12

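Try this. It edits the list in place: as long as the list still contains an asterisk, it removes the asterisk and replaces the two words around it with their concatenation, so the original order is kept. Note that it does not check the result against your dictionary.

    while "*" in words:
        index = words.index("*")
        words.pop(index)  # drop the asterisk itself
        # both neighbours are popped from index - 1, so the joined word
        # must be inserted at index - 1 to land where the first part was
        words.insert(index - 1, words.pop(index - 1) + words.pop(index - 1))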
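If you would rather build a new list and keep the dictionary check from the question, a single pass along these lines may work. This is only a sketch: `merge_on_token` and its `vocabulary` parameter are names made up for this example, and `vocabulary` stands in for whatever you load from `Fr-dictionary.txt` (a `set` is assumed so that lookups are fast).

```python
def merge_on_token(words, token="*", vocabulary=None):
    """Return a new list in which `left token right` becomes `left + right`.

    `merge_on_token` and `vocabulary` are hypothetical names for this sketch.
    If `vocabulary` is given, a pair is merged only when the joined word is in it.
    """
    result = []
    i = 0
    while i < len(words):
        word = words[i]
        # A token can only be merged when it has a neighbour on both sides.
        if word == token and result and i + 1 < len(words):
            joined = result[-1] + words[i + 1]
            if vocabulary is None or joined in vocabulary:
                result[-1] = joined  # replace the left part with the merged word
                i += 2               # skip the token and the right part
                continue
        result.append(word)
        i += 1
    return result


words = ['les', 'engage', '*', 'ment', 'de', 'la']
print(merge_on_token(words))                              # no dictionary check
print(merge_on_token(words, vocabulary={'engagement'}))   # with a tiny stand-in dictionary
```

Because the result is a list and no sets are involved, the words come out in their original order. When the joined word is not in `vocabulary`, the asterisk and both parts are left untouched.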

elisha robinson
