Remove specific elements from list Python

Question

I want to remove from my list every \n that comes before the element 'Ecrire'. It works only for the first case and not for the others, and I really don't understand why. Here is my code:

Corps2 = ['Debut', '\n', '\n', 'Note', ' ', '<-', ' ', 'Saisie()', ' ', '', '\n', '\n', 'Selon que\n', ' ', 'Note', ' ', 'â‰¥', ' ', '16', ' ', '', ' ', '', ' ', ':', ' ', '', '\n', '\n', 'Ecrire', ' ', "('TB')", '\n', '\n', '', ' ', 'Note', ' ', 'â‰¥', ' ', '14', ' ', '', ' ', '', ' ', ':', ' ', '', '\n', '\n', 'Ecrire', ' ', "('B')", '\n', '\n', '', ' ', 'Note', ' ', 'â‰¥', ' ', '12', ' ', '', ' ', '', ' ', ':', ' ', '', '\n', '\n', 'Ecrire', ' ', "('AB')", '\n', '\n', '', ' ', 'Note', ' ', 'â‰¥', ' ', '10', ' ', '', ' ', '', ' ', ':', ' ', '', '\n', '\n', 'Ecrire', ' ', "('Passable')", '\n', '\n', 'Sinon', ' ', ':', ' ', 'Ecrire', ' ', "('Redoublant')", '\n', '\n', 'Fin_Si']
for i in Corps2:
    if i =='Ecrire' and Corps2[Corps2.index('Ecrire')-2 :Corps2.index('Ecrire')]==['\n','\n'] :
        del Corps2[Corps2.index('Ecrire')-2 :Corps2.index('Ecrire')]

Welcome to SO! Check out the [tour]. Your code is hard to understand, since there's a lot of irrelevant data and since `Corps2.index('Ecrire')` is repeated so much. Please provide a [mre], which will also mean providing your expected output and actual output. — wjandrea, Jun 14 '20 at 16:35
Also you say you want to remove `'\n'`, but it looks like your code will remove more than one item — wjandrea, Jun 14 '20 at 16:37
Hint: `Corps2.index('Ecrire')` will only return the first occurrence. Try `for i, word in enumerate(Corps2): ...`. — Han-Kwang Nienhuys, Jun 14 '20 at 16:39

Mark Tolonen · Answer 1 · 2020-06-14T16:57:26.290

Two problems: modifying a list while iterating over it, and .index only finds the first item.
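Both problems can be reproduced on a much smaller list. The sketch below is only an illustration; `tokens`, `items` and `visited` are made-up names and not part of the original code.

```python
# Small hypothetical list standing in for Corps2.
tokens = ['a', '\n', 'Ecrire', 'b', '\n', 'Ecrire']

# Problem 1: list.index always reports the first match only.
first = tokens.index('Ecrire')                                   # 2, never 5
positions = [i for i, v in enumerate(tokens) if v == 'Ecrire']   # [2, 5]

# Problem 2: deleting during a for-loop shifts later items left,
# so the loop skips the element that slides into the freed slot.
items = [1, 2, 2, 3]
visited = []
for x in items:
    visited.append(x)
    if x == 2:
        items.remove(x)

print(first, positions)
print(visited, items)  # the second 2 is never visited and is left in the list
```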

The code below finds all the locations to delete, then deletes them in reverse order so the indices don't point to the wrong element, which is what happens if you delete in the forward direction:

Corps2 = ['Debut', '\n', '\n', 'Note', ' ', '<-', ' ', 'Saisie()', ' ', '', '\n', '\n', 'Selon que\n', ' ', 'Note', ' ', 'â‰¥', ' ', '16', ' ', '', ' ', '', ' ', ':', ' ', '', '\n', '\n', 'Ecrire', ' ', "('TB')", '\n', '\n', '', ' ', 'Note', ' ', 'â‰¥', ' ', '14', ' ', '', ' ', '', ' ', ':', ' ', '', '\n', '\n', 'Ecrire', ' ', "('B')", '\n', '\n', '', ' ', 'Note', ' ', 'â‰¥', ' ', '12', ' ', '', ' ', '', ' ', ':', ' ', '', '\n', '\n', 'Ecrire', ' ', "('AB')", '\n', '\n', '', ' ', 'Note', ' ', 'â‰¥', ' ', '10', ' ', '', ' ', '', ' ', ':', ' ', '', '\n', '\n', 'Ecrire', ' ', "('Passable')", '\n', '\n', 'Sinon', ' ', ':', ' ', 'Ecrire', ' ', "('Redoublant')", '\n', '\n', 'Fin_Si']
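# Record the index of each '\n' that sits directly before an 'Ecrire'.
to_delete = [i - 1 for i, v in enumerate(Corps2) if v == 'Ecrire' and i > 0 and Corps2[i - 1] == '\n']
# Delete from the end so the earlier indices stay valid.
for i in reversed(to_delete):
    del Corps2[i]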

Note if you process the string before tokenizing it, you could just do a .replace('\nEcrire','Ecrire') first.
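As a rough sketch of that idea: the raw text behind Corps2 is not shown in the question, so `source` below is an invented stand-in. The loop is an addition to handle several newlines in a row, since a single .replace call removes only one newline per occurrence.

```python
# Hypothetical raw text; the real input behind Corps2 is not shown in the question.
source = "Note >= 16 :\n\nEcrire ('TB')\nFin_Si"

# Drop every newline that directly precedes 'Ecrire', repeating until none remain.
cleaned = source
while '\nEcrire' in cleaned:
    cleaned = cleaned.replace('\nEcrire', 'Ecrire')

print(repr(cleaned))
```

Newlines elsewhere in the text, such as the one before Fin_Si, are left alone.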

FYI, the element 'â‰¥' indicates the string was decoded incorrectly:

>>> 'â‰¥'.encode('cp1252').decode('utf8')
'≥'
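If the data cannot be re-read with the right encoding, the same round trip could be applied to each element. This is only a sketch: `fix_mojibake` is a hypothetical helper, and it assumes the text was UTF-8 wrongly decoded as cp1252, as the example above suggests.

```python
def fix_mojibake(text):
    """Undo UTF-8 text that was wrongly decoded as cp1252 (hypothetical helper)."""
    try:
        return text.encode('cp1252').decode('utf8')
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in cp1252, or not valid UTF-8 afterwards: keep as is.
        return text

sample = ['Note', ' ', 'â‰¥', ' ', '16', '\n']
repaired = [fix_mojibake(s) for s in sample]
print(repaired)
```

Plain ASCII elements pass through unchanged, so the helper is safe to run over the whole list. Fixing the decoding where the file is first read is still the cleaner option.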

tripleee · Accepted Answer · 2020-06-14T16:59:20.317

The index call will always return the first instance of the string. This is one of those situations where you really want to loop over the indices of the list rather than directly loop over its elements.

Notice also that you generally can't del elements from the list you are currently traversing; but when you loop over an indirect index you can, as long as you terminate on any IndexError.

for idx in range(len(Corps2)-1):
    try:
        if Corps2[idx] == '\n' and Corps2[idx+1] == 'Ecrire':
            del Corps2[idx]
    except IndexError:
        break

Demo: https://ideone.com/LhEvUB

You should understand how the IndexError could happen: you are shortening the list with each deleted element, so the precomputed ending index will overshoot the list's end by that many items. Also, by lucky coincidence, we already know that the element which replaces the '\n' will never also be '\n' (because it will be 'Ecrire'), so we can avoid the complications that would be required if this were not the case.
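When that coincidence cannot be relied on, one way to sidestep both the IndexError and the skipped element is to build a new list instead of deleting in place. This is a different technique from the loop above, sketched with a hypothetical function name, and it removes the whole run of '\n' before each 'Ecrire' rather than a single one.

```python
def drop_newlines_before(tokens, target='Ecrire'):
    """Return a copy of tokens without the '\\n' run directly before each target."""
    result = []
    for tok in tokens:
        if tok == target:
            # Pop every newline sitting at the end of what has been kept so far.
            while result and result[-1] == '\n':
                result.pop()
        result.append(tok)
    return result

sample = [':', '\n', '\n', 'Ecrire', "('TB')", '\n', '\n', 'Sinon', ':', 'Ecrire']
cleaned = drop_newlines_before(sample)
print(cleaned)
```

Because the input list is never mutated, no index can go stale and no try/except is needed.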

Tangentially, you should conventionally not capitalize the names of regular variables in Python; capitalized names are usually class names.

deleting elements from a list while iterating over it doesn't work so well either. — Mark Tolonen, Jun 14 '20 at 16:44
@MarkTolonen In the general case no, but if you monotonously proceed through the list and avoid index errors and understand that you will skip over an element when you delete, it's possible. Here skipping the "Ecrire" element could even be regarded as a feature. — tripleee, Jun 14 '20 at 16:48

score 0 · Answer 3 · answered Jun 14 '20 at 17:03

This single line will do the thing you need:

Corps2='|'.join(Corps2).replace('|\n|\n|Ecrire','|Ecrire').split('|')

Sandeep Kothari

score 0 · Answer 4 · answered Jun 14 '20 at 18:12

Check this out too. The indices shift whenever '\n' elements are deleted from the list, so this version restarts the scan after each deletion.

Corps2 = ['Debut', '\n', '\n', 'Note', ' ', '<-', ' ', 'Saisie()', ' ', '', '\n', '\n', 'Selon que\n', ' ', 'Note', ' ',
          'â‰¥', ' ', '16', ' ', '', ' ', '', ' ', ':', ' ', '', '\n', '\n', 'Ecrire', ' ', "('TB')", '\n', '\n', '',
          ' ', 'Note', ' ', 'â‰¥', ' ', '14', ' ', '', ' ', '', ' ', ':', ' ', '', '\n', '\n', 'Ecrire', ' ', "('B')",
          '\n', '\n', '', ' ', 'Note', ' ', 'â‰¥', ' ', '12', ' ', '', ' ', '', ' ', ':', ' ', '', '\n', '\n', 'Ecrire',
          ' ', "('AB')", '\n', '\n', '', ' ', 'Note', ' ', 'â‰¥', ' ', '10', ' ', '', ' ', '', ' ', ':', ' ', '', '\n',
          '\n', 'Ecrire', ' ', "('Passable')", '\n', '\n', 'Sinon', ' ', ':', ' ', 'Ecrire', ' ', "('Redoublant')",
          '\n', '\n', 'Fin_Si']
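Ecrire_count = Corps2.count("Ecrire")
# One pass per 'Ecrire'; each pass removes at most one '\n', '\n' pair.
for _ in range(Ecrire_count):
    # Stop two short of the end so Corps2[i+2] is always a valid index.
    for i in range(len(Corps2) - 2):
        if Corps2[i+2] == 'Ecrire' and Corps2[i+1] == '\n' and Corps2[i] == '\n':
            del Corps2[i:i+2]
            break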

print(Corps2)
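The restart after each deletion is not strictly needed. A possible single-pass variant, shown here on a shortened made-up list, uses a while loop that re-checks the length on every step:

```python
tokens = ['x', '\n', '\n', 'Ecrire', 'y', '\n', '\n', 'Ecrire', 'z', 'Ecrire']

i = 0
while i + 2 < len(tokens):  # len() is re-read on every pass, so it tracks deletions
    if tokens[i:i + 3] == ['\n', '\n', 'Ecrire']:
        del tokens[i:i + 2]  # 'Ecrire' slides into position i
    i += 1

print(tokens)
```

The last 'Ecrire' is not preceded by two '\n' elements, so it is left untouched.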
