My code has to read the file, put all the words in a list, and then check for repeated words, deleting duplicates until only one copy of each remains. However, "the" still appears more than once:

    plbrs_totales = list()
    fnombre = input("Ingrese nombre de archivo:\n")
    try:
      llave = open(fnombre)
    except:
      print("Error")
      quit()
    for lineas in llave:
      lineas_ind= lineas.rstrip()
      plbrs = lineas_ind.split()
      plbrs_totales = plbrs_totales + plbrs
    for rep in plbrs_totales:
      repwords = plbrs_totales.count(rep)
      if repwords > 1:
       plbrs_totales.remove(rep)
    plbrs_totales.sort()
    print(plbrs_totales)

And this is the .txt file:

But soft what light through yonder window breaks
It is the east and Juliet is the sun
Arise fair sun and kill the envious moon
Who is already sick and pale with grief

And this is the output. Why are there two "the" entries?

['Arise', 'But', 'It', 'Juliet', 'Who', 'already', 'and', 'breaks', 'east', 'envious', 'fair', 'grief', 'is', 'kill', 'light', 'moon', 'pale', 'sick', 'soft', 'sun', 'the', 'the', 'through', 'what', 'window', 'with', 'yonder']

Dharman

1 Answer

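That's because you are modifying the list while you are iterating over it. Each call to `remove` deletes the first matching item and shifts everything after it one position to the left, so the loop skips over items that slide into positions it has already passed. That is the why.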
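The skipping can be reproduced with a short sketch. It assumes the four lines from the question as input (inlined here instead of read from a file), and the `visited` list is a hypothetical helper added only to record which words the loop actually looks at.

```python
# Sketch: reproduce the skipped-element effect with the text from the question.
text = """But soft what light through yonder window breaks
It is the east and Juliet is the sun
Arise fair sun and kill the envious moon
Who is already sick and pale with grief"""

words = text.split()
visited = []  # hypothetical helper: every word the loop actually examines

for word in words:              # the loop advances by position, not by item
    visited.append(word)
    if words.count(word) > 1:
        words.remove(word)      # deletes the FIRST match, shifting later items left

print(visited.count("the"))     # 1: the loop only ever examines "the" once
print(words.count("the"))       # 2: so only one of the three copies is removed
```

Iterating over a copy (`for word in words[:]`) would avoid the skipping, but building a set is simpler for this task.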

To fix it, replace the second `for` loop with this:

    plbrs_totales = list(set(plbrs_totales))
    plbrs_totales.sort()
    print(plbrs_totales)

`set()` returns a set in which each item of `plbrs_totales` occurs exactly once, and `list()` converts that set back to a list.
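As a variation on the same idea, the words can be collected into a set while reading, so no duplicates ever need to be removed. This is only a sketch: the function name `unique_sorted_words` is made up for illustration, and the sample lines stand in for the file from the question.

```python
def unique_sorted_words(lines):
    """Return the distinct words found in an iterable of lines, sorted."""
    seen = set()
    for line in lines:
        # split() with no arguments splits on any whitespace,
        # so trailing newlines are discarded without rstrip()
        seen.update(line.split())
    return sorted(seen)  # sorted() accepts a set and returns a new list


sample = [
    "But soft what light through yonder window breaks\n",
    "It is the east and Juliet is the sun\n",
    "Arise fair sun and kill the envious moon\n",
    "Who is already sick and pale with grief\n",
]
result = unique_sorted_words(sample)
print(result)
```

With a real file, the open file object can be passed directly, for example `with open(fnombre) as llave: print(unique_sorted_words(llave))`, since iterating over a file yields its lines.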

Nerveless_child
  • I understand now, but why is "the" the only element that comes out wrong? "and", for example, works fine. Is this because I modify the list? – José_Ricardo Feb 19 '21 at 18:39
  • Yes. There are only 3 **the** in your text, so I'm guessing that the `for` loop only ever got to see one of them, hence only one was removed. – Nerveless_child Feb 19 '21 at 21:35