Unidecode In List

Question

I have the following code. It goes through a directory containing a file and reads it; the stopwords and the dots are removed. I also want to remove the accents, but I get an error: 'list' object has no attribute 'encode'

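import os

import unidecode
from nltk.tokenize import word_tokenize

# `stopwords` is assumed to be defined earlier as a collection of words.
path = 'C:\\Users\\Example\\Desktop\\Example\\x'
ficheros = os.listdir(path)

documentos = []

for namefi in ficheros:
    if os.path.isfile(os.path.join(path, namefi)):
        # Read the whole file as UTF-8 text
        with open(os.path.join(path, namefi), "r", encoding='utf-8') as fich:
            text = fich.read()
        documentos.append(text)
        tokens = word_tokenize(text)
        # Lowercase the tokens and drop stopwords
        clean = [w.lower() for w in tokens if w not in stopwords]

        # Remove dots from each token
        n_p = [item.replace('.', '') for item in clean]
        # This line raises the error: n_p is a list, not a string
        n_a = unidecode.unidecode(n_p)

print(n_p)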

I have used unidecode.unidecode but it seems to be invalid with lists.

"but it seems to be invalid with lists" You cannot expect to use *anything* that was designed to work on an X, by giving it a list of Xs and expecting it to do its thing to each element. *You* have to write the code that says "do this for each element:". The only sort-of exceptions are using number-crunching libraries like Numpy or Pandas, which are specifically designed for that kind of application (called *broadcasting*), using their own data types. — Karl Knechtel, Feb 13 '22 at 14:57
You *already know how to do this*: consider for example the difference between `[w.lower() for w in tokens]` (correct) and `tokens.lower()` (not correct). — Karl Knechtel, Feb 13 '22 at 14:58
If I have `x = [1, 2, 3]`, and then I write `x + 5`, do you expect that the result should be `[6, 7, 8]`? No? Why not? What code should I write instead? It would look something like `[n + 5 for n in x]`, right? The same rule applies to **everything else**, including `unidecode.unidecode`. You have a **list of** strings, and you want to apply `unidecode.unidecode` **to each** string, **not** to the list. — Karl Knechtel, Feb 13 '22 at 15:10
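
Following the comments above, here is a minimal sketch of applying the function to each string rather than to the list. The token list is hypothetical and only stands in for `n_p` from the question. The sketch imports the function directly (`from unidecode import unidecode`) instead of calling `unidecode.unidecode`, and the standard-library fallback is an assumption added so that it still runs where the Unidecode package is not installed; it is not part of the question.

```python
import unicodedata

try:
    # Third-party package (pip install Unidecode)
    from unidecode import unidecode
except ImportError:
    # Fallback (an assumption, not from the question): decompose each
    # character and drop the combining marks. Handles accented Latin
    # letters only, which is far less than Unidecode covers.
    def unidecode(text):
        decomposed = unicodedata.normalize("NFKD", text)
        return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Hypothetical tokens standing in for n_p from the question
n_p = ["canción", "también", "niño", "árbol"]

# Call the function once per string instead of passing the whole list
n_a = [unidecode(item) for item in n_p]

print(n_a)  # ['cancion', 'tambien', 'nino', 'arbol']
```

In the question's loop, the equivalent change would be `n_a = [unidecode.unidecode(item) for item in n_p]`, and the final `print` would need to show `n_a` rather than `n_p` to display the accent-free tokens.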
