Remove specific characters from String List - Python

Question

I have a file, "file1.txt", from which I read a set of words.

For example, the content of "file1.txt" is the following:

Hello how are you? Very good!

What I have to do is eliminate the punctuation characters that appear in the example.

For the previous example, the final phrase would be the following:

Hello how are you Very good

My idea was to read all the words, store them in a list, and then apply "replace" to remove every invalid character.

Another idea was to apply the replace directly while loading the .txt file. However, after trying different approaches, I have not managed to delete the invalid characters.

Here is my code:

# -*- coding: utf-8 -*-

import sys 


def main():

  characters = '!?¿-.:;'
  aux = []

  with open('file1.txt','r') as f:
    for line in f:
      for word in line.split():
        aux.append(word)

  for a in aux:
    for character in characters:
      a = a.replace(character,"")

if __name__ == '__main__':
    main()

As you can see, the first part of my code stores all the words from the txt file in a list called 'aux'.

But I don't know how to apply the "replace" method to eliminate the invalid characters from my words.

You have to work on aux[xxx] directly, or rebuild it using a list comprehension. – Jean-François Fabre, Nov 02 '17 at 20:55
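The list-comprehension route mentioned in this comment could look roughly like the sketch below. The helper name `strip_chars` and the hard-coded sample list are assumptions for illustration; in the question, `aux` would be filled from "file1.txt" instead.

```python
# Characters to remove, taken from the question.
characters = '!?¿-.:;'


def strip_chars(word, unwanted):
    """Return word without any character found in unwanted (hypothetical helper)."""
    return ''.join(c for c in word if c not in unwanted)


# Sample data standing in for the words read from file1.txt.
aux = ['Hello', 'how', 'are', 'you?', 'Very', 'good!']

# Rebuild the list: each element is replaced by its cleaned version.
aux = [strip_chars(word, characters) for word in aux]

print(aux)
```

Rebinding `aux` to a new list avoids index bookkeeping, at the cost of building a second list in memory.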

user2390182 · Answer 1 · 2017-11-02T20:58:45.923

1

You are just reassigning the loop variable, not mutating the list! Change the last loop to:

for i in range(len(aux)):
  for character in characters:
    # this actually changes the list element
    aux[i] = aux[i].replace(character, "")

Your old version was roughly equivalent to:

for i in range(len(aux)):
  a = aux[i]
  for character in characters:
    a = a.replace(character, "") 
    # aux[i] is unimpressed ;)
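One way to check the difference is to run both loops side by side. This is a minimal sketch that assumes a small hand-written list in place of the file contents; the names `words_rebound` and `words_indexed` are illustrative only.

```python
characters = '!?¿-.:;'

# Version 1: rebinding the loop variable; the list itself is never touched.
words_rebound = ['you?', 'good!']
for a in words_rebound:
    for character in characters:
        # str.replace returns a new string; only the name `a` is rebound.
        a = a.replace(character, "")

# Version 2: assigning back through the index updates the list element.
words_indexed = ['you?', 'good!']
for i, word in enumerate(words_indexed):
    for character in characters:
        word = word.replace(character, "")
    words_indexed[i] = word

print(words_rebound)
print(words_indexed)
```

Strings are immutable in Python, so `replace` can never change an element in place; the cleaned string has to be stored back into the list explicitly.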

edited Nov 02 '17 at 20:58

answered Nov 02 '17 at 20:55


Omg totally true, thanks – fiticida Nov 02 '17 at 20:58

MaximTitarenko · Accepted Answer · 2017-11-02T22:22:38.780

1

This can be implemented much more simply by reading the file directly and building a string from its content while filtering out the unwanted characters.

For example, here is the 'file1.txt' file with the content:

Hello how are you? Very good!

Then we can do the following:

def main():

    characters = '!?¿-.:;'

    with open('file1.txt') as f:
        aux = ''.join(c for c in f.read() if c not in characters)

    # print(aux) # Hello how are you Very good

As we can see, aux is the file's content without the unwanted characters, and it can easily be reshaped into the desired output format.

For example, if we want a list of words, we can do this:

def main():

    characters = '!?¿-.:;'

    with open('file1.txt') as f:
        aux = ''.join(c for c in f.read() if c not in characters)
        aux = aux.split()

    # print(aux) # ['Hello', 'how', 'are', 'you', 'Very', 'good']
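As an alternative to the generator expression above (a different technique, not the one used in this answer), the same filtering can be sketched with `str.maketrans` and `str.translate`. This assumes Python 3; the `text` literal is a stand-in for the result of `f.read()`.

```python
characters = '!?¿-.:;'

# Three-argument form: every character in the third argument maps to None,
# which tells translate() to delete it.
table = str.maketrans('', '', characters)

# Stand-in for f.read(); a real file would typically end with a newline too.
text = 'Hello how are you? Very good!\n'

# Delete the unwanted characters, then split on whitespace to get words.
words = text.translate(table).split()

print(words)
```

The table is built once and can be reused for every line or file, which may be convenient when many strings need the same cleanup.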

edited Nov 02 '17 at 22:22

answered Nov 02 '17 at 21:01


Yes, it is simpler, but I don't know why your solution gives me all the words split into single letters; for example, for the word "hello" it gives me "h" "e" "l" "l" "o" – fiticida Nov 02 '17 at 21:15
@fiticida, `aux` here is just a file's content without undesirable chars. It can be easily edited based on what kind of output you want. – MaximTitarenko Nov 02 '17 at 21:19
@fiticida, for example, if you want list of words, just add `aux = aux.split()` – MaximTitarenko Nov 02 '17 at 21:25
Like that? aux = aux.split().join(c for c in f.read() if c not in characters). Sorry, I am new to Python and lists still confuse me – fiticida Nov 02 '17 at 21:38
@fiticida, check the last edit. Also you can implement `split()` in 1 line, but in this case `split()` should be at the end: `aux = ''.join(c for c in f.read() if c not in characters).split()` – MaximTitarenko Nov 02 '17 at 21:45
Oh, perfect, thanks!!! – fiticida Nov 02 '17 at 21:56
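The confusion in this comment thread comes down to the difference between iterating over a string and splitting it. A small sketch, assuming the cleaned text is already held in a variable (named `cleaned` here purely for illustration):

```python
cleaned = 'Hello how are you Very good'

# Iterating over a string (or calling list() on it) yields single characters.
as_chars = list(cleaned)

# str.split() with no arguments splits on runs of whitespace and yields words.
as_words = cleaned.split()

print(as_chars[:5])
print(as_words)
```

This is also why `split()` has to come last in the one-line version: `join` is a string method, so it cannot be called on the list that `split()` returns.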
