0

I'm creating a list from an e-mail body and the output list contains thousands of '\n' and '\t' characters. I would like to know how to remove them from a Python list.

NOTE: I only want to remove the items that contain nothing but whitespace. Some items in the list hold valuable information followed by special characters at the end, and I don't want to remove those. For example: [..., '\n', 'WDO\n', '\t\t\n', ...]

import win32com.client

outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6)  # 6 = Inbox folder
messages = inbox.Items

# Find the first message whose subject contains 'ABCDEFG' and save its body.
for msg in messages:
    subject = msg.Subject
    if 'ABCDEFG' in subject:
        assunto = subject
        conteudo = msg.body
        with open(r"C:\Projetos\E-mail\body.txt", "w") as file_object:
            file_object.write(conteudo)
        break

# Read the saved body back as a list of lines.
with open(r"C:\Projetos\E-mail\body.txt", "r") as file_object:
    lista = file_object.readlines()

And this is the output of the list:

lista[0:16]

['Segue dados de hoje. \n',
 '\n',
 '\n',
 '\n',
 ' \n',
 '\n',
 '\n',
 '\n',
 'WDO\n',
 '\n',
 '\n',
 '\n',
 '\t\t\n',
 '\n',
 'Média\n',
 '\n']
AMC
  • Can you post expected output too? I'm assuming you only want a list of `Segue dados de hoje.`, `WDO`, and `Média` – ctwheels Jun 26 '20 at 19:19
  • What is the issue, exactly? Have you tried anything, done any research? – AMC Jun 26 '20 at 20:26

2 Answers

4

Use `str.isspace()` to check whether a list element consists only of whitespace (tabs, newlines, spaces, etc.):

[x for x in lista if not x.isspace()]

Output:

['Segue dados de hoje. \n', 'WDO\n', 'Média\n']

If you also want to remove the newline character and any other trailing whitespace from the items you keep, use `rstrip`:

[x.rstrip() for x in lista if not x.isspace()]
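For reference, here is a small self-contained sketch that runs both comprehensions on the sample shown in the question. The names `kept` and `cleaned` are only illustrative and do not appear in the original post.

```python
# Sample data copied from the question's lista[0:16] output.
lista = ['Segue dados de hoje. \n', '\n', '\n', '\n', ' \n', '\n', '\n', '\n',
         'WDO\n', '\n', '\n', '\n', '\t\t\n', '\n', 'Média\n', '\n']

# Keep only the elements that contain something other than whitespace.
kept = [x for x in lista if not x.isspace()]

# Same filter, but also strip trailing whitespace from the kept elements.
cleaned = [x.rstrip() for x in lista if not x.isspace()]

print(kept)     # elements still end with '\n'
print(cleaned)  # trailing whitespace removed
```

The first form leaves the kept strings untouched, which matches the question's request not to alter items that carry information. The second form is an optional extra cleanup step.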
busybear
1

listb = [x for x in lista if (x.strip() != "")]
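One detail that may matter when choosing between this and an `isspace` test: `"".isspace()` returns `False`, so an `isspace`-based filter keeps empty strings, while the `strip()` comparison drops them. Lists produced by `readlines()` do not normally contain empty strings, so for the question's data the two behave the same. The sketch below uses a hypothetical `sample` list and a hypothetical helper `drop_blank` purely to show the difference.

```python
def drop_blank(items):
    """Return the items that contain at least one non-whitespace character."""
    return [x for x in items if x.strip() != ""]

# Hypothetical input that includes an empty string.
sample = ['WDO\n', '', '\t\t\n', 'Média\n']

by_strip = drop_blank(sample)                         # drops '' and '\t\t\n'
by_isspace = [x for x in sample if not x.isspace()]   # drops '\t\t\n', keeps ''

print(by_strip)
print(by_isspace)
```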

Zachary Oldham