I want to filter automated texts out of e-mail messages. These are lines such as this one:
If you receive this email in error, please send it back to us immediately \r\n and permanently delete it and do not use, copy or disclose the content of the email or any attachment.
For this I have created a list of these sentences, and I filter them out like this:
def remove_redundant_text(body):
    for i in filter_lists.body_filter_list:
        body = body.replace(i, "")
    return body
However, this doesn't work because of the newlines and other escaped characters that appear at unpredictable places in the text, like the \r\n in the example. How do I make .replace() ignore these?
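One direction that might work, given as a rough sketch rather than a tested solution: instead of str.replace(), turn each sentence into a regular expression in which any run of whitespace (spaces, \r, \n, \xa0) between words matches any other run of whitespace. The helper name build_pattern and the phrases parameter are made up for illustration; in the code above the list would come from filter_lists.body_filter_list.

```python
import re

def build_pattern(phrase):
    # Hypothetical helper: split the phrase on any whitespace
    # (str.split() with no arguments also splits on \r, \n and \xa0),
    # escape each word, and let \s+ stand in for every gap.
    words = phrase.split()
    return re.compile(r"\s+".join(re.escape(word) for word in words))

def remove_redundant_text(body, phrases):
    # Try longer phrases first so a combined disclaimer is removed
    # before the shorter sentences it contains.
    for phrase in sorted(phrases, key=len, reverse=True):
        if not phrase.strip():
            continue  # skip empty entries
        body = build_pattern(phrase).sub("", body)
    return body

phrases = [
    "If you receive this email in error, please send it back to us "
    "immediately and permanently delete it."
]
body = (
    "Hello.\nIf you receive this email in error, please send it back "
    "to us immediately\r\n and permanently delete it. Bye"
)
cleaned = remove_redundant_text(body, phrases)
```

This only removes the listed sentences. It does not collapse the blank lines and \xa0 characters left behind, and it will not remove anything (such as a signature) that is not in the list.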
Here is an example input and the desired output:
input = {'description': "\n\nYes have tried this along with all other combinations but nothing working – just said to contact helpdesk with issues?\n\xa0\n\xa0\n\n\n\n\n\nKirstin Box\n\n\n\n\nSales Force Effectiveness – Wholesale, Workplace, Institutions & Leisure\n\n\n\n\nE. \n\n\n\xa0\n\n\n\xa0\n\n\n\n\nM. \n\n\n\xa0\n\n\n\xa0\n\n\n\n\n\xa0\n\n\n\xa0\n\n\n\xa0\n\n\n\n\n\n\n\n\xa0\n\n\n\xa0\n\n\n\n\n\xa0\n\n\n\xa0\n\n\n\xa0\n\n\n\n\nWe work flexibly at Coca-Cola European Partners. I'm sending this message now because it suits me, but I don't expect you to read, respond or action it outside\r\n of your regular hours.\n\xa0\nCustomer HUB Phone: 0808 1 000 000\nCustomer HUB Email:\r\nconnect@ccep.com\nCustomer HUB Website:\r\nwww.cokecustomerhub.co.uk\n\xa0\nThe information in this email (including any attachments) is intended solely for the addressee(s) and is confidential. It may be read, copied and used only by the\r\n intended recipient. If you receive this email in error, please send it back to us immediately and permanently delete it and do not use, copy or disclose the content of the email or any attachment. Subject to national laws, Coca-Cola European Partners may process\r\n and monitor email content and traffic data for the purposes of security and compliance with corporate policies and applicable laws.\n\xa0\nPLEASE RESPECT THE ENVIRONMENT: Think twice before printing this e-mail.\n\n\n\n\n\xa0\n\n\xa0\n\n\nFrom: BPT Service Desk\r\n\nSent: 26 June 2019 13:15\nTo: Kirstin Box <....>\nSubject: RE: Internet Access\n\n\n\xa0\nHello, Kirstin.\n\xa0\n\xa0\nDid you try the combination\xa0\r\nbxxxxxx@cokecce.com ?\n\xa0\n\xa0\nBest Regards"}
output = {'description': "\n\nYes have tried this along with all other combinations but nothing working – just said to contact helpdesk with issues\n\nFrom: BPT Service Desk\r\n\nSent: 26 June 2019 13:15\nTo: Kirstin Box <....>\nSubject: RE: Internet Access\n\n\n\xa0\nHello, Kirstin.\n\xa0\n\xa0\nDid you try the combination\xa0\r\nbxxxxxx@cokecce.com ?\n\xa0\n\xa0\nBest Regards"}
body_filter_list = ["We work flexibly at Coca-Cola European Partners. I'm sending this message now because it suits me, but I don't expect you to read, respond or action it outside of your regular hours.",
"The information in this email (including any attachments) is intended solely for the addressee(s) and is confidential. It may be read, copied and used only by the intended recipient.",
"If you receive this email in error, please send it back to us immediately and permanently delete it and do not use, copy or disclose the content of the email or any attachment. ",
"Subject to national laws, Coca-Cola European Partners may process and monitor email\r\n content and traffic data for the purposes of security and compliance with corporate policies and applicable laws.",
"Customer HUB Phone: 0808 1 000 000\nCustomer HUB Email:\r\nconnect@ccep.com\nCustomer HUB Website:\r\nwww.cokecustomerhub.co.uk",
"The information in this email (including any attachments) is intended solely for the addressee(s) and is confidential. It may be read, copied and used only by the\r\n intended recipient. If you receive this email in error, please send it back to us immediately and permanently delete it and do not use, copy or disclose the content of the email or any attachment. Subject to national laws, Coca-Cola European Partners may process\r\n and monitor email content and traffic data for the purposes of security and compliance with corporate policies and applicable laws.",
"PLEASE RESPECT THE ENVIRONMENT: Think twice before printing this e-mail.",
"Este correo electrónico ha sido enviado en nombre del grupo de empresas de Coca-Cola European Partners.\r\nPulse en el siguiente enlace para ver esta leyenda informativa en English, Français, Nederlands, Norsk, Svenska, Deutsch, Español and Português.\n\r\nLa información contenida en este correo electrónico (incluidos los archivos adjuntos) está destinada exclusivamente a su destinatario (s) y es confidencial. Puede ser leída, copiada y utilizada solamente por su destinatario. Si recibe este mensaje por error,\r\n por favor, envíelo de nuevo, inmediatamente, al remitente, elimínelo permanentemente y no utilice, copie o divulgue el contenido del correo electrónico ni de cualquier archivo adjunto.\n\r\nSiempre de conformidad con la legislación nacional aplicable, las empresas de Coca-Cola European Partners, podrán procesar y monitorizar el contenido de correo electrónico y del tráfico de datos con fines de seguridad y cumplimiento de las políticas corporativas\r\n y de la normativa aplicable.\n\r\nPOR FAVOR RESPETE EL MEDIO AMBIENTE: reconsidere la necesidad de imprimir este correo electrónico antes de hacerlo. La protección medioambiental es responsabilidad de todos.",
"This email was sent on behalf of the Coca-Cola European Partners group of companies.",
"Click here to see our email disclaimer in English, Français, Nederlands, Norsk, Svenska, Deutsch, Español and Português.",
"The information in this email (including any attachments) is intended solely for the addressee(s) and is confidential. It may be read, copied and used only by the intended recipient. If you receive this email in error, please send it back to us immediately\r\n and permanently delete it and do not use, copy or disclose the content of the email or any attachment.\n\r\nSubject to national laws, Coca-Cola European Partners may process and monitor email content and traffic data for the purposes of security and compliance with corporate policies and applicable laws.\n\r\nPLEASE RESPECT THE ENVIRONMENT: Think twice before printing this e-mail. Environmental protection is in our hands."]