I'm working with email content that has been formatted in html. What I have found is that email addresses are often formatted similarly to html tags. Is there a way to selectively escape strings in html code, but render others as-is?

For example, email addresses are often formatted as "Guy, Some <someguy@gmail.com>". How can I escape this, which Python sees as an HTML tag, but leave <br></br><p></p>, etc. intact and render them?

edited

I'm dealing with raw emails that have been preserved as text. In the following example, I want all of the HTML tags to render normally. However, the renderer also tries to treat email addresses stored in the format <someguy@gmail.com> as tags, and then it gives me an error. So my challenge has been to find all of the email addresses, but leave the HTML tags alone.

<p>From: Guy, Some <someguy@gmail.com></p><br>
<br>
<p>Sent: Friday, January 21, 2022 2:16 PM</p><br>
<br>
<p>To: Another Guy <anotherguy@gmail.com></p>
<br>
<p>Subject: Really Important Subject</p>
<br>
<p> <br>Good morning,
<br>This is sample text<br> </p>
<br>
<p>Thanks for all your help!!!
<br>
<p> </p>
jvanheijzen

1 Answer

You can use the HTML entities &lt; and &gt; to display < and > inside an HTML document. If you're passing these email strings from Django, you have to use the safe filter so that they are rendered as pure HTML, like this:

"Guy, Some {{email|safe}}"
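
If the text is assembled in Python before it reaches the template, one way to produce those entities is to escape only the bracketed addresses and leave everything else alone. This is a minimal sketch, assuming addresses always appear as <local@domain.tld>. The name `escape_bracketed_emails` and the pattern are illustrative and are not part of Django:

```python
import html
import re

# Hypothetical pattern: an email address wrapped in angle brackets.
# It is deliberately simple and does not cover every valid address.
BRACKETED_EMAIL = re.compile(r'<[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}>')


def escape_bracketed_emails(text):
    """Turn '<user@host.tld>' into '&lt;user@host.tld&gt;'; real tags are untouched."""
    return BRACKETED_EMAIL.sub(lambda m: html.escape(m.group(0)), text)


line = '<p>From: Guy, Some <someguy@gmail.com></p><br>'
print(escape_bracketed_emails(line))
# <p>From: Guy, Some &lt;someguy@gmail.com&gt;</p><br>
```

After this step the only angle brackets left in the string belong to real tags, so the browser shows the address with its brackets instead of trying to parse it as an element.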

EDIT

Before rendering your HTML, you can find all of the emails written as <email> and strip the angle brackets, for example:

import re

data = '''
<p>From: Guy, Some <someguy@gmail.com></p><br>
<br>
<p>Sent: Friday, January 21, 2022 2:16 PM</p><br>
<br>
<p>To: Another Guy <anotherguy@gmail.com></p>
<br>
<p>Subject: Really Important Subject</p>
<br>
<p> <br>Good morning,
<br>This is sample text<br> </p>
<br>
<p>Thanks for all your help!!!
<br>
<p> </p>
'''
# Match an address wrapped in angle brackets and capture the address itself.
# The pattern is deliberately simple and does not cover every valid address.
bracketed_email = re.compile(r'<([A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})>')

emails = bracketed_email.findall(data)  # ['someguy@gmail.com', 'anotherguy@gmail.com']

# Replace each '<address>' with its own bare address; real HTML tags do not match.
data = bracketed_email.sub(r'\1', data)
print(data)

The above code gives this output:

<p>From: Guy, Some someguy@gmail.com</p><br>
<br>
<p>Sent: Friday, January 21, 2022 2:16 PM</p><br>
<br>
<p>To: Another Guy anotherguy@gmail.com</p>
<br>
<p>Subject: Really Important Subject</p>
<br>
<p> <br>Good morning,
<br>This is sample text<br> </p>
<br>
<p>Thanks for all your help!!!
<br>
<p> </p>
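
The approach above depends on recognising what an email address looks like. An alternative sketch inverts the test: keep an allow-list of the tags you expect and escape every other <...> sequence. The names `ALLOWED_TAGS` and `escape_unknown_tags` are hypothetical, and the allow-list itself is an assumption about which tags these emails contain:

```python
import html
import re

# Assumed allow-list; extend it with whatever tags your emails actually use.
ALLOWED_TAGS = {'p', 'br', 'b', 'i', 'u', 'a', 'div', 'span'}

# Anything of the form '<...>' with no nested angle brackets.
ANGLE_BRACKETED = re.compile(r'<([^<>]*)>')


def escape_unknown_tags(text):
    """Escape every '<...>' whose tag name is not in ALLOWED_TAGS."""
    def replace(match):
        # Drop a leading '/' (closing tag) and take the first token as the name.
        parts = match.group(1).strip().lstrip('/').split()
        name = parts[0].rstrip('/').lower() if parts else ''
        if name in ALLOWED_TAGS:
            return match.group(0)
        return html.escape(match.group(0))
    return ANGLE_BRACKETED.sub(replace, text)


print(escape_unknown_tags('<p>To: Another Guy <anotherguy@gmail.com></p>'))
# <p>To: Another Guy &lt;anotherguy@gmail.com&gt;</p>
```

This keeps the angle brackets visible around each address and also catches other stray <...> text, at the cost of maintaining the tag list.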

I'd suggest taking a look at this post.

Ankit Tiwari