I can't answer your question directly, as I've never used Beautiful Soup (so do NOT accept this answer!), but I just want to point out that if the pages are all pretty simple, an alternative might be to write your own parser using .split().
This is rather clumsy, but worth considering if the pages are simple and predictable.
That is, if you know something about the overall layout of the page
(e.g., the user's email is always the first email mentioned), you could write your own parser to find the bits before and after the '@' sign:
# html = the entire document as a string
# take everything in the document up to the first '@' sign
bit_before_at_sign = html.split('@')[0]
# only useful if you know first email is the one you care about
# you could then cut out everything before username with something like this
b = bit_before_at_sign
# a very long string, we just want the last bit right before the @ sign
username = b.split(' ')[-1].split('\n')[-1].split('\r')[-1].split(';')[-1]
# add more if required, depending on how the html looks to you
# (I've just guessed some html elements that might precede the username)
# you could similarly parse the bit after the @ sign,
# html.split('@')[1]
# e.g., checking the first few characters of this
# against a known list of .tlds like '.com', '.co.uk', etc
# (remember some TLDs have more than one period, so don't just parse by '.')
# and combine with the username you already know
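Putting the steps above together, a minimal sketch might look like the following. The function name `first_email` and the set of delimiter characters are my own guesses, not anything standard, and instead of checking a TLD list it only requires a '.' somewhere in the domain, so adjust it to whatever your HTML actually looks like.

```python
def first_email(html, delimiters=' \n\r\t;<>"\':,()'):
    """Return the first email-like string in html, or None.

    Hypothetical helper: assumes the address is bounded on both
    sides by one of the characters in `delimiters`.
    """
    before, at, after = html.partition('@')
    if not at:
        return None  # no '@' anywhere in the document

    # walk backwards from the '@' until we hit a delimiter
    start = len(before)
    while start > 0 and before[start - 1] not in delimiters:
        start -= 1

    # walk forwards from the '@' until we hit a delimiter
    end = 0
    while end < len(after) and after[end] not in delimiters:
        end += 1

    username = before[start:]
    domain = after[:end].rstrip('.')  # drop a sentence-ending period
    if not username or '.' not in domain:
        return None
    return username + '@' + domain
```

For example, `first_email('<p>Contact: bob@example.com</p>')` gives back `'bob@example.com'`. Using `partition` rather than `split` means the no-'@' case is easy to spot.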
Also at your disposal, in case you want to narrow down which bit of the document you focus on:
In case you want to make sure the word 'e-mail' is also in the string you're parsing:
if 'email' in b.lower() or 'e-mail' in b.lower():
    pass  # do something...
To check where in the document the '@' symbol first appears:
html.index('@')
# e.g., if you want to see how near this '@' symbol is to some other element you know about
# such as the word 'e-mail', or a particular div element or '</strong>'
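As a rough illustration of that proximity check, something like the sketch below could work. The name `distance_to_label` is made up for this example, and it uses `str.find` (which returns -1 when nothing is found) rather than `str.index` (which raises `ValueError`).

```python
def distance_to_label(html, label='e-mail'):
    """Return how many characters separate the first `label` from the
    first '@', or None if either is missing.

    Hypothetical helper; assumes mostly-ASCII text, so lowercasing
    does not shift character positions.
    """
    at_pos = html.find('@')
    label_pos = html.lower().find(label)
    if at_pos == -1 or label_pos == -1:
        return None
    # positive: '@' comes after the label; negative: before it
    return at_pos - label_pos
```

A small positive number suggests the '@' belongs to the address that the label is introducing.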
To confine your search for an email to the 300 characters before/after another element you know about:
startfrom = html.index('</strong>')
# max(0, ...) stops the start of the slice going negative near the top of the page
html_i_will_search = html[max(0, startfrom - 300):startfrom + 300]
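One thing to watch: `html.index(...)` raises `ValueError` if the element isn't on the page at all. A slightly more defensive sketch, using a made-up helper name `window_around` and `str.find` instead, might be:

```python
def window_around(html, marker, radius=300):
    """Return up to `radius` characters either side of the first
    occurrence of `marker`, or '' if the marker is absent.

    Hypothetical helper, not part of any library.
    """
    pos = html.find(marker)
    if pos == -1:
        return ''  # marker not on this page
    start = max(0, pos - radius)        # don't run off the front
    end = pos + len(marker) + radius    # slicing past the end is safe
    return html[start:end]
```

You could then run whatever email-finding logic you settle on against `window_around(html, '</strong>')` rather than the whole page.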
I imagine a few minutes more on Google may alternatively prove useful; your task doesn't sound unusual :)
And make sure you consider cases where there are multiple email addresses on the page (e.g., so you don't assign support@site.com to every user!)
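To see whether a page has more than one address, you could loop over every '@' rather than just the first. This is only a sketch under the same assumptions as before (the helper name `all_emails` and the delimiter characters are my guesses):

```python
def all_emails(html, delimiters=' \n\r\t;<>"\':,()'):
    """Collect every distinct email-like string, in order of first appearance.

    Hypothetical helper: assumes addresses are bounded by `delimiters`.
    """
    found = []
    pos = html.find('@')
    while pos != -1:
        # expand left and right from this '@' until a delimiter
        start = pos
        while start > 0 and html[start - 1] not in delimiters:
            start -= 1
        end = pos + 1
        while end < len(html) and html[end] not in delimiters:
            end += 1

        candidate = html[start:end].rstrip('.')
        username, _, domain = candidate.partition('@')
        looks_ok = username and '.' in domain and '@' not in domain
        if looks_ok and candidate not in found:
            found.append(candidate)

        pos = html.find('@', pos + 1)  # move on to the next '@'
    return found
```

If this returns more than one address, you know you need some extra rule (position on the page, nearby label, a list of shared addresses to skip) before assigning one to the user.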
Whatever method you go with, if you have doubts, it might be worth checking your answer using email.utils.parseaddr() or someone else's regex checker. See previous question
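As far as I know, `parseaddr()` is quite lenient (it is built for pulling addresses out of mail headers, not for validating them), so a sanity check probably wants a couple of extra conditions on top. A hedged sketch, where `looks_like_email` is a name I've invented:

```python
from email.utils import parseaddr

def looks_like_email(candidate):
    """Very rough sanity check; not a full validator.

    Hypothetical helper: parseaddr() splits off any display name,
    then we insist on a username, an '@', and a dotted domain.
    """
    _, addr = parseaddr(candidate)
    username, at, domain = addr.rpartition('@')
    return bool(username) and bool(at) and '.' in domain and ' ' not in addr
```

This would accept `'bob@example.com'` and `'Bob <bob@example.com>'` and reject a string with no '@' in it, but it will still let through plenty of things that aren't deliverable addresses.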