I have the following text:
"It's the show your only friend and pastor have been talking about!
<i>Wonder Showzen</i> is a hilarious glimpse into the black
heart of childhood innocence! Get ready as the complete first season of MTV2's<i> Wonder Showzen</i> tackles valuable life lessons like birth,
nature, diversity, and history – all inside the prison of
your mind! Where else can you..."
I want to remove the HTML tags and encode the result as Unicode. I am currently doing:
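import re
TAG_RE = re.compile(r'<[^>]+>')  # assumed pattern; my actual TAG_RE definition is not shown here
def remove_tags(text):
    return TAG_RE.sub('', text)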
This only strips the tags. How do I correctly encode the text above for database storage?
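For reference, here is a minimal sketch of the kind of result I am after. It assumes Python 3 and that the database expects UTF-8. The `TAG_RE` pattern and the `to_utf8` helper are illustrative assumptions, not code I already have, and a regex like this will not handle every form of HTML.

```python
import html
import re

# Assumed pattern: matches anything that looks like an HTML tag.
TAG_RE = re.compile(r'<[^>]+>')

def remove_tags(text):
    """Strip HTML tags, then decode entities such as &amp;."""
    return html.unescape(TAG_RE.sub('', text))

def to_utf8(text):
    """Hypothetical helper: strip tags and encode the result as UTF-8 bytes."""
    return remove_tags(text).encode('utf-8')

sample = "<i>Wonder Showzen</i> tackles history – all inside"
clean = remove_tags(sample)   # str with the tags removed
stored = to_utf8(sample)      # bytes suitable for a UTF-8 column
```

In Python 3 a `str` is already Unicode, so explicit encoding may only be needed if the database driver wants bytes rather than text.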