I replace each link with the text of the link:
text = re.sub('<a href=\".*?\">(.*?)</a>','\\1',text)
Example:
>>>text='<a href="SOME URL">SOME URL</a>'
>>>text = re.sub('<a href=\".*?\">(.*?)</a>','\\1',text)
>>>print text
SOME URL
I would like it to output some_url instead,
but appending .lower().replace(' ','_') to the replacement string has no effect:
>>>text = re.sub('<a href=\".*?\">(.*?)</a>','\\1'.lower().replace(' ','_'),text)
>>>print text
SOME URL
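
The likely cause: '\\1'.lower().replace(' ','_') is evaluated before re.sub is called, and since the string '\\1' has no uppercase letters or spaces, it comes out unchanged as '\\1'. Below is a minimal sketch of one workaround, relying on the fact that re.sub also accepts a function as the replacement and calls it with each match object. The names LINK_RE and slugify_link are made up for illustration.

```python
import re

# Same pattern as above, compiled once; group 1 captures the link text.
LINK_RE = re.compile(r'<a href=".*?">(.*?)</a>')

def slugify_link(match):
    # Called once per match, so lower()/replace() run on the captured
    # text itself rather than on the literal string '\\1'.
    return match.group(1).lower().replace(' ', '_')

text = '<a href="SOME URL">SOME URL</a>'
result = LINK_RE.sub(slugify_link, text)
print(result)  # some_url
```

A lambda works the same way if a named function feels too heavy: re.sub(pattern, lambda m: m.group(1).lower().replace(' ', '_'), text).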