4

How to replace links with their anchor text in HTML (Python)?

For example, given this input:

 <p> Hello <a href="http://example.com">link text1</a> and <a href="http://example.com">link text2</a> ! </p>

I want the result to keep the p tag and remove only the a tags:

<p>
Hello link text1 and link text2 ! 
</p>
userlond
Evg

3 Answers

5

You could do this with a simple regex and the re.sub function:

import re

text = '<p> Hello <a href="http://example.com">link text1</a> and <a href="http://example.com">link text2</a> ! </p>'
# Match only <a>, <a ...> and </a>, so tags such as <abbr> are left alone.
pattern = r'</?a(?:\s[^>]*)?>'

result = re.sub(pattern, "", text)

print(result)
# <p> Hello link text1 and link text2 ! </p>

This code replaces every <a ...> and </a> tag with an empty string.
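The same idea can be wrapped in a small reusable function. This is only a sketch: the names A_TAG and strip_a_tags are made up for illustration, and the pattern assumes no attribute value contains a literal > character. It requires whitespace (or nothing) after the tag name, so other tags that start with "a" are not touched.

```python
import re

# Hypothetical helper names. The pattern matches <a>, <a ...> and </a> only;
# it assumes attribute values never contain a literal '>'.
A_TAG = re.compile(r'</?a(?:\s[^>]*)?>', re.IGNORECASE)


def strip_a_tags(html):
    """Remove opening and closing a tags, keeping the text between them."""
    return A_TAG.sub('', html)


sample = '<p>See <abbr title="x">HTML</abbr> and <a href="http://example.com">link</a>.</p>'
print(strip_a_tags(sample))
# <p>See <abbr title="x">HTML</abbr> and link.</p>
```

For markup you do not control, a real HTML parser is usually the safer choice than a regex.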

miindlek
3

Looks like a perfect case for BeautifulSoup's unwrap() method:

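from bs4 import BeautifulSoup

data = '''<p> Hello <a href="http://example.com">link text1</a> and <a href="http://example.com">link text2</a> ! </p>'''
# Name the parser explicitly so the result does not depend on which parsers are installed.
soup = BeautifulSoup(data, 'html.parser')
p_tag = soup.find('p')
# unwrap() removes each a tag but keeps its contents in place.
for a_tag in p_tag.find_all('a'):
    a_tag.unwrap()
print(p_tag)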

This gives:

<p> Hello link text1 and link text2 ! </p>
shaktimaan
0

You can use a parser library for this, such as BeautifulSoup or others. I am not sure about the details, but you may find something useful here
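As one possible illustration of the parser approach without third-party packages, here is a rough sketch built on the standard library's html.parser module (Python 3 assumed). The class AnchorStripper and the function strip_anchors are hypothetical names, not part of any library. The sketch rebuilds the markup as it is parsed and skips a tags; comments and doctype declarations are dropped, and end tags are re-emitted in lowercase.

```python
from html.parser import HTMLParser


class AnchorStripper(HTMLParser):
    """Rebuild the input markup while skipping a tags (hypothetical helper)."""

    def __init__(self):
        # Keep character references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            # get_starttag_text() returns the start tag exactly as it appeared.
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through once.
        if tag != 'a':
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag != 'a':
            self.parts.append('</%s>' % tag)

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append('&%s;' % name)

    def handle_charref(self, name):
        self.parts.append('&#%s;' % name)


def strip_anchors(html):
    """Return html with a tags removed and their contents kept."""
    parser = AnchorStripper()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return ''.join(parser.parts)


text = '<p> Hello <a href="http://example.com">link text1</a> and <a href="http://example.com">link text2</a> ! </p>'
print(strip_anchors(text))
# <p> Hello link text1 and link text2 ! </p>
```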

Community
Nitin