I have an HTML file and I want to loop through its content, remove all the attributes from the tags, and display only the tags. For example:
<div class="content"></div>
<div id="content"></div>
<p> test</p>
<h1>tt</h1>
The output should be:
<div></div>
<div></div>
<p> </p>
<h1></h1>
At the moment I can display all tags with all the attributes, but I only want to display the tags without the attributes.
import re

# Read the whole HTML file into a single string
with open('myfile.html') as file:
    readtext = file.read()

# Find every tag, including its attributes
tags = re.findall(r'<[^>]+>', readtext)
for data in tags:
    print(data)
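For reference, here is a minimal sketch of one way the matched tags could be reduced to their bare names. It is only an illustration under some assumptions: the `strip_attributes` helper and the inline `html` sample are hypothetical (the sample stands in for the contents of `myfile.html`), and it assumes a regex is acceptable for input this simple. A real HTML parser such as the standard library's `html.parser` would likely be more robust on messy markup.

```python
import re

# Hypothetical sample standing in for the contents of myfile.html
html = (
    '<div class="content"></div>\n'
    '<div id="content"></div>\n'
    '<p> test</p>\n'
    '<h1>tt</h1>'
)

def strip_attributes(tag):
    """Reduce a tag such as '<div class="x">' to '<div>'.

    Closing tags keep their slash; anything that does not start with
    a tag name (comments, doctypes) is returned unchanged.
    """
    match = re.match(r'<\s*(/?)\s*([A-Za-z][A-Za-z0-9]*)', tag)
    if match is None:
        return tag
    slash, name = match.groups()
    return f'<{slash}{name}>'

# Same pattern as in the question: every tag with its attributes
tags = re.findall(r'<[^>]+>', html)
bare_tags = [strip_attributes(t) for t in tags]

for tag in bare_tags:
    print(tag)
```

Note that this prints one tag per line and drops the text between tags. It also turns a self-closing form like `<div/>` into `<div>`, so pairing opening and closing tags on one line would need extra logic.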
` tag has a space?
– Roland Illig Jul 20 '19 at 20:47