I need to extract the text between the <b> and <strong> tags using regex in Python.
Example: Customizable:<strong>Features Windows 10 Pro</strong> and legacy ports <b>including VGA,</b> HDMI, RJ-45, USB Type A connections.
This is what I am doing:
import re

pattern = re.compile("(<b>(.*?)</b>)|(<strong>(.*?)</strong>)")
for label in labels:
    print(label)
    flag = 0
    if ('Window' in label or 'Windows' in label) and ('<b>' in label or '<strong>' in label):
        text = re.findall(pattern, label)
        print(text)
where labels is a list of HTML strings like the one above, each containing these tags.
The expected output is ['Features Windows 10','including VGA,']
Instead, I am getting this output: [('', 'Features Windows 10 Pro'), ('including VGA,', '')]
Please help. Thanks in advance.
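
One possible direction, offered as a sketch rather than a definitive fix: re.findall returns one tuple per match whenever the pattern has more than one capturing group, which is why tuples with empty strings appear. Capturing the tag name once and reusing it with a backreference keeps a single pattern, and taking only the text group from each match gives a flat list. The names TAG_TEXT and bold_texts below are made up for illustration, and the pattern assumes the tags have no attributes and the text does not span multiple lines.

```python
import re

# Group 1 captures the tag name (b or strong); \1 requires the same name
# in the closing tag. Group 2 captures the text between the tags.
TAG_TEXT = re.compile(r"<(b|strong)>(.*?)</\1>")

def bold_texts(label):
    """Return the text inside each <b>...</b> or <strong>...</strong> pair."""
    # finditer yields match objects, so we can pick group 2 (the text) only.
    return [m.group(2) for m in TAG_TEXT.finditer(label)]

label = ("Customizable:<strong>Features Windows 10 Pro</strong> and legacy ports "
         "<b>including VGA,</b> HDMI, RJ-45, USB Type A connections.")
print(bold_texts(label))
# ['Features Windows 10 Pro', 'including VGA,']
```

Note that the first item keeps "Pro" because it sits inside the <strong> tag. For anything beyond simple, well-formed snippets like this, an HTML parser is usually a safer choice than regex.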