I need a regex to extract the text from the following tag. I am using Python and BeautifulSoup:
<h4 style="color:#000000; line-height:20px; font-size:18px; margin-left:22px;
overflow:auto; content:inherit; padding:10px; font-family:"Book Antiqua",
Palatino, serif;">THE TEXT TO BE EXTRACTED IS HERE</h4></div><br /></div>
I tried the following:
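# Adjacent string literals are joined into one string; a single-quoted
# literal cannot span several lines on its own.
stylecontent = ('color:#000000; line-height:20px; font-size:18px; margin-left:22px; '
                'overflow:auto; content:inherit; padding:10px; font-family:"Book Antiqua", '
                'Palatino, serif;')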
soup = BeautifulSoup(br.response().read(), "lxml")
scrap_soup = soup.findAll('h4', {'style': stylecontent})
But it doesn't always work, because the website keeps changing the style content.
Now I want to use regex:
soup.find_all(re.compile("some_foo_regex"))
I am interested in what that some_foo_regex should be.
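For reference, here is a minimal sketch of how a regex could be applied to the style attribute rather than to the tag name. A compiled pattern passed as the first argument of find_all is matched against tag names, so the pattern goes in the style keyword instead. The sample markup is shortened, and the pattern assumes the font-family declaration is the part of the style that stays stable, which may not hold for the real site.

```python
import re
from bs4 import BeautifulSoup

# Sample markup modelled on the tag above (style value shortened, and
# single-quoted so the inner double quotes are valid HTML).
html = '''<div><h4 style='color:#000000; font-size:18px; font-family:"Book Antiqua", Palatino, serif;'>THE TEXT TO BE EXTRACTED IS HERE</h4></div>
<h4>no style here</h4>'''

soup = BeautifulSoup(html, "html.parser")

# Hypothetical pattern: assumes this declaration does not change
# even when the rest of the style string does.
style_pattern = re.compile(r'font-family\s*:\s*"Book Antiqua"')

# The pattern is searched for inside each h4's style attribute value.
matches = soup.find_all("h4", style=style_pattern)
texts = [tag.get_text(strip=True) for tag in matches]
print(texts)
```

Any fragment of the style string can be used in the pattern; the more of it that is hard-coded, the more likely a site change will break the match.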
Thanks.
GET THIS TEXT
I want all the text in any h4 tag that has a style attribute. – Aniket Vij Aug 24 '15 at 14:17
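If the requirement really is "every h4 that carries a style attribute, whatever its value", a regex may not be needed at all. A rough sketch, assuming made-up sample markup, that uses BeautifulSoup's style=True filter to match any tag where the attribute is present:

```python
from bs4 import BeautifulSoup

# Made-up sample markup: two styled h4 tags and one without a style.
html = '''<div><h4 style="color:#000000; padding:10px;">GET THIS TEXT</h4></div>
<h4>heading without a style attribute</h4>
<h4 style="margin-left:22px;">ANOTHER STYLED HEADING</h4>'''

soup = BeautifulSoup(html, "html.parser")

# style=True matches h4 tags that have a style attribute, regardless of value.
styled_headings = soup.find_all("h4", style=True)
texts = [h4.get_text(strip=True) for h4 in styled_headings]
print(texts)
```

This keeps working when the site changes the style value, as long as the attribute itself is still there.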