I want to extract data from HTML. The problem is that I can't extract the two strings that sit at the top and at the bottom of my pattern.
I want to extract 23423423423
and 1234523453245
but only if the string Allan appears
between them:
<h4><a href="/Profile/23423423423.html">@@@@@@</a> </h4> said12:49:32
</div>
<a href="javascript:void(0)" onclick="replyAnswer(@@@@@@@@@@,'GET','');" class="reportLink">
report </a>
</div>
<div class="details">
<p class="content">
Hi there, Allan.
</p>
<div id="AddAnswer1234523453245"></div>
Of course, I can do something like this: Profile\/(\d+).*\s*.*\s*.*\s*.*\s*.*\s*.*\s*.*\s*.*Allan.*\s*.*\s*.*AddAnswer(\d+)
But that pattern is horrible, and it depends on the exact number of lines between the two values. Is there a way to make it shorter?
I was thinking about:
Profile\/(\d+)(.\sAllan)*AddAnswer(\d+)
or
Profile\/(\d+)(.*Allan\s*)*AddAnswer(\d+)
but neither of them works properly. Do you have any ideas?
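
To make the goal concrete, here is a minimal sketch of the kind of shorter pattern I am after. It assumes Python's re module (the question itself is not tied to a language), and the names SAMPLE and find_ids are hypothetical, made up for this illustration. The idea is to let . match newlines (re.DOTALL) and to forbid the match from running into the next Profile/ link, so that the two numbers always come from the same post. I have only checked it against the snippet above, so treat it as a starting point rather than a finished solution.

```python
import re

# The HTML fragment from the question, reproduced as-is.
SAMPLE = """
<h4><a href="/Profile/23423423423.html">@@@@@@</a> </h4> said12:49:32
</div>
<a href="javascript:void(0)" onclick="replyAnswer(@@@@@@@@@@,'GET','');" class="reportLink">
report </a>
</div>
<div class="details">
<p class="content">
Hi there, Allan.
</p>
<div id="AddAnswer1234523453245"></div>
"""

# Any single character, as long as a new "Profile/" link does not start here.
# This keeps one match from spanning two different posts.
GAP = r"(?:(?!Profile/).)*?"


def find_ids(html, name="Allan"):
    """Return (profile_id, answer_id) pairs for posts that contain `name`."""
    pattern = re.compile(
        r"Profile/(\d+)" + GAP + re.escape(name) + GAP + r"AddAnswer(\d+)",
        re.DOTALL,  # let "." match newlines too
    )
    return pattern.findall(html)


print(find_ids(SAMPLE))  # [('23423423423', '1234523453245')]
```

A post that does not mention Allan yields no pair, even when it is followed by one that does. For anything more complex than this fragment, an HTML parser would probably be more robust than a regex.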