I have the following Perl code:
# $content is the text of a webpage
while ($content =~ /rgRow.*?<td>(.*?)<\/td><td.*?>(.*?)<\/td><td.*?>(.*?)<\/td><td.*?>.*?<\/td><td.*?>(.*?)<\/td><td.*?><nobr>(.*?)<\/nobr><\/td>/sg) {
# do stuff
}
I have worked out that the code hangs at this regex match. It gets two or three iterations into the while loop and then stalls. I left it running for about 30 minutes and it made no further progress.
What could be the problem?
The purpose of the code is to go through some HTML and extract some data out of it.
Here is the HTML that I am setting $content to:
<tbody>
<tr class="rgRow InnerItemStyle" id="ctl00_PlaceHolderMain_radResultsGrid_ctl00__0">
<td>CONSIDERATION OF REPORTS SUBMITTED BY STATES PARTIES UNDER ARTICLE 9 OF THE CONVENTION : SECOND PERIODIC REPORT OF STATES PARTIES DUE IN 1974 / MOROCCO</td><td>State party's report</td><td>CERD</td><td>Morocco</td><td>CERD/C/R.65/Add.1</td><td><nobr>21 Feb 1974</nobr></td><td>
<a id="ctl00_PlaceHolderMain_radResultsGrid_ctl00_ctl04_MoreDocs" title="View document" href="http://tbinternet.ohchr.org/_layouts/treatybodyexternal/Download.aspx?symbolno=CERD%2fC%2fR.65%2fAdd.1&Lang=en" target="_blank" style="text-decoration:underline;">View document</a>
</td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;">E</td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;">CERD/C/R.65/Add.1</td><td style="display:none;"> </td><td style="display:none;">True</td>
</tr><tr class="rgRow InnerAlernatingItemStyle" id="ctl00_PlaceHolderMain_radResultsGrid_ctl00__1">
<td>CONSIDERATION OF REPORTS SUBMITTED BY STATES PARTIES UNDER ARTICLE 9 OF THE CONVENTION : INITIAL REPORTS OF STATES PARTIES WHICH ARE DUE IN 1972 / MOROCCO</td><td>State party's report</td><td>CERD</td><td>Morocco</td><td>CERD/C/R.33/Add.1</td><td><nobr>17 Jan 1972</nobr></td><td>
<a id="ctl00_PlaceHolderMain_radResultsGrid_ctl00_ctl06_MoreDocs" title="View document" href="http://tbinternet.ohchr.org/_layouts/treatybodyexternal/Download.aspx?symbolno=CERD%2fC%2fR.33%2fAdd.1&Lang=en" target="_blank" style="text-decoration:underline;">View document</a>
</td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;">E</td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;">CERD/C/R.33/Add.1</td><td style="display:none;"> </td><td style="display:none;">True</td>
</tr><tr class="rgRow InnerItemStyle" id="ctl00_PlaceHolderMain_radResultsGrid_ctl00__2">
<td>Annex I to ALGERIA's Report</td><td>Annex to State party report</td><td>CERD</td><td>Algeria</td><td> </td><td> </td><td>
<a id="ctl00_PlaceHolderMain_radResultsGrid_ctl00_ctl08_MoreDocs" title="View document" href="http://tbinternet.ohchr.org/_layouts/treatybodyexternal/Download.aspx?symbolno=INT%2fCERD%2fAIS%2fDZA%2f13691&Lang=en" target="_blank" style="text-decoration:underline;">View document</a>
</td><td style="display:none;">E</td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;">INT_CERD_AIS_DZA_13691_E.doc</td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;">INT/CERD/AIS/DZA/13691</td><td style="display:none;"> </td><td style="display:none;">True</td>
</tr><tr class="rgRow InnerAlernatingItemStyle" id="ctl00_PlaceHolderMain_radResultsGrid_ctl00__3">
<td>Annex II to ALGERIA's report</td><td>Annex to State party report</td><td>CERD</td><td>Algeria</td><td> </td><td> </td><td>
<a id="ctl00_PlaceHolderMain_radResultsGrid_ctl00_ctl10_MoreDocs" title="View document" href="http://tbinternet.ohchr.org/_layouts/treatybodyexternal/Download.aspx?symbolno=INT%2fCERD%2fAIS%2fDZA%2f13692&Lang=en" target="_blank" style="text-decoration:underline;">View document</a>
</td><td style="display:none;">E</td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;">INT_CERD_AIS_DZA_13692_E.doc</td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;">INT/CERD/AIS/DZA/13692</td><td style="display:none;"> </td><td style="display:none;">True</td>
</tr><tr class="rgRow InnerItemStyle" id="ctl00_PlaceHolderMain_radResultsGrid_ctl00__4">
<td>Annex III to ALGERIA's report</td><td>Annex to State party report</td><td>CERD</td><td>Algeria</td><td> </td><td> </td><td>
<a id="ctl00_PlaceHolderMain_radResultsGrid_ctl00_ctl12_MoreDocs" title="View document" href="http://tbinternet.ohchr.org/_layouts/treatybodyexternal/Download.aspx?symbolno=INT%2fCERD%2fAIS%2fDZA%2f13693&Lang=en" target="_blank" style="text-decoration:underline;">View document</a>
</td><td style="display:none;">E</td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;">INT_CERD_AIS_DZA_13693_E.doc</td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;">INT/CERD/AIS/DZA/13693</td><td style="display:none;"> </td><td style="display:none;">True</td>
</tr><tr class="rgRow InnerAlernatingItemStyle" id="ctl00_PlaceHolderMain_radResultsGrid_ctl00__5">
<td>CERD-C-NZ-18-20_Annexes</td><td>Annex to State party report</td><td>CERD</td><td>New Zealand</td><td> </td><td> </td><td>
<a id="ctl00_PlaceHolderMain_radResultsGrid_ctl00_ctl14_MoreDocs" title="View document" href="http://tbinternet.ohchr.org/_layouts/treatybodyexternal/Download.aspx?symbolno=INT%2fCERD%2fADR%2fNZL%2f13731&Lang=en" target="_blank" style="text-decoration:underline;">View document</a>
</td><td style="display:none;">E</td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;">INT_CERD_ADR_NZL_13731_E.doc</td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;">INT/CERD/ADR/NZL/13731</td><td style="display:none;"> </td><td style="display:none;">True</td>
</tr><tr class="rgRow InnerItemStyle" id="ctl00_PlaceHolderMain_radResultsGrid_ctl00__6">
<td>CERD.C.RUS.20-22_Annex1</td><td>Annex to State party report</td><td>CERD</td><td>Russian Federation</td><td> </td><td> </td><td>
<a id="ctl00_PlaceHolderMain_radResultsGrid_ctl00_ctl16_MoreDocs" title="View document" href="http://tbinternet.ohchr.org/_layouts/treatybodyexternal/Download.aspx?symbolno=INT%2fCERD%2fADR%2fRUS%2f13732&Lang=en" target="_blank" style="text-decoration:underline;">View document</a>
</td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;">R</td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;">INT_CERD_ADR_RUS_13732_R.doc</td><td style="display:none;">INT/CERD/ADR/RUS/13732</td><td style="display:none;"> </td><td style="display:none;">True</td>
</tr><tr class="rgRow InnerAlernatingItemStyle" id="ctl00_PlaceHolderMain_radResultsGrid_ctl00__7">
<td>Annex to State party report</td><td>Annex to State party report</td><td>CERD</td><td>Poland</td><td> </td><td> </td><td>
<a id="ctl00_PlaceHolderMain_radResultsGrid_ctl00_ctl18_MoreDocs" title="View document" href="http://tbinternet.ohchr.org/_layouts/treatybodyexternal/Download.aspx?symbolno=INT%2fCERD%2fADR%2fPOL%2f15432&Lang=en" target="_blank" style="text-decoration:underline;">View document</a>
</td><td style="display:none;">E</td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;">INT_CERD_ADR_POL_15432_E.doc</td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;">INT/CERD/ADR/POL/15432</td><td style="display:none;"> </td><td style="display:none;">True</td>
</tr><tr class="rgRow InnerItemStyle" id="ctl00_PlaceHolderMain_radResultsGrid_ctl00__8">
<td>Annexe X</td><td>Annex to State party report</td><td>CERD</td><td>Belgium</td><td> </td><td> </td><td>
<a id="ctl00_PlaceHolderMain_radResultsGrid_ctl00_ctl20_MoreDocs" title="View document" href="http://tbinternet.ohchr.org/_layouts/treatybodyexternal/Download.aspx?symbolno=INT%2fCERD%2fADR%2fBEL%2f15561&Lang=en" target="_blank" style="text-decoration:underline;">View document</a>
</td><td style="display:none;"> </td><td style="display:none;">F</td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;">INT_CERD_ADR_BEL_15561_F.pdf</td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;">INT/CERD/ADR/BEL/15561</td><td style="display:none;"> </td><td style="display:none;">True</td>
</tr><tr class="rgRow InnerAlernatingItemStyle" id="ctl00_PlaceHolderMain_radResultsGrid_ctl00__9">
<td>Annexe XI</td><td>Annex to State party report</td><td>CERD</td><td>Belgium</td><td> </td><td> </td><td>
<a id="ctl00_PlaceHolderMain_radResultsGrid_ctl00_ctl22_MoreDocs" title="View document" href="http://tbinternet.ohchr.org/_layouts/treatybodyexternal/Download.aspx?symbolno=INT%2fCERD%2fADR%2fBEL%2f15562&Lang=en" target="_blank" style="text-decoration:underline;">View document</a>
</td><td style="display:none;"> </td><td style="display:none;">F</td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;">INT_CERD_ADR_BEL_15562_F.pdf</td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;"> </td><td style="display:none;">INT/CERD/ADR/BEL/15562</td><td style="display:none;"> </td><td style="display:none;">True</td>
</tr>
</tbody>
I am trying the following line to see how it goes instead:
while ($content =~ m/rgRow.+?<td>(.+?)<\/td><td>(.+?)<\/td><td>(.+?)<\/td><td>(.+?)<\/td><td>(.+?)<\/td><td>(.+?)<\/td>/gs)
The original code was not mine.
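Looking at the HTML above, only the first two rows contain a <nobr> element, which the original pattern requires, so that may be related to the loop stalling after two iterations. Another variant I could try is sketched below: split the content into rows first, then pull the cells out of each row with a negated character class instead of `.*?`. This is only a sketch under assumptions: the sample `$content` is a shortened, hypothetical stand-in for the real page, the `@rows` and `@cells` names are mine, and it assumes a cell holds plain text (optionally wrapped in <nobr>) with no other nested tags. Unlike the original, it keeps the first six cells rather than skipping the fourth.

```perl
use strict;
use warnings;

# Hypothetical sample: two rows shaped like the table above,
# the second one without a <nobr> element.
my $content = join '',
    '<tr class="rgRow InnerItemStyle" id="r0">', "\n",
    '<td>Title A</td><td>Report</td><td>CERD</td><td>Morocco</td>',
    '<td>CERD/C/R.65/Add.1</td><td><nobr>21 Feb 1974</nobr></td><td>', "\n",
    '</td></tr>',
    '<tr class="rgRow InnerAlernatingItemStyle" id="r1">', "\n",
    '<td>Title B</td><td>Annex</td><td>CERD</td><td>Algeria</td>',
    '<td> </td><td> </td><td>', "\n",
    '</td></tr>';

my @rows;

# Step 1: isolate each rgRow row, so a failed cell match can never
# wander past the end of its own row.
while ($content =~ m{<tr\b[^>]*\brgRow\b[^>]*>(.*?)</tr>}gs) {
    my $row = $1;

    # Step 2: collect the text of every cell in this row.
    # [^<]* stops at the next tag, so it cannot stretch across cells.
    # Leading whitespace and an optional <nobr> are skipped.
    my @cells = $row =~ m{<td\b[^>]*>\s*(?:<nobr>)?([^<]*)}g;

    # Keep the first six cells: title, type, body, country, symbol, date.
    push @rows, [ @cells[0 .. 5] ];
}

printf "%s | %s\n", $_->[0], $_->[5] for @rows;
```

With this shape, a row that has no date simply yields an empty string for that cell instead of making the whole match fail. An HTML parser module would probably be more robust than any regex here, but I have not tried that yet.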