I'm trying to find a specific string within an array element. Since each array element is a string that can contain multiple occurrences of the pattern, I repeatedly match and remove one hit at a time. The algorithm works on a simple example, but when I use it with HTML (which is the purpose of the program) it gets stuck in an infinite while loop.
Here is the (ugly) expression that I'm using:
set expression {\<div\sclass\=\"fileText\"\sid\=\"[^\"]+\"\>File\:\s\<a\s(title\=\"[^\"]+\"\s)?href\=\"([^\"]+)\"\starget\=\"\_blank\"\>([^\<]+)\<\/a\>[^\<]+\<\/div\>};
Here is an element of the array from which I want to extract strings (it contains 2 occurrences of the given expression):
set htmlForParse(0) {file" id="f51456520"><div class="fileText" id="fT51456520">File: <a href="//example.com" target="_blank">48912-arduinouno_r3_front.jpg</a> (1022 KB, 1800x1244)</div><a class="fileThumb" href="//example.com" target="_blank"><img " title="Reply to this post">YesNo?</a></span></div><div class="file" id="f51456769"><div class="fileText" id="fT51456769">File: <a href="//example.com" target="_blank">892991578.jpg</a> (32 KB, 400x422)</div><a class="fileThumb" href="//example.com" target="_blank"><img src};
And here are the loops that I'm using to achieve this:
for {set k 0} {$k < [array size htmlForParse]} {incr k} {
    while {[regexp $expression $htmlForParse($k) exString]} {
        regsub -- $exString $htmlForParse($k) {} htmlForParse($k);
        puts $htmlForParse($k);
    }
}
The purpose of the regsub call is to remove one hit from regexp at a time, until no hits are left and regexp returns 0. At that point the while loop finishes and the next element of the array can be examined. But that doesn't happen: it loops forever, and it seems that regsub does not replace the found string with an empty string (nor with anything else). Why?
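
One likely cause, offered as a reading of the code rather than a confirmed diagnosis: regsub treats its first argument as a regular expression, so the matched text in exString is reinterpreted as a pattern. The hits here contain parentheses, as in "(1022 KB, 1800x1244)", which would then act as a capture group instead of literal characters, so the pattern no longer matches the text it came from. A minimal sketch of the intended loop, assuming the original pattern is reused for the substitution; the pattern and the html value below are simplified stand-ins, not the real data:

```tcl
# Simplified stand-ins for the expression and array element in the question
# (hypothetical values chosen only to keep the sketch short).
set expression {<div class="fileText">[^<]+</div>}
set html {<div class="fileText">a.jpg (32 KB)</div><p>x</p><div class="fileText">b.jpg (1 KB)</div>}

# Substitute with the original pattern, so the matched text (which contains
# regex metacharacters such as parentheses) is never used as a pattern itself.
while {[regexp -- $expression $html exString]} {
    puts "found: $exString"
    # Without -all, regsub removes only the first match, the same one
    # that regexp just reported.
    regsub -- $expression $html {} html
}
puts "left: $html"
```

With these stand-in values the loop should report both hits and leave only the paragraph element behind. If the goal is to remove the matched text literally, string map [list $exString {}] $html is another option, with the caveat that it removes every occurrence of that exact text.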