<tr>
11:15
12:15
13:15
</tr>
<tr>
18:15
19:15
20:15
</tr>
In this case the output should be: [ (11:15, 12:15, 13:15), (18:15, 19:15, 20:15) ]
My pattern: (\d\d:\d\d)[\s\S]*?(\d\d:\d\d)[\s\S]*?(\d\d:\d\d)[\s\S]*?</tr>
works only if there are exactly 3 times in each tr tag.
However, it should also work when there are 1 to 3 times (in the same \d\d:\d\d format) in each tr tag. Here is another example, for which my pattern no longer works:
<tr>12:00 13:00</tr>
<tr>14:00 15:00 16:00</tr>
<tr>12:00</tr>
The output should be: [ (12:00, 13:00, ), (14:00, 15:00, 16:00), (12:00, , ) ]
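One possible way to get this shape is to split the job in two: first isolate each tr block, then collect whatever times it contains and pad the result to three entries. The sketch below assumes Python's re module; the names TR_BLOCK, TIME, times_per_row and the width parameter are made up for illustration and are not part of the original pattern.

```python
import re

# Everything between an opening <tr ...> and the next </tr> (non-greedy).
TR_BLOCK = re.compile(r"<tr\b[^>]*>(.*?)</tr>", re.DOTALL | re.IGNORECASE)
# A time in the \d\d:\d\d format, not glued to other word characters.
TIME = re.compile(r"\b\d\d:\d\d\b")


def times_per_row(html, width=3):
    """Return one tuple per <tr>, padded with '' up to `width` entries."""
    rows = []
    for block in TR_BLOCK.findall(html):
        times = TIME.findall(block)[:width]    # keep at most `width` times
        times += [""] * (width - len(times))   # pad the missing slots
        rows.append(tuple(times))
    return rows


sample = "<tr>12:00 13:00</tr>\n<tr>14:00 15:00 16:00</tr>\n<tr>12:00</tr>"
print(times_per_row(sample))
# [('12:00', '13:00', ''), ('14:00', '15:00', '16:00'), ('12:00', '', '')]
```

Note that a tr block with no times at all comes back as a tuple of three empty strings in this sketch, and that nested tables would confuse the non-greedy tr match; an HTML parser would be more robust if the markup gets more complicated.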
One more thing: the times are not separated only by whitespace. In the real file there is markup between them, which I skip with [\s\S]*? or [\w\s<>="-/:;?|]*?
Each time is either in a simple span or in a longer form (a link with attributes), as in this example:
<tr>
<span class="na">16:00</span>
<span>|</span><a href="http:/21.28.147.68/msi/default.aspx?event_id=52514&typetran=1&ReturnLink=http://www.kino.pl/kina/przedwiosnie/repertuar.php" class="toolBox" data-hasqtip="true" aria-describedby="qtip-0">20:45</td>
</tr>
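If a single pattern is preferred, one option might be to make the second and third groups optional and to stop the "skip anything" part from running past the closing </tr>. The sketch below assumes Python's re module, where findall returns an empty string for a group that did not take part in the match; GAP, TIME and ROW are hypothetical names, and the row used for the check is the example above.

```python
import re

# Skip any characters lazily, but never step over a closing </tr>.
GAP = r"(?:(?!</tr>).)*?"
TIME = r"(\d\d:\d\d)"

# One required time, then up to two more optional ones inside the same <tr>.
ROW = re.compile(
    r"<tr\b[^>]*>" + GAP + TIME + f"(?:{GAP}{TIME})?" + f"(?:{GAP}{TIME})?",
    re.DOTALL,
)

real_row = '''<tr>
<span class="na">16:00</span>
<span>|</span><a href="http:/21.28.147.68/msi/default.aspx?event_id=52514&typetran=1&ReturnLink=http://www.kino.pl/kina/przedwiosnie/repertuar.php" class="toolBox" data-hasqtip="true" aria-describedby="qtip-0">20:45</td>
</tr>'''

print(ROW.findall(real_row))
# [('16:00', '20:45', '')]
```

The negative lookahead is what keeps a row with fewer than three times from borrowing times out of the next row, which plain [\s\S]*? would allow. Rows containing no time at all are simply not matched by this pattern.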