I have the following input:
<table class="fiche_table_caracter"><tbody>
<tr>
<td class="caracteristique"><strong>Design</strong></td>
<td>Classique (full tactile)</td>
</tr>
<tr>
<td class="caracteristique"><strong>Système d'exploitation (OS)</strong></td>
<td>iOS</td>
</tr>
<tr>
<td class="caracteristique"><strong>Ecran</strong></td>
<td>4,7'' (1334 x 750 pixels)<br />16 millions de couleurs</td>
</tr>
<tr>
<td class="caracteristique"><strong>Mémoire interne</strong></td>
<td>128 Go, 1 Go RAM</td>
</tr>
<tr>
<td class="caracteristique"><strong>Appareil photo</strong></td>
<td>8 mégapixels</td>
</tr>
</tbody>
</table>
I need to extract only the content of the <td> tags. This is what I did:
// First pass: capture the content of each <tr>...</tr> row.
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"<tr*>(.*?)</tr>" options:NSRegularExpressionCaseInsensitive error:NULL];
NSArray *myArray = [regex matchesInString:str options:0 range:NSMakeRange(0, [str length])];
UA_log(@"counttt: %lu", (unsigned long)[myArray count]);
NSMutableArray *matches = [NSMutableArray arrayWithCapacity:[myArray count]];
for (NSTextCheckingResult *match in myArray) {
    NSRange matchRange = [match rangeAtIndex:1];
    [matches addObject:[str substringWithRange:matchRange]];
    NSLog(@"Regex output:%@", [matches lastObject]);
    NSString *str2 = [matches lastObject];
    // Second pass: try to capture the content of each <td> cell in the row.
    NSRegularExpression *regex2 = [NSRegularExpression regularExpressionWithPattern:@"<td*>(<strong>)?(.*?)(</strong>)?</td>" options:NSRegularExpressionCaseInsensitive error:NULL];
    NSArray *myArray2 = [regex2 matchesInString:str2 options:0 range:NSMakeRange(0, [str2 length])];
    UA_log(@"counttt: %lu", (unsigned long)[myArray2 count]);
    NSMutableArray *matches2 = [NSMutableArray arrayWithCapacity:[myArray2 count]];
    for (NSTextCheckingResult *match2 in myArray2) {
        NSRange matchRange2 = [match2 rangeAtIndex:1];
        [matches2 addObject:[str2 substringWithRange:matchRange2]];
        NSLog(@"Regex2 output:%@", [matches2 lastObject]);
        NSString *lastObject2 = [matches2 lastObject];
    }
}
The problem is that I want the <strong> tag to be optional, but that doesn't work. With this code I can extract each <tr> row, but not the content of the <td> cells.
Please help!
I would like to extract:
1-
Design
Classique (full tactile)
2-
Système d'exploitation (OS)
iOS
3-
Ecran
16 millions de couleurs
4-
Mémoire interne
128 Go, 1 Go RAM
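
For reference, here is a minimal sketch of one way the <td> pattern could be written so that <strong> is truly optional. It rests on three observations about the code above: `<td*>` means "<t followed by zero or more d", so it cannot match `<td class="caracteristique">`; `rangeAtIndex:1` points at the optional `(<strong>)` group rather than at the text; and `.` does not match line breaks unless NSRegularExpressionDotMatchesLineSeparators is set. The helper name `ExtractCellTexts` is hypothetical, and the sketch assumes the markup is as regular as the sample (no nested tables, at most one <strong> wrapper per cell).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the original code): returns the inner text
// of every <td> cell in `html`, stripping one optional <strong> wrapper.
static NSArray<NSString *> *ExtractCellTexts(NSString *html) {
    // <td\b[^>]*>      opening tag, with or without attributes
    // (?:<strong>)?    optional wrapper; non-capturing, so group 1 is always the text
    // (.*?)            cell content, matched lazily
    // (?:</strong>)?   optional closing wrapper
    NSString *pattern = @"<td\\b[^>]*>\\s*(?:<strong>)?(.*?)(?:</strong>)?\\s*</td>";
    NSRegularExpressionOptions options =
        NSRegularExpressionCaseInsensitive | NSRegularExpressionDotMatchesLineSeparators;

    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:options error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }

    NSMutableArray<NSString *> *cells = [NSMutableArray array];
    [regex enumerateMatchesInString:html
                            options:0
                              range:NSMakeRange(0, html.length)
                         usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop) {
        NSRange textRange = [match rangeAtIndex:1];
        // A group that did not take part in the match reports NSNotFound.
        if (textRange.location != NSNotFound) {
            [cells addObject:[html substringWithRange:textRange]];
        }
    }];
    return cells;
}
```

Because the pattern is applied to the whole string, the separate <tr> pass is not needed here: the returned array alternates label, value, label, value, so pairs can be read at indices i and i + 1. Any markup inside a cell, such as the <br /> in the "Ecran" row, is returned as-is and would need its own clean-up step. For HTML that is less regular than this sample, a real HTML parser is likely a safer choice than a regular expression.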