I want to find all the URL links in a page's HTML source. I am using Hpple to parse the HTML. I know that an XPath query returns the URLs found at the path I give it, but the path has to change for every site, so I cannot search for links in general.
To explain more clearly: take yahoo.com as an example. The page contains links for cricket, weather, sports, mail, news, etc., and I want to find all of those links in the source.
At the moment I give a fixed path and look for the href attribute. If the path is correct, all the URLs under that path are displayed, but URLs elsewhere on the same page (not under that path) are not returned. How should I do this?
    // Check that htmlData contains data before parsing
    if (htmlData)
    {
        TFHpple *htmlParser = [TFHpple hppleWithHTMLData:htmlData];
        // Enter your XPath query here to obtain the data you want from the webpage
        // More info on XPath queries can be found at http://www.w3schools.com/xpath/default.asp
        NSArray *nodes = [htmlParser searchWithXPathQuery:@"//html/head/link"];
        for (TFHppleElement *element in nodes) {
            NSString *href = [element attributes][@"href"];
            if (href) {
                NSLog(@"got it");
                // Only add non-nil values; addObject: throws on nil
                [urlsArray addObject:href];
            } else {
                NSLog(@"not found");
            }
        }
        NSLog(@"urls array = %@", urlsArray);
        // Set the textView text on the view to the result of the HTML parser
        _displayTextView.text = [NSString stringWithFormat:@"%@", urlsArray];
    } else {
        // Display an error if htmlData is not available, e.g. no internet connection
        _displayTextView.text = @"Error - No data";
    }
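A possible direction, as an untested sketch rather than a confirmed solution: in XPath, a leading `//` selects matching elements at any depth, so a query such as `//a[@href]` should select every anchor in the document without a site-specific path. The helper name `AllLinksInHTML` below is hypothetical (it is not part of Hpple), and the sketch assumes the Hpple headers are on the include path.

```objective-c
#import <Foundation/Foundation.h>
#import "TFHpple.h"
#import "TFHppleElement.h"

// Hypothetical helper (not part of Hpple): collects the href value
// of every <a> element in the page, wherever it appears in the tree.
static NSArray<NSString *> *AllLinksInHTML(NSData *htmlData)
{
    NSMutableArray<NSString *> *urls = [NSMutableArray array];
    if (htmlData.length == 0) {
        return urls; // nothing to parse
    }

    TFHpple *parser = [TFHpple hppleWithHTMLData:htmlData];

    // "//a[@href]" matches anchors at any depth that carry an href attribute,
    // unlike "//html/head/link", which only matches <link> elements in <head>.
    NSArray *anchors = [parser searchWithXPathQuery:@"//a[@href]"];

    for (TFHppleElement *anchor in anchors) {
        NSString *href = [anchor objectForKey:@"href"];
        if (href.length > 0) {
            [urls addObject:href];
        }
    }
    return urls;
}
```

It could be called as `NSArray *links = AllLinksInHTML(htmlData);`. The returned values are the raw attribute strings, so relative links would still need to be resolved against the page URL, for example with `+[NSURL URLWithString:relativeToURL:]`.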
Thanks in advance.