1

I have an .html file on a server that I need to parse information from. It's nothing huge, just:

<html>
<body>

<p> some text </p>
<p> some other text </p>

</body>
</html>

Is there a way I can put the text in all the <p> tags in an array as separate objects, so I can show them in a UITableView?

Gabriele Petrioli
Magnus
  • 1
    Look into [NSXMLParser](http://developer.apple.com/library/mac/#documentation/Cocoa/Reference/Foundation/Classes/NSXMLParser_Class/Reference/Reference.html). – sudo rm -rf Jun 18 '11 at 15:12

3 Answers

4

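You can parse it with libxml2. Here is a sample I wrote for you (to build it, link against libxml2 and add its headers, usually `$(SDKROOT)/usr/include/libxml2`, to the header search paths):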

#import <Foundation/Foundation.h>
#import <libxml/HTMLTree.h>
#import <libxml/HTMLparser.h>
#import <libxml/xpath.h>

@interface NSString(HTMLParser)
- (NSArray *)resultWithXPath:(NSString *)xpath;
@end

@implementation NSString(HTMLParser)

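- (NSArray *)resultWithXPath:(NSString *)xpath
{
  // -UTF8String returns a NUL-terminated buffer, which htmlParseDoc needs;
  // the raw bytes of an NSData are not terminated.
  htmlDocPtr doc = htmlParseDoc((xmlChar *)[self UTF8String], "UTF-8");
  if (doc == NULL)
    return nil;

  NSMutableArray *result = nil;
  xmlXPathContextPtr context = xmlXPathNewContext(doc);
  xmlXPathObjectPtr xpathobj = context ? xmlXPathEvalExpression(BAD_CAST [xpath UTF8String], context) : NULL;
  xmlNodeSetPtr nodeset = xpathobj ? xpathobj->nodesetval : NULL;

  if (!xmlXPathNodeSetIsEmpty(nodeset)) {
    result = [NSMutableArray arrayWithCapacity:nodeset->nodeNr];

    for (int i = 0; i < nodeset->nodeNr; i++) {
      // xmlNodeGetContent returns a copy that the caller must release with xmlFree.
      xmlChar *content = xmlNodeGetContent(nodeset->nodeTab[i]);
      if (content == NULL)
        continue;
      NSString *text = [NSString stringWithUTF8String:(const char *)content];
      if (text)
        [result addObject:text];
      xmlFree(content);
    }
  }

  // Free the libxml2 objects on every path, including the empty-result case.
  if (xpathobj) xmlXPathFreeObject(xpathobj);
  if (context) xmlXPathFreeContext(context);
  xmlFreeDoc(doc);

  return result;
}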

@end

int main (int argc, const char * argv[])
{
  // @autoreleasepool works under both ARC and manual reference counting.
  @autoreleasepool {
    NSString *html = @"<html>\
    <body>\
    <p> some text </p>\
    <p> some other text </p>\
    </body>\
    </html>";

    NSArray *result = [html resultWithXPath:@"//p"];
    NSLog(@"result: %@", result);
  }

  return 0;
}
cxa
2

Rather than encourage you to figure out how best to parse the HTML, may I suggest just leaving a static JSON file on your web server instead? There are many JSON parser libraries available for iOS that will allow you to get the data you need.

A side benefit of doing this is that the download will use less bandwidth, the data will be faster to parse, and the resulting code will be less brittle to changes in your data payload.
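As a rough illustration of the JSON route, here is a minimal sketch that assumes the server file is a plain JSON array of strings, such as `["some text", "some other text"]`. The helper name `ParagraphsFromJSONData` is hypothetical, and using Foundation's built-in `NSJSONSerialization` (iOS 5 and later) is my choice for the example; any of the third-party JSON libraries would do the same job.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: turns a JSON array of strings into an NSArray of
// NSString objects. Returns nil if the data is missing, is not valid JSON,
// or does not have an array at the top level.
static NSArray *ParagraphsFromJSONData(NSData *data)
{
  // JSONObjectWithData:options:error: raises an exception for nil data.
  if (data == nil)
    return nil;

  NSError *error = nil;
  id json = [NSJSONSerialization JSONObjectWithData:data options:0 error:&error];
  if (![json isKindOfClass:[NSArray class]])
    return nil;

  // Keep only string entries so the table view never receives an unexpected type.
  NSMutableArray *paragraphs = [NSMutableArray array];
  for (id item in (NSArray *)json) {
    if ([item isKindOfClass:[NSString class]])
      [paragraphs addObject:item];
  }
  return paragraphs;
}
```

Once the file has been downloaded into an `NSData`, the returned array can back the table view directly: its count is the number of rows, and each element is the text of one cell.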

Wayne Hartman
-2

Use a UIWebView in your table view cell. Alternatively, I suggest using the Three20 framework's TTStyledTextLabel, which parses the HTML and displays it properly.

Rahul Vyas
  • 1
    How would putting a webview in my tableviewcell help me to put the text in the p tags in an array? – Magnus Jun 18 '11 at 15:28
  • 2
    DO NOT put a UIWebView in a UITableViewCell. UIWebView is a very heavy object and is not suited for many instances running at once. Besides, I seriously doubt that would even work. Because you're sticking what is essentially a UIScrollView (in the web view) in a cell which is in another UIScrollView (the table view) the scroll intercepts would get confused and it wouldn't work. – Wayne Hartman Jun 18 '11 at 15:52
  • 1
    +1 for `TTStyledTextLabel`: it has a basic XHTML parser and uses the Three20 stylesheet system to do custom `UIView` rendering, even with links. It is how the Facebook app does it... – Vincent Guerci Jun 18 '11 at 15:57