I have some XML that looks like this:

<menu>
    <day name="monday">
        <meal name="BREAKFAST">
            <counter name="Bread">
                <dish>
                    <name>Plain Bagel
                        <info name="Plain Bagel">
                            <serving>1 Serving (90g)</serving>
                            <calories>200</calories>
                            <caloriesFromFat>50</caloriesFromFat>
                        </info>
                    </name>
                </dish>
                <dish>
                    <name>Applesauce Coffee Cake
                        <info name="Applesauce Coffee Cake">
                            <serving>1 Slice-Cut 12 (121g)</serving>
                            <calories>374</calories>
                            <caloriesFromFat>104</caloriesFromFat>
                        </info>
                    </name>
                </dish>
            </counter>
        </meal>
    </day>
</menu>

Now I am trying to get the number of tags under the info tag, which should be three for the first info tag (the one whose name attribute is Plain Bagel).

I am using the Hpple parser for iOS. Here is what I have tried, but I can't quite get it to work:

- (void)getData:(NSData*)factData {
    TFHpple *Parser = [TFHpple hppleWithHTMLData:factData];
    NSString *XpathQueryString = @"//day[@name='monday']/meal[@name='BREAKFAST']/counter[@name='Bread']/dish/name/info[@name='Plain Bagel']";
    NSArray *Nodes = [Parser searchWithXPathQuery:XpathQueryString];
    NSInteger count = Nodes.count;
    NSLog(@"count: %ld", (long)count);
    for (TFHppleElement *element in Nodes) {
        NSLog(@"count inside: %lu", (unsigned long)element.children.count);
    }
}

The first count gives 1, which is right, but count inside gives 7, which is where I get confused. Once I am inside the info tag, I want to loop through each child tag (serving, calories, and caloriesFromFat) and get each tag's text. Why does it give 7?

Thanks for the help in advance.

Rob
iqueqiorio

1 Answer

The issue is that you're using an HTML parser, not an XML parser. From an HTML perspective, you have seven child nodes between the info open and close tags:

  • some text (newline and spaces)
  • serving tag
  • some text (newline and spaces)
  • calories tag
  • some text (newline and spaces)
  • caloriesFromFat tag
  • some text (newline and spaces)

If you iterate through the children objects, you'll see precisely that.
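
To see this for yourself, here is a minimal sketch that logs every child of the matched info node. It assumes factData holds the XML from the question; the function name LogInfoChildren is made up for illustration. I'd expect the whitespace entries to be reported with a tagName of text, but check the output against your own data.

```objc
#import <Foundation/Foundation.h>
#import "TFHpple.h"

// Hypothetical helper: logs each child node of the "Plain Bagel" <info> element.
static void LogInfoChildren(NSData *factData) {
    TFHpple *parser = [TFHpple hppleWithXMLData:factData];
    NSString *query = @"//day[@name='monday']/meal[@name='BREAKFAST']/counter[@name='Bread']/dish/name/info[@name='Plain Bagel']";
    NSArray *nodes = [parser searchWithXPathQuery:query];
    for (TFHppleElement *element in nodes) {
        NSLog(@"children: %lu", (unsigned long)element.children.count);
        for (TFHppleElement *child in element.children) {
            // Whitespace between tags shows up here as its own node.
            NSLog(@"  tagName: %@, content: '%@'", child.tagName, child.content);
        }
    }
}
```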

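If you want only the entries associated with tags, you can check whether the node has children of its own. A tag such as serving has a text child holding its value, whereas the whitespace nodes have no children: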

TFHpple *parser = [TFHpple hppleWithXMLData:factData];
NSString *xpathQueryString = @"//day[@name='monday']/meal[@name='BREAKFAST']/counter[@name='Bread']/dish/name/info[@name='Plain Bagel']";
NSArray *nodes = [parser searchWithXPathQuery:xpathQueryString];
for (TFHppleElement *element in nodes) {
    for (TFHppleElement *child in element.children) {
        if (child.children.count > 0) {  // see if the child, itself, has children
            NSLog(@"  %@: '%@'", child.tagName, child.content);
        }
    }
}

Or you could use a predicate:

TFHpple *parser = [TFHpple hppleWithXMLData:factData];
NSString *xpathQueryString = @"//day[@name='monday']/meal[@name='BREAKFAST']/counter[@name='Bread']/dish/name/info[@name='Plain Bagel']";
NSArray *nodes = [parser searchWithXPathQuery:xpathQueryString];
NSPredicate *predicate = [NSPredicate predicateWithBlock:^BOOL(TFHppleElement *node, NSDictionary *bindings) {
    return node.children.count > 0;
}];
for (TFHppleElement *element in nodes) {
    NSArray *filteredNodes = [element.children filteredArrayUsingPredicate:predicate];
    for (TFHppleElement *child in filteredNodes) {
        NSLog(@"  %@: '%@'", child.tagName, child.content);
    }
}
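
Another option, not used in the code above, is to let XPath do the filtering. In XPath 1.0 a trailing /* step matches only element children, so the whitespace text nodes never reach the result array. A minimal sketch, assuming the same factData; the function name ElementChildrenOfInfo is hypothetical:

```objc
#import <Foundation/Foundation.h>
#import "TFHpple.h"

// Hypothetical helper: returns only the element children of the "Plain Bagel" <info> node.
static NSArray *ElementChildrenOfInfo(NSData *factData) {
    TFHpple *parser = [TFHpple hppleWithXMLData:factData];
    // The final "/*" selects child elements only; text nodes are not matched.
    NSString *query = @"//day[@name='monday']/meal[@name='BREAKFAST']/counter[@name='Bread']/dish/name/info[@name='Plain Bagel']/*";
    return [parser searchWithXPathQuery:query];
}
```

For the XML in the question, the returned array should contain the serving, calories, and caloriesFromFat elements, so its count is the number you were after.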

If you were using a proper XML parser (e.g. NSXMLParser), you wouldn't have to deal with the whitespace between the open and close tags showing up as extra child nodes.
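
As a rough sketch of that approach, here is a delegate that collects the text of each tag inside the Plain Bagel info element. The class name InfoCollector, its properties, and the function CollectPlainBagelInfo are all made up for illustration. Note that NSXMLParser still delivers whitespace through parser:foundCharacters:, but this sketch only keeps characters that arrive while a child tag is open, so the whitespace between tags is dropped.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate: gathers tag name -> text for the children of one <info> element.
@interface InfoCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableDictionary<NSString *, NSString *> *values;
@property (nonatomic, strong) NSMutableString *buffer; // text of the child tag being read
@property (nonatomic, assign) BOOL insideTarget;       // YES while inside the wanted <info>
@end

@implementation InfoCollector

- (instancetype)init {
    if ((self = [super init])) {
        _values = [NSMutableDictionary dictionary];
    }
    return self;
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary<NSString *, NSString *> *)attributeDict {
    if ([elementName isEqualToString:@"info"]) {
        self.insideTarget = [attributeDict[@"name"] isEqualToString:@"Plain Bagel"];
    } else if (self.insideTarget) {
        self.buffer = [NSMutableString string]; // start collecting this child's text
    }
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    // buffer is nil between tags, so whitespace there is silently ignored.
    [self.buffer appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if ([elementName isEqualToString:@"info"]) {
        self.insideTarget = NO;
    } else if (self.insideTarget && self.buffer != nil) {
        self.values[elementName] = [self.buffer copy];
        self.buffer = nil;
    }
}

@end

// Hypothetical entry point: parses factData and returns the collected values, or nil on failure.
static NSDictionary<NSString *, NSString *> *CollectPlainBagelInfo(NSData *factData) {
    NSXMLParser *xmlParser = [[NSXMLParser alloc] initWithData:factData];
    InfoCollector *collector = [[InfoCollector alloc] init];
    xmlParser.delegate = collector;
    return [xmlParser parse] ? [collector.values copy] : nil;
}
```

The count of the returned dictionary gives the number of tags under info, and its keys are the tag names (serving, calories, caloriesFromFat for the sample data).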

Rob
  • Thanks, is there a way to remove the newline characters and spaces before I parse it? – iqueqiorio Mar 01 '15 at 00:59
  • I was going to do something like `if (element != nil) {` and then NSLog inside and see if it logged 3 times but it still did 7, how could I check if that element is a `\n` – iqueqiorio Mar 01 '15 at 01:03
  • You can either look at the `tagName` or the number of sub children it has (I've done the latter, above). Frankly, though, I probably wouldn't use Hpple at all, but rather use `NSXMLParser`. – Rob Mar 01 '15 at 21:42