0

First, I should point out that I'm new to all things PHP, so apologies if anything here is unclear; the more layman-friendly the response, the better. I've been having real trouble parsing an XML file in PHP so that I can populate an HTML table for my website. At the moment I can get the full XML feed into a string, which I can echo and view, and all seems well. I then thought I would be able to use SimpleXML to pick out specific elements and print their content, but I have been unable to do this.

The XML feed will be constantly changing (its structure remaining the same) and is in compressed format. From various sources I've identified the following commands to get my feed into the right format within a string, although I am still unable to print specific elements. I've tried every combination without any luck and suspect I may be barking up the wrong tree. Could someone please point me in the right direction?

$file = fopen("compress.zlib://$url", 'r');
$xmlstr = file_get_contents($url);
$xml = new SimpleXMLElement($url,null,true);

foreach($xml as $name) {
  echo "{$name->awCat}\r\n";
}

Many, many thanks in advance,

Chris

PS The actual feed

Denis de Bernardy
Chris
    Gordon, thanks I'm just trying to do this now without creating the string... I just chose awCat as an example, I'll actually be using other elements :) – Chris May 27 '11 at 11:42

5 Answers

2

Since no one followed my closevote, I think I can just as well put my own comments as an answer:

First of all, SimpleXml can load URIs directly, and it can do so through stream wrappers, so the three calls at the beginning can be shortened to the following (note that you are not using $file at all):

$merchantProductFeed = new SimpleXMLElement("compress.zlib://$url", null, TRUE);
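
To try the stream-wrapper approach without hitting the real feed, here is a minimal sketch that assumes a small made-up XML document, gzips it into a temporary file and loads it back through compress.zlib://. The sample data and the temporary path are hypothetical; 0 is passed for the options argument because newer PHP versions deprecate passing null there.

```php
<?php
// Hypothetical sample data standing in for the real feed.
$sample = '<merchantProductFeed><merchant id="2891">'
        . '<prod id="1"><cat><awCat>Bodycare &amp; Fitness</awCat></cat></prod>'
        . '</merchant></merchantProductFeed>';

// Write a gzip-compressed copy to a temporary file (hypothetical location).
$path = tempnam(sys_get_temp_dir(), 'feed');
file_put_contents($path, gzencode($sample));

// The third argument (true) says the first argument is a path or URL rather
// than XML text; compress.zlib:// decompresses the data while it is read.
$feed = new SimpleXMLElement("compress.zlib://$path", 0, true);

echo $feed->getName(), PHP_EOL; // prints: merchantProductFeed

unlink($path);
```

With the real feed, the temporary path would be replaced by the feed URL, which also requires allow_url_fopen to be enabled.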

To get the values, you can either use the implicit SimpleXml API and drill down to the wanted elements (as shown multiple times elsewhere on the site):

foreach ($merchantProductFeed->merchant->prod as $prod) {
    echo $prod->cat->awCat , PHP_EOL;
}
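
The loop in the question printed nothing because iterating over the root element yields its merchant children, and awCat sits several levels below those. As a rough illustration, the following sketch runs the drill-down against an inline document; the two products are invented, only the element names come from the feed structure discussed here.

```php
<?php
// Hypothetical two-product feed using the element names from the question.
$xmlstr = <<<XML
<merchantProductFeed>
  <merchant id="2891">
    <prod id="1"><text><name>Yoga mat</name></text><cat><awCat>Bodycare &amp; Fitness</awCat></cat></prod>
    <prod id="2"><text><name>Skipping rope</name></text><cat><awCat>Bodycare &amp; Fitness</awCat></cat></prod>
  </merchant>
</merchantProductFeed>
XML;

$merchantProductFeed = new SimpleXMLElement($xmlstr);

$names = [];
foreach ($merchantProductFeed->merchant->prod as $prod) {
    // Each property is a SimpleXMLElement object; cast it to get plain text.
    $names[] = (string) $prod->text->name;
}

echo implode(', ', $names), PHP_EOL; // prints: Yoga mat, Skipping rope
```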

or you can use an XPath query to get at the wanted elements directly:

$xml = new SimpleXMLElement("compress.zlib://$url", null, TRUE);
foreach ($xml->xpath('/merchantProductFeed/merchant/prod/cat/awCat') as $awCat) {
    echo $awCat, PHP_EOL;
}

Note that fetching all awCat elements from the source XML is rather pointless though, because all of them have the value "Bodycare & Fitness". Of course you can also mix XPath and the implicit API: fetch just the prod elements with XPath and then drill down to their various children.

Using XPath should be somewhat faster than iterating over the SimpleXmlElement object graph, though the difference is negligible (think 0.000x vs 0.000y seconds) for your feed. Still, if you plan to do more XML work, it pays off to familiarize yourself with XPath, because it's quite powerful. Think of it as SQL for XML.
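
Mixing the two styles could look roughly like the sketch below: XPath selects the prod elements and the child values are then read as properties. The inline document and its values are invented for illustration; pId and price/buynow are assumed to be children of prod, as in the feed discussed here.

```php
<?php
// Hypothetical sample data; only the element names mirror the real feed.
$xml = new SimpleXMLElement(
    '<merchantProductFeed><merchant>'
    . '<prod><pId>A1</pId><price><buynow>9.99</buynow></price></prod>'
    . '<prod><pId>B2</pId><price><buynow>4.50</buynow></price></prod>'
    . '</merchant></merchantProductFeed>'
);

$rows = [];
// XPath picks out the prod elements; their children are read via properties.
foreach ($xml->xpath('/merchantProductFeed/merchant/prod') as $prod) {
    $rows[] = (string) $prod->pId . ': ' . (string) $prod->price->buynow;
}

echo implode(PHP_EOL, $rows), PHP_EOL;
```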

For additional examples see

Community
Gordon
    Gordon, thanks for the excellent summary. the awCat example was used just as it was the child element I spotted. Aware that they are all the same, I'll be generating the table with the other elements. I'll go for the Xpath option as you mentioned, thanks again... Chris – Chris May 27 '11 at 11:55
1

Try this...

$url = "http://datafeed.api.productserve.com/datafeed/download/apikey/58bc4442611e03a13eca07d83607f851/cid/97,98,142,144,146,129,595,539,147,149,613,626,135,163,168,159,169,161,167,170,137,171,548,174,183,178,179,175,172,623,139,614,189,194,141,205,198,206,203,208,199,204,201,61,62,72,73,71,74,75,76,77,78,79,63,80,82,64,83,84,85,65,86,87,88,90,89,91,67,92,94,33,54,53,57,58,52,603,60,56,66,128,130,133,212,207,209,210,211,68,69,213,216,217,218,219,220,221,223,70,224,225,226,227,228,229,4,5,10,11,537,13,19,15,14,18,6,551,20,21,22,23,24,25,26,7,30,29,32,619,34,8,35,618,40,38,42,43,9,45,46,651,47,49,50,634,230,231,538,235,550,240,239,241,556,245,244,242,521,576,575,577,579,281,283,554,285,555,303,304,286,282,287,288,173,193,637,639,640,642,643,644,641,650,177,379,648,181,645,384,387,646,598,611,391,393,647,395,631,602,570,600,405,187,411,412,413,414,415,416,649,418,419,420,99,100,101,107,110,111,113,114,115,116,118,121,122,127,581,624,123,594,125,421,604,599,422,530,434,532,428,474,475,476,477,423,608,437,438,440,441,442,444,446,447,607,424,451,448,453,449,452,450,425,455,457,459,460,456,458,426,616,463,464,465,466,467,427,625,597,473,469,617,470,429,430,615,483,484,485,487,488,529,596,431,432,489,490,361,633,362,366,367,368,371,369,363,372,373,374,377,375,536,535,364,378,380,381,365,383,385,386,390,392,394,396,397,399,402,404,406,407,540,542,544,546,547,246,558,247,252,559,255,248,256,265,259,632,260,261,262,557,249,266,267,268,269,612,251,277,250,272,270,271,273,561,560,347,348,354,350,352,349,355,356,357,358,359,360,586,590,592,588,591,589,328,629,330,338,493,635,495,507,563,564,567,569,568/mid/2891/columns/merchant_id,merchant_name,aw_product_id,merchant_product_id,product_name,description,category_id,category_name,merchant_category,aw_deep_link,aw_image_url,search_price,delivery_cost,merchant_deep_link,merchant_image_url/format/xml/compression/gzip/";
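$zd = gzopen($url, "r");
$data = false;
if ($zd !== false) {
    // gzread() returns at most the requested number of bytes, so read in
    // chunks until the end of the stream; a single 1,000,000-byte read
    // would silently truncate a larger feed.
    $data = '';
    while (!gzeof($zd)) {
        $chunk = gzread($zd, 8192);
        if ($chunk === false) {
            break; // read error
        }
        $data .= $chunk;
    }
    gzclose($zd);
}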

if ($data !== false) {
    $xml = simplexml_load_string($data);
    foreach ($xml->merchant->prod as $pr) {         
        echo $pr->cat->awCat . "<br>";
    }
}
fire
1
    <?php

    $xmlstr = file_get_contents("compress.zlib://$url"); 
    $xml = simplexml_load_string($xmlstr);

    // you can traverse the XML tree however you want
    foreach ($xml->merchant->prod as $line) {         
        // $line->cat->awCat -> you can use this
    }

more information here

Highstrike
0

Use print_r($xml) to see the structure of the parsed XML feed.

Then it becomes obvious how you would traverse it:

foreach ($xml->merchant->prod as $prod) {
     print $prod->pId;
     print $prod->text->name;
     print $prod->cat->awCat;   # <-- which is what you wanted
     print $prod->price->buynow;
}
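
Since the goal in the question is an HTML table, the same loop can build one. This is only a sketch under assumptions: the single product below is made up, and the column choice (pId, text/name, price/buynow) is just one possibility.

```php
<?php
// Hypothetical sample data; only the element names mirror the real feed.
$xml = new SimpleXMLElement(
    '<merchantProductFeed><merchant>'
    . '<prod><pId>A1</pId><text><name>Mat &amp; strap</name></text>'
    . '<price><buynow>9.99</buynow></price></prod>'
    . '</merchant></merchantProductFeed>'
);

$html = "<table>\n<tr><th>ID</th><th>Name</th><th>Price</th></tr>\n";
foreach ($xml->merchant->prod as $prod) {
    $cells = [$prod->pId, $prod->text->name, $prod->price->buynow];
    $html .= '<tr>';
    foreach ($cells as $cell) {
        // Escape each value so characters such as & or < cannot break the markup.
        $html .= '<td>' . htmlspecialchars((string) $cell) . '</td>';
    }
    $html .= "</tr>\n";
}
$html .= '</table>';

echo $html, PHP_EOL;
```

Escaping with htmlspecialchars() matters because SimpleXML hands back decoded text, so a name containing & would otherwise end up unescaped in the page.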
mario
  • Thanks a lot, Mario. Fire's answer above worked straight off and unfortunately I can't mark both of you as the solution! Chris – Chris May 27 '11 at 11:16
0
$url = 'your url here';
$f = gzopen ($url, 'r');
$xml = new SimpleXMLElement (fread ($f, 1000000));

foreach($xml->xpath ('//prod') as $name) 
{
  echo (string) $name->cat->awCatId, "\r\n";
}
akond
  • Though it's unlikely to make much of a difference in this particular case, using a direct path to the wanted elements, e.g. `/merchantProductFeed/merchant/prod/cat/awCat` instead of `//prod`, should be somewhat faster because the XPath engine doesn't have to search the whole document. – Gordon May 27 '11 at 11:31