I'm working with XML documents returned by an API. The XML returns a list of products, with attributes for each product such as inventory, item number, name, price, etc.

I can loop through all of the XML tables creating lists of all the products with the appropriate fields displayed. The problem is, I need to be able to define certain products and their variables.

How can I create classes, or arrays from the XML products, but only for certain ones? For example, there may be 40 products returned, but I may only need 3 of them. The array or class must contain all the relevant information for the product.

Here's a link to an example of the raw XML returned by the API

An example of the XML is

<diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
<NewDataSet xmlns="">
    <Table diffgr:id="Table1" msdata:rowOrder="0">
        <ChargeDescID>14249</ChargeDescID>
        <sChgDesc>Cleaning Deposit</sChgDesc>
        <dcPrice>0.0000</dcPrice>
        <dcTax1Rate>0</dcTax1Rate>
        <dcTax2Rate>0</dcTax2Rate>
        <dcInStock>0.0000</dcInStock>
    </Table>
    <Table diffgr:id="Table2" msdata:rowOrder="1">
        <ChargeDescID>14251</ChargeDescID>
        <sChgDesc>Utility Knife</sChgDesc>
        <dcPrice>2.9900</dcPrice>
        <dcTax1Rate>9</dcTax1Rate>
        <dcTax2Rate>0</dcTax2Rate>
        <dcInStock>0.0000</dcInStock>
    </Table>
</NewDataSet>
</diffgr:diffgram>

So using the second product in the above code as an example, I'd like a PHP function that will create either a class, or an array like this:

$utility_knife = array(
  "ChargeDescID" => "14251",
  "dcPrice" => "2.99",
  "dcTax1Rate" => "9",
  "dcTax2Rate" => "0",
  "dcInStock" => "0",
);

How can I pull out specific diffgr tables and format them into named arrays or classes, ignoring the tables I don't need? How do I get the contents of a table while ignoring the other tables?

Edit to include what I've attempted:

I've already been able to pull them into an array using loadXML() and DOMXPath as follows:

$dom = new DOMDocument;
$dom->loadXML($result);
$xpath = new DOMXPath($dom);
$el = $xpath->query('//Table');

#loop through results
foreach($el as $units) {

    $ChargeDescID = $xpath->query('ChargeDescID', $units)->item(0)->nodeValue;
    $sChgDesc = $xpath->query('sChgDesc', $units)->item(0)->nodeValue;
    $dcPrice = $xpath->query('dcPrice', $units)->item(0)->nodeValue;
    $dcTax1Rate = $xpath->query('dcTax1Rate', $units)->item(0)->nodeValue;
    $dcTax2Rate = $xpath->query('dcTax2Rate', $units)->item(0)->nodeValue;
    $dcInStock = $xpath->query('dcInStock', $units)->item(0)->nodeValue;

    #create organized array for results
    $iterate_list = array("ChargeDescID"=>$ChargeDescID,"sChgDesc"=>$sChgDesc, "dcPrice"=>$dcPrice, "dcTax1Rate"=>$dcTax1Rate, "dcTax2Rate"=>$dcTax2Rate, "dcInStock"=>$dcInStock);
    #create/append array of array
    $results_list[] = $iterate_list;

}

So now I have an array named $results_list with sub-arrays organized in $key/$value pairs, but the arrays themselves aren't named. I can print out the results in an organized fashion like so:

foreach($results_list as $key => $value) {

    echo
        "Charge Description ID: " . $results_list[$key]["ChargeDescID"] . "<br>" .
        "Item Description: " . $results_list[$key]["sChgDesc"] . "<br>" .
        "Item Price: " . $results_list[$key]["dcPrice"] . "<br>" .
        "Tax rate 1: " . $results_list[$key]["dcTax1Rate"] . "<br>" .
        "tax rate 2: " . $results_list[$key]["dcTax2Rate"] . "<br>" .
        "In Stock: " . $results_list[$key]["dcInStock"] . "<br>"
        ;
}

EDIT: This solution I proposed below worked. I'm currently using it, but I'm open to more elegant solutions. Preferably one that selects the node directly without the need of conditional logic. ThW proposed something that may work. I'm waiting for clarification from him.

The problem is that I can't figure out how to get specific products. I know I can use the array index, but the index may change from one pull to the next. I think I need some sort of function that says something similar to:

foreach($results_list as $key => $value){
  if ( $results_list[$key]["sChgDesc"] == "Utility Knife" ) {
    $utility_knife = array(
      "sChgDesc" => $results_list[$key]["sChgDesc"],
      "dcPrice" => $results_list[$key]["dcPrice"],
      "dcTax1Rate" => $results_list[$key]["dcTax1Rate"],
      "dcInStock" => $results_list[$key]["dcInStock"],
    );
  }
}

and then write out the if statements for each product I need within the array. Is that about right?

I've tried wrapping my head around how to do this so many times now that I'm starting to confuse myself. I wasn't sure if I could call an if statement on one of the sub-array values and then loop back through the rest of the values in that particular sub array if the value exists.

The criteria of what needs to be picked out is that I have to be able to identify which product I'm choosing. So it could be dependent on the ChargeDescID, or the sChgDesc, but not really anything else. I then need to make sure the other relevant fields are populated.
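
A minimal sketch of that lookup, assuming the rows are shaped like the $results_list built above and that ChargeDescID is unique per product. The sample rows are cut down to three fields and $by_id is a name made up for this sketch. array_column() with a null column key keeps each row whole and re-keys the list, so a product is found by its ID rather than by its position:

```php
<?php
// Sample rows shaped like $results_list (only three fields, for brevity).
$results_list = [
  ["ChargeDescID" => "14249", "sChgDesc" => "Cleaning Deposit", "dcPrice" => "0.0000"],
  ["ChargeDescID" => "14251", "sChgDesc" => "Utility Knife", "dcPrice" => "2.9900"],
];

// Re-key the rows by ChargeDescID; array_column() needs PHP 5.5 or later.
// $by_id is a hypothetical name used only in this sketch.
$by_id = array_column($results_list, null, "ChargeDescID");

// Look the product up by ID; null if the API did not return it this time.
$utility_knife = isset($by_id["14251"]) ? $by_id["14251"] : null;

var_dump($utility_knife);
```

The same call with "sChgDesc" as the third argument would key the rows by description instead.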

Longblog
  • What have you tried? There are any number of ways to do this. You have also given no criteria at all for how you would like to pick out which item in the XML you want to translate to objects. – Mike Brant Mar 23 '15 at 20:59
  • Use `simplexml` functions to read your xml dom then create your array. – HddnTHA Mar 23 '15 at 21:00
  • Use an XML parser. Use XPath to get the nodes you're interested in. Like @MikeBrant wrote, there are a thousand ways to do this. – hakre Mar 23 '15 at 21:01
  • These examples / concepts might be useful in your case: [Nested XML to MySQL tables using php](http://stackoverflow.com/q/24707397/367456) and [SimpleXML & PHP: Extract part of XML document & convert as Array](http://stackoverflow.com/q/7015084/367456) - next to that you might be interested in XML namespaces. – hakre Mar 23 '15 at 21:14
  • Hi guys, I have added additional details explaining what I have already tried and how I think I need to go about solving the problem. Any feedback you can give with the additional information will be appreciated. Thanks. – Longblog Mar 23 '15 at 21:57
  • Hey, my "maybe" code I wrote up there worked. lol. I guess I figured it out on my own. Wewt! Is this the most efficient way to do this, or is there a better way? – Longblog Mar 23 '15 at 22:33
  • What do you mean by efficient in context of your question? Can you elaborate how efficiency could be measured in your scenario? In which sense do you feel your code is not efficient for you? – hakre Mar 23 '15 at 23:46
  • I was really asking two questions: "is there a more elegant way to write the code" and "is there a way to write a similar function that processes faster". – Longblog Mar 23 '15 at 23:49

2 Answers

2

You're already using XPath, but it can do a lot more. DOMXPath::query() is limited to returning node lists. Use DOMXPath::evaluate() instead; it can also return scalar values.

For example:

$ChargeDescID = $xpath->query('ChargeDescID', $units)->item(0)->nodeValue;

Can be refactored to:

$ChargeDescID = $xpath->evaluate('string(ChargeDescID)', $units);
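
As a rough illustration of those scalar results, here is a small sketch against a cut-down copy of the question's XML (namespaces left out for brevity). The variable names and the NoSuchElement lookup are made up for the example:

```php
<?php
// Trimmed-down sample; element names are taken from the question's XML.
$xml = <<<'XML'
<NewDataSet>
  <Table>
    <ChargeDescID>14251</ChargeDescID>
    <sChgDesc>Utility Knife</sChgDesc>
    <dcPrice>2.9900</dcPrice>
  </Table>
</NewDataSet>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
$table = $xpath->evaluate('//Table')->item(0);

// string() comes back as a PHP string, number() and count() as floats.
$name  = $xpath->evaluate('string(sChgDesc)', $table);
$price = $xpath->evaluate('number(dcPrice)', $table);
$total = $xpath->evaluate('count(//Table)');

// A missing element gives an empty string, where ->item(0)->nodeValue
// would be a property access on null.
$missing = $xpath->evaluate('string(NoSuchElement)', $table);

var_dump($name, $price, $total, $missing);
```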

XPath can contain complex conditions. Let's say you want to fetch the Table element nodes with the diffgr:id attribute Table1:

$xpath->registerNamespace('diffgr', 'urn:schemas-microsoft-com:xml-diffgram-v1');
$el = $xpath->evaluate('//Table[@diffgr:id="Table1"]');

XPath does not have a default namespace, so if you want to address nodes in a namespace other than the empty namespace (xmlns="") you need to register a prefix for it. This can be the same prefix as in the document or a different one.
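
A short sketch of the prefix point, using a shortened copy of the question's XML. The prefix "dg" is an arbitrary choice for this example; only the namespace URI has to match the document:

```php
<?php
$xml = <<<'XML'
<diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
<NewDataSet xmlns="">
  <Table diffgr:id="Table1" msdata:rowOrder="0"><sChgDesc>Cleaning Deposit</sChgDesc></Table>
  <Table diffgr:id="Table2" msdata:rowOrder="1"><sChgDesc>Utility Knife</sChgDesc></Table>
</NewDataSet>
</diffgr:diffgram>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// "dg" is a made-up prefix; it is bound to the same URI as "diffgr" in the document.
$xpath->registerNamespace('dg', 'urn:schemas-microsoft-com:xml-diffgram-v1');

// The attribute is matched through the registered prefix.
$viaCustomPrefix = $xpath->evaluate('string(//Table[@dg:id="Table2"]/sChgDesc)');

// An unprefixed @id means "id in no namespace", which matches nothing here.
$withoutPrefix = $xpath->evaluate('count(//Table[@id="Table2"])');

var_dump($viaCustomPrefix, $withoutPrefix);
```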

On the other hand, you can fetch nodes by name or more generically: * matches any element node.

$dom = new DOMDocument();
$dom->loadXml($xml);
$xpath = new DOMXPath($dom);
$xpath->registerNamespace('diffgr', 'urn:schemas-microsoft-com:xml-diffgram-v1');
$el = $xpath->evaluate('//Table[@diffgr:id="Table1"]');

$results_list = [];
foreach ($el as $units) {
  $iterate_list = [];
  foreach ($xpath->evaluate('*', $units) as $valueNode) {
    $iterate_list[$valueNode->localName] = $valueNode->nodeValue;
  }
  $results_list[] = $iterate_list;
}    
var_dump($results_list);

Output:

array(1) {
  [0]=>
  array(6) {
    ["ChargeDescID"]=>
    string(5) "14249"
    ["sChgDesc"]=>
    string(16) "Cleaning Deposit"
    ["dcPrice"]=>
    string(6) "0.0000"
    ["dcTax1Rate"]=>
    string(1) "0"
    ["dcTax2Rate"]=>
    string(1) "0"
    ["dcInStock"]=>
    string(6) "0.0000"
  }
}
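
If the diffgr:id values are not stable between pulls, the condition can test a child element instead. What follows is only a sketch under some assumptions: the third product (Packing Tape, ID 14260) is invented so there is a row to ignore, the wanted IDs are hard-coded, and $products is a name chosen for the example. It selects two rows with an "or" condition and keys each result by its ChargeDescID:

```php
<?php
$xml = <<<'XML'
<NewDataSet>
  <Table>
    <ChargeDescID>14249</ChargeDescID>
    <sChgDesc>Cleaning Deposit</sChgDesc>
    <dcPrice>0.0000</dcPrice>
  </Table>
  <Table>
    <ChargeDescID>14251</ChargeDescID>
    <sChgDesc>Utility Knife</sChgDesc>
    <dcPrice>2.9900</dcPrice>
  </Table>
  <Table>
    <!-- hypothetical extra product, not part of the question's data -->
    <ChargeDescID>14260</ChargeDescID>
    <sChgDesc>Packing Tape</sChgDesc>
    <dcPrice>3.4900</dcPrice>
  </Table>
</NewDataSet>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// Select only the wanted rows by the value of a child element.
$wanted = $xpath->evaluate('//Table[ChargeDescID="14249" or ChargeDescID="14251"]');

$products = [];
foreach ($wanted as $table) {
  $fields = [];
  foreach ($xpath->evaluate('*', $table) as $valueNode) {
    $fields[$valueNode->localName] = $valueNode->nodeValue;
  }
  // Key each product by its ID so it can be fetched without another loop.
  $products[$fields['ChargeDescID']] = $fields;
}

$utility_knife = $products['14251'];
var_dump($utility_knife);
```

Matching on sChgDesc works the same way, for example `//Table[sChgDesc="Utility Knife"]`.
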
ThW
  • So under your example, the $results_list isn't really even necessary in this example. Can I define multiple paths for the poorly named $el variable? What I mean is could I have $cleaning_deposit = $xpath->evaluate('//Table[@diffgr:id="Table1"]'), and then another that's $utility_knife= $xpath->evaluate('//Table[@diffgr:id="Table2"]')? Doing this would cause me to have to write multiple loops though. I've already found a solution that works for me, but I'm always open to refactoring to better code. – Longblog Mar 24 '15 at 14:01
  • I've just thought of another possible roadblock to the direct selection model. I don't know how the XML document updates when a new product is entered through the API. If it pushes everything down the hierarchy, then it might cause all of the products to become wrong when something is added. It would still work though if new products are appended to the end of the document. – Longblog Mar 24 '15 at 14:12
  • The XPath can be more complex and select multiple nodes, yes. For example: `//Table[@diffgr:id="Table1" or @diffgr:id="Table2"]`. Whether updates change the result depends on the updates. If the structure/format changes, you will have to adapt the XPath. – ThW Mar 24 '15 at 16:04
-1

Have you already tried something like this:

 $xml = simplexml_load_file('file_xml.xml');

 print '<pre>';

 print_r($xml);

The print_r() call is only used for debugging.
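
Going one step further with SimpleXML, a possible sketch for pulling a single product into an array. It assumes a cut-down copy of the question's XML without the diffgram wrapper, and it uses simplexml_load_string() rather than simplexml_load_file() so it runs without a file on disk:

```php
<?php
$xml = <<<'XML'
<NewDataSet>
  <Table>
    <ChargeDescID>14249</ChargeDescID>
    <sChgDesc>Cleaning Deposit</sChgDesc>
    <dcPrice>0.0000</dcPrice>
  </Table>
  <Table>
    <ChargeDescID>14251</ChargeDescID>
    <sChgDesc>Utility Knife</sChgDesc>
    <dcPrice>2.9900</dcPrice>
  </Table>
</NewDataSet>
XML;

$root = simplexml_load_string($xml);

// xpath() returns an array of matching SimpleXMLElement objects.
$matches = $root->xpath('//Table[sChgDesc="Utility Knife"]');

$utility_knife = [];
if (!empty($matches)) {
  // Iterating children() yields the element name as key and the element as value.
  foreach ($matches[0]->children() as $name => $value) {
    $utility_knife[$name] = (string) $value;
  }
}

print_r($utility_knife);
```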

Geovani Santos