As mentioned in the question title, I am using the code below to try to reach the desired node in an XPath result.

<?php
$xpath = '//*[@id="topsection"]/div[3]/div[2]/div[1]/div/div[1]';          
$html = new DOMDocument();
@$html->loadHTMLFile('http://www.flipkart.com/samsung-galaxy-ace-s5830/p/itmdfndpgz4nbuft');
$xml = simplexml_import_dom($html);   
if (!$xml) {
    echo 'Error while parsing the document';
    exit;
}

$source = $xml->xpath($xpath);
echo "<pre>";
print_r($source);
?>

This is the source code I am using to scrape the price from an e-commerce site. It works and gives the output below:

Array
(
    [0] => SimpleXMLElement Object
        (
            [@attributes] => Array
                (
                    [class] => line
                )

            [div] => SimpleXMLElement Object
                (
                    [@attributes] => Array
                        (
                            [class] => prices
                            [itemprop] => offers
                            [itemscope] => 
                            [itemtype] => http://schema.org/Offer
                        )

                    [span] =>  Rs. 10300
                    [div] => (Prices inclusive of taxes)
                    [meta] => Array
                        (
                            [0] => SimpleXMLElement Object
                                (
                                    [@attributes] => Array
                                        (
                                            [itemprop] => price
                                            [content] => Rs. 10300
                                        )

                                )

                            [1] => SimpleXMLElement Object
                                (
                                    [@attributes] => Array
                                        (
                                            [itemprop] => priceCurrency
                                            [content] => INR
                                        )

                                )

                        )

                )

        )

)

Now, how do I reach [content] => Rs. 10300 directly? I tried:

echo $source[0]['div']['meta']['@attributes']['content']

but it doesn't work.

Paresh Mayani
  • Object properties are accessed by `->`.... – Wrikken Dec 12 '12 at 15:21
  • "It doesn't work" is not a question. At which part in that long line do you hit the road block? Probably [“At sign” @ in SimpleXML object?](http://stackoverflow.com/q/4327873/367456) – hakre Dec 12 '12 at 15:28

2 Answers

Try `echo (string) $source[0]->div->meta[0]['content'];`.

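Basically, when print_r shows an element as an object, you can't reach its children with array keys; you need the object operator `->` for child elements. On a SimpleXMLElement, array syntax is for attributes (string key) or for picking among same-named siblings (integer index).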
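
A minimal sketch of that rule, assuming a hand-written fragment shaped like the print_r output in the question (the fragment and the variable names are made up for illustration; this is not the live page):

```php
<?php
// Hypothetical fragment mimicking the structure from the question.
$xml = simplexml_load_string(
    '<div class="line">'
    . '<div class="prices">'
    . '<span> Rs. 10300</span>'
    . '<meta itemprop="price" content="Rs. 10300"/>'
    . '<meta itemprop="priceCurrency" content="INR"/>'
    . '</div>'
    . '</div>'
);

// Child elements: object operator (->).
// Same-named siblings: integer index.
// Attributes: string key.
$price    = (string) $xml->div->meta[0]['content'];
$currency = (string) $xml->div->meta[1]['content'];

echo $price, "\n";
echo $currency, "\n";
```

Here `$xml` plays the role of `$source[0]` from the question.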

Ranty
  • I tried `echo $source[0]->div->meta[0]->getAttribute('content');` but it gives an error. Or did I misinterpret something? –  Dec 12 '12 at 15:26
  • SCREAM: Error suppression ignored for ( ! ) Fatal error: Call to undefined method SimpleXMLElement::getAttribute() in C:\wamp\www\pom\fk.php on line 14 –  Dec 12 '12 at 15:42
  • Oh right, it's my bad. Too much work with DOM lately. Try `echo (String) $source[0]->div->meta[0]['content'];` – Ranty Dec 12 '12 at 15:43

The print_r output of a SimpleXMLElement does not show the real object structure, so you need to know how its access syntax works:

$source[0]->div->meta['content']
        |    |     |      `- attribute access
        |    |     `- element access, defaults to the first one
        |    `- element access, defaults to the first one
        |
 standard array access to get 
 the first SimpleXMLElement of xpath()
 operation

With your URL, that expression gives the following (print_r again, Demo):

SimpleXMLElement Object
(
    [0] => Rs. 10300
)

Cast it to string in case you want the text-value:

$rs = (string) $source[0]->div->meta['content'];
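
Why the cast matters, as a small sketch: without it the expression is still a SimpleXMLElement, which `echo` converts implicitly but a strict comparison does not. The one-element document below is a made-up stand-in, not the real page:

```php
<?php
// Hypothetical one-element document, for illustration only.
$meta = simplexml_load_string('<meta itemprop="price" content="Rs. 10300"/>');

$attr = $meta['content'];          // still a SimpleXMLElement
$text = (string) $meta['content']; // a plain PHP string

var_dump($attr instanceof SimpleXMLElement); // bool(true)
var_dump(is_string($text));                  // bool(true)
var_dump($text === 'Rs. 10300');             // bool(true)
```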

However, you can also select that node directly with the XPath expression (if there is only a single such case).
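
One way that could look, as a sketch only: it assumes the price `<meta>` carries `itemprop="price"` (as in the print_r output) and that it occurs once. The fragment stands in for the document you get from `simplexml_import_dom()`:

```php
<?php
// Hypothetical fragment; on the real page, reuse $xml from simplexml_import_dom().
$xml = simplexml_load_string(
    '<div class="line"><div class="prices">'
    . '<meta itemprop="price" content="Rs. 10300"/>'
    . '<meta itemprop="priceCurrency" content="INR"/>'
    . '</div></div>'
);

// Select the content attribute of the price <meta> directly.
$hits = $xml->xpath('//meta[@itemprop="price"]/@content');

// xpath() may return an empty array (or a non-array on failure),
// so check before indexing.
$price = $hits ? (string) $hits[0] : null;

echo $price, "\n";
```

Selecting by attribute rather than by position (`div[3]/div[2]/...`) is also less likely to break when the page layout shifts.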

Learn more about how to access a SimpleXMLElement in the Basic SimpleXML usage examples (Docs).

hakre
  • For better replacements for `print_r` in this situation, have a look at https://github.com/IMSoP/simplexml_debug – IMSoP Dec 12 '12 at 18:13
  • @IMSoP: Oh nice. But please don't put software under Creative Commons Licences, that's not fitting well for software. *Edit:* I need to crosslink this to: http://stackoverflow.com/a/8631974/367456 – hakre Dec 12 '12 at 19:34
  • @hakre It's attribution only, no copyleft, so the idea was that just leaving that docblock intact would be enough. I'm no expert though, so not sure if that's legally valid. – IMSoP Dec 13 '12 at 12:54
  • The problem with Creative Commons Licenses is that they were not designed for software. As many developers know this, they have problems with code under it. If you want attribution ("only"), I'd say pick either [MIT](http://spdx.org/licenses/MIT) or [Apache 2.0](http://spdx.org/licenses/Apache-2.0) license (depending on the level of detail you prefer) - Those two licenses are well known, accepted, require keeping copyright headers and are permissive (no copyleft). If you ask me for an opinion, you probably just want [MIT](http://spdx.org/licenses/MIT). ;) – hakre Dec 13 '12 at 12:58
  • @hakre I know CC licences in general can be problematic, but CC-BY is about as minimal as it gets. And unlike the two you link, the URL is explicitly allowed as inclusion of terms, rather than pasting the full boilerplate somewhere. – IMSoP Dec 17 '12 at 16:56
  • @IMSoP: CC-BY is not fitting as well. But don't trust me, consult the CC FAQ: http://wiki.creativecommons.org/Frequently_Asked_Questions#Can_I_apply_a_Creative_Commons_license_to_software.3F - Also I don't want to push you in any direction, if it's for your own good to have that CC-BY license use it. But don't wonder if only users that don't know what they are doing (in licensing meaning) will be the userbase of your software. I just wanted to point that out to you, the decision is yours and I understood you differently earlier. Take care and good luck. – hakre Dec 17 '12 at 16:59
  • @hakre Hm, fair enough; maybe I'll just remove it in favour of goodwill to stop people being put off. I just wanted a more "official"/binding way of saying "please give me credit" really, and a whole doc of legal boilerplate seems overkill. – IMSoP Dec 17 '12 at 18:05
  • MIT requires giving credits. And I can't imagine something shorter, so if you want just place the MIT license as a file called `COPYING` or `LICENSE` into the repository and inside the code you only need to say copyright and see LICENSE in file top comment. It's not allowed to remove author (copyright) credits anyway in most jurisdictions. See [How to Apply a License to Your Software](http://producingoss.com/en/license-quickstart.html#license-quickstart-applying) (there are no rules set in stone) and a further tip: [Using the SPDX License List for Tagging and Linking](http://wp.me/pLEEp-nj4). – hakre Dec 17 '12 at 18:24