
I have a badly formatted XML file that contains content I need, but I can't get PHP's SimpleXMLElement to read it. The syntax of this XML file is supposed to match this one, which is formatted just as it should be.

I keep getting errors such as `Notice: Trying to get property of non-object`.

The data I'm interested in is the first item's title and link.

Thank you in advance!

EDIT: I've tried html_entity_decode, but it didn't solve the problem on its own. I do believe a final solution would require this function, though.

  • Use html_entity_decode() no? – ke20 May 30 '13 at 12:50
  • I tried, but it didn't help. Though I do think the final solution would require it. –  May 30 '13 at 12:54
  • Can you give an example of your broken xml? – Martin Wickman May 30 '13 at 13:33
  • You cannot parse invalid XML using an XML parser; one of the most fundamental principles of XML is that it must be 100% valid in order to parse at all. All XML parsers (that follow the rules) will throw an error if you give them invalid XML. In cases like this, the best option is to complain to the provider of the broken XML. – Spudley May 30 '13 at 13:41
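To see whether the parser is rejecting the feed, as the last comment suggests it would for invalid XML, the libxml errors can be collected instead of being lost in notices. The following is a minimal sketch: `parseXmlOrReport` is a hypothetical helper name, and the two sample strings are made up for illustration.

```php
<?php
// Hypothetical helper: returns the parsed document, or null when the XML is not well-formed.
// Parser messages are written into $errors.
function parseXmlOrReport(string $raw, array &$errors): ?SimpleXMLElement
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = simplexml_load_string($raw); // false when parsing fails

    $errors = [];
    foreach (libxml_get_errors() as $error) {
        $errors[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    // Reset libxml state so later calls start clean.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $doc === false ? null : $doc;
}

// Made-up inputs: the first has a mismatched closing tag, the second is well-formed.
$brokenErrors = [];
$broken = parseXmlOrReport('<rss><channel><title>Unclosed</channel></rss>', $brokenErrors);

$validErrors = [];
$valid = parseXmlOrReport('<rss><channel><title>Closed</title></channel></rss>', $validErrors);

if ($broken === null) {
    echo "Parse failed:\n" . implode("\n", $brokenErrors) . "\n";
}
```

If the helper returns null, any later `->channel->item` access would be operating on a non-object, which is one plausible source of the notice quoted in the question.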

1 Answer


Without seeing your code there's no way of telling what's wrong. In any case, the XML you pointed to is valid. Here's a working example that retrieves the title and link of the first item node, using the XML from the URL you provided.

To reduce the paste size here in SO I formatted the XML a bit and kept only 2 item nodes. See the full working example in http://codepad.viper-7.com/3UPARI.

<?php
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
    <channel>
        <title>Latest CraftBukkit artifacts for Recommended Build</title>
        <link>http://dl.bukkit.org/downloads/craftbukkit/list/rb/</link>
        <description>The latest "CraftBukkit" artifacts for Recommended Build</description>
        <atom:link href="http://dl.bukkit.org/downloads/craftbukkit/feeds/latest-rb.rss" rel="self" />
        <language>en-us</language>
        <lastBuildDate>Thu, 31 Jan 2013 04:37:54 +0000</lastBuildDate>
        <item>
            <title>Recommended Build for CraftBukkit: 1.4.7-R1.0 (build 2624)</title>
            <link>http://dl.bukkit.org/downloads/craftbukkit/view/01845_1.4.7-R1.0/</link>
            <description>&lt;p&gt;This new version is 12.0 MB big.&lt;/p&gt;</description>
            <pubDate>Thu, 31 Jan 2013 04:37:54 +0000</pubDate>
            <guid>http://dl.bukkit.org/downloads/craftbukkit/view/01845_1.4.7-R1.0/</guid>
        </item>
        <item>
            <title>Recommended Build for CraftBukkit: 1.4.5-R1.0 (build 2543)</title>
            <link>http://dl.bukkit.org/downloads/craftbukkit/view/01707_1.4.5-R1.0/</link>
            <description>&lt;p&gt;This new version is 11.9 MB big.&lt;/p&gt;</description>
            <pubDate>Wed, 19 Dec 2012 11:14:13 +0000</pubDate>
            <guid>http://dl.bukkit.org/downloads/craftbukkit/view/01707_1.4.5-R1.0/</guid>
        </item>
    </channel>
</rss>
XML;

// Parse the feed, then print the title and link of the first <item>.
$sxe = new SimpleXMLElement($xml);
echo "Title: {$sxe->channel->item[0]->title}\n";
echo "Link: {$sxe->channel->item[0]->link}\n";

Output

Title: Recommended Build for CraftBukkit: 1.4.7-R1.0 (build 2624)
Link: http://dl.bukkit.org/downloads/craftbukkit/view/01845_1.4.7-R1.0/
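The example above assumes the feed always contains at least one item. When it does not, chaining `->item[0]->title` is the kind of access that produces "Trying to get property of non-object". A guarded variant could look like the sketch below; `firstItem` is a hypothetical helper name and the two small feeds are made up for illustration.

```php
<?php
// Hypothetical helper: returns the first item's title and link, or null if the feed has no item.
function firstItem(SimpleXMLElement $feed): ?array
{
    // isset() on a SimpleXML chain checks for the node without raising notices.
    if (!isset($feed->channel->item[0])) {
        return null;
    }

    $item = $feed->channel->item[0];

    // Cast to string so callers get plain values rather than SimpleXMLElement objects.
    return [
        'title' => (string) $item->title,
        'link'  => (string) $item->link,
    ];
}

// Made-up feeds: one with an item, one without.
$withItem = new SimpleXMLElement(
    '<rss><channel><item><title>A</title><link>http://example.com/a</link></item></channel></rss>'
);
$withoutItem = new SimpleXMLElement('<rss><channel></channel></rss>');

$found = firstItem($withItem);
$missing = firstItem($withoutItem);

echo $found === null ? "No items\n" : "Title: {$found['title']}\nLink: {$found['link']}\n";
```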
Rolando Isidoro
  • Did you use html_entity_decode to get that XML, because that's not what I get at all. –  May 30 '13 at 13:54
  • I didn't use html_entity_decode at all, I simply opened the [URL you provided](http://dl.bukkit.org/downloads/craftbukkit/feeds/latest-rb.rss) in Google Chrome (which is now resulting in an error), and copied the XML from view source. – Rolando Isidoro May 30 '13 at 14:06
  • Something is really weird with your internet settings. First off, I get no 404 message even when I clear my cache. Secondly, the XML looks like [this](http://pastebin.com/VaGAD8Mw). No matter what, PHP is unable to load it at my end. –  May 30 '13 at 14:11
  • Well, that's a game-related website, so I managed to open it once before it was blocked by the firewall :| I created an example with the XML code you provided in pastebin at http://codepad.viper-7.com/3UPARI and it works. The HTML entities I see in the XML are only inside the `description` node, so you just have to run [html_entity_decode](http://php.net/manual/en/function.html-entity-decode.php) if you want to echo those values. – Rolando Isidoro May 30 '13 at 14:26
  • That seems like a valid solution. Could you by chance post this as an answer so I can tick it? –  May 30 '13 at 14:36
  • The XML in the answer is a subset of the original one, just a little tidier so it's readable here on Stack Overflow. Anyway, I provided the link to the full solution, so all the info needed is there... just tick it as it is :D. – Rolando Isidoro May 30 '13 at 14:41
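The html_entity_decode step discussed in the comments can be sketched as follows. Note that the XML parser already resolves XML-level escapes such as `&lt;` when a node is cast to a string, so html_entity_decode only matters for HTML entities that remain in the resulting text. The `&amp;nbsp;` in this sample description is an assumption added for illustration; it does not appear in the feed excerpt above.

```php
<?php
// Made-up <item> whose description holds escaped HTML plus an escaped &nbsp; entity.
$item = new SimpleXMLElement(
    '<item><description>&lt;p&gt;12.0&amp;nbsp;MB&lt;/p&gt;</description></item>'
);

// Casting resolves the XML escapes: &lt; becomes <, &amp;nbsp; becomes &nbsp;.
$parsed = (string) $item->description;

// html_entity_decode turns the remaining HTML entity into a real character (U+00A0).
$decoded = html_entity_decode($parsed, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Optionally drop the markup to get plain text.
$plain = strip_tags($decoded);

echo $plain, "\n";
```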