
I'm parsing XML from a data source we use in our web application and I'm having some trouble accessing data from a specific part in the XML.

First, here's the output of a print_r on what I'm trying to access.

SimpleXMLElement Object
(
    [0] => 
This is the value I'm trying to get

)

Then, here's the XML I'm trying to get.

<entry>    
    <activity:object>
        <activity:object-type>http://activitystrea.ms/schema/1.0/note</activity:object-type>
        <id>542</id>
        <title>
        Title string is a string
        </title>
        <content>
        This is the value I'm trying to get
        </content>
        <link rel="alternate" type="html" href="#"/>
        <link rel="via" type="text/html" href="#"/>
    </activity:object>
</entry>

The content element is what I'm after.

When I access it with $post->xpath('activity:object')[0]->content I end up with what's above.

I've tried using $zero = 0; as well as ->content->{'0'} to access this element, but each time I just get an empty SimpleXML object returned, like below.

SimpleXMLElement Object
(
)

Is there another way to access this that I haven't found yet?

Thanks!

Shawn White

3 Answers


xpath() returns an array of SimpleXMLElement objects, and each SimpleXMLElement has a method that converts it to a string. Try calling that method on the element you get back:

http://php.net/manual/en/simplexmlelement.tostring.php
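As a minimal sketch of that suggestion: __toString() is an instance method, so it has to be called on one element of the array that xpath() returns, not on the array and not statically. The sample document and its namespace URI are assumptions made for illustration, since the snippet in the question does not show its xmlns declarations.

```php
<?php
// Hypothetical sample document. The xmlns:activity URI is an assumption;
// the question's snippet does not show where the prefix is declared.
$xml = <<<XML
<entry xmlns:activity="http://activitystrea.ms/spec/1.0/">
    <activity:object>
        <content>
        This is the value I'm trying to get
        </content>
    </activity:object>
</entry>
XML;

$post = simplexml_load_string($xml);

// xpath() returns an array of SimpleXMLElement objects
$matches = $post->xpath('activity:object');

$text = null;
if (!empty($matches)) {
    // Call __toString() on a single element; trim() drops the
    // whitespace that surrounds the text inside <content>
    $text = trim($matches[0]->content->__toString());
    echo $text, "\n";
}
```

Checking that the array is not empty before indexing it avoids a notice when the path matches nothing.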

Quentin Skousen
  • Unfortunately I just get an error saying I can't run a static method non-statically. I was really hoping it would be simple like this though! – Shawn White Dec 15 '14 at 15:57

You should just be able to access it directly:

$content = $post->xpath('//content');

echo $content[0];

With PHP 5.4 or higher you can dereference the returned array directly:

$content = $post->xpath('//content')[0]; 

Or, if you convert the XML to a string like @kkhugs says, you can use:

/**
 * substr_delimeters
 *
 * a quickly written, untested function to do some string manipulation for
 * not further dedicated and unspecified things, especially abused for use
 * with XML and from http://stackoverflow.com/a/27487534/367456
 *
 * @param string $string
 * @param string $delimeterLeft
 * @param string $delimeterRight
 *
 * @return bool|string
 */
function substr_delimeters($string, $delimeterLeft, $delimeterRight)
{
    if (empty($string) || empty($delimeterLeft) || empty($delimeterRight)) {
        return false;
    }

    $posLeft = stripos($string, $delimeterLeft);
    if ($posLeft === false) {
        return false;
    }

    $posLeft += strlen($delimeterLeft);

    // Search from the end of the left delimiter so that empty content is found too
    $posRight = stripos($string, $delimeterRight, $posLeft);
    if ($posRight === false) {
        return false;
    }

    return substr($string, $posLeft, $posRight - $posLeft);
}

$content = substr_delimeters($xmlString, "<content>", "</content>");
hakre
Demodave
  • Unfortunately that doesn't work either. I just get back an empty object there as well. What I posted is actually just a snippet of my larger XML file (way too big to post here) but there's only one element holding the `activity:object` element, so I need to access that before `content`. Is there some other way to access XML elements with namespaces I don't know of? – Shawn White Dec 15 '14 at 15:48
  • What is your namespace before activity:object? – Demodave Dec 15 '14 at 15:53
  • I just modified the XML in the main post to reflect a more accurate view of the full string. It's just wrapped in `<entry>`. That's why I'm trying to access `<activity:object>` via xpath first. – Shawn White Dec 15 '14 at 15:56
  • @ShawnWhite - modified the code to just go straight to the content via $post->xpath('//content'); – Demodave Dec 15 '14 at 16:12
  • @ShawnWhite, which method did you decide to use? Please comment. – Demodave Dec 15 '14 at 16:27
  • I ended up using the function you posted. Dealing with ATOM XML seems to be a real pain and just grabbing the few things I need using that function works fantastically. Thanks! – Shawn White Dec 15 '14 at 16:34

print_r is always misleading with SimpleXMLElement. Take the output you have, for example:

Code Reference:

$testate = $post->xpath('activity:object')[0]->content;
print_r($testate);

Output:

SimpleXMLElement Object
(
    [0] => 
This is the value I'm trying to get

)

This does not mean that you need array index zero ([0]) to reach the text you're looking for, although using that index does happen to work. Confusing, I know.

However, you're looking for a string value, not the object (value). All you need to do here is to cast to string:

$testate = $post->xpath('activity:object')[0]->content;
$text    = (string) $testate;
           ########

The important part here is really the cast to string. As print_r already suggested to you, using the zero index would work, too:

$text    = (string) $testate[0];

But the zero index is not necessary; it is only internal information.

Important to keep in mind: don't rely on print_r with SimpleXMLElement. It is an object, and print_r merely tells you that it is one and which object type it is; the rest of what it outputs within the parentheses is internal information of that object. It's never the whole picture of the XML you have, even if it first seems so.

So, a big warning here: keep that in mind. Cast to string (or pass the element to string functions like trim()) and you're fine.
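A short sketch of the difference between the object and its string value. The wrapper element name object and the inline XML are made up for illustration; only the content element mirrors the question.

```php
<?php
// Hypothetical stand-in for the question's XML, reduced to the relevant element
$xml = '<object><content>
        This is the value I\'m trying to get
        </content></object>';

$object  = simplexml_load_string($xml);
$content = $object->content;

// $content is still an object, which is why print_r() shows an
// "SimpleXMLElement Object ( ... )" dump instead of a plain string
$isElement = $content instanceof SimpleXMLElement;

// Casting yields the text value, including the surrounding whitespace
$raw = (string) $content;

// String functions such as trim() convert the element to a string as well
$text = trim($content);

echo $text, "\n";
```

Note that $raw still carries the line breaks and indentation from the XML source, so trimming is usually what you want for display.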

Also do not forget to read through Basic SimpleXML usage in the PHP manual.

P.S.: As the other answer shows, you're not the only one having trouble describing this magic nature of SimpleXML.

P.P.S.: You might want to learn about XML namespaces soon; they come into play when element names have colons (perhaps you already did, I cannot tell for sure from your code).
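For reference, here is a rough sketch of two namespace-aware ways to reach the element. Both namespace URIs are assumptions (the usual Atom and Activity Streams ones), because the question's snippet does not include its xmlns declarations, and the atom prefix is an arbitrary name chosen for this example. If the real feed declares a default namespace like this, that may be why a plain //content query comes back empty.

```php
<?php
// Assumed namespace declarations; the question's snippet omits them
$xml = <<<XML
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:activity="http://activitystrea.ms/spec/1.0/">
    <activity:object>
        <id>542</id>
        <content>
        This is the value I'm trying to get
        </content>
    </activity:object>
</entry>
XML;

$entry = simplexml_load_string($xml);

// Option 1: children() with a prefix (second argument true means "prefix")
$object = $entry->children('activity', true)->object;
// <content> is in the default (Atom) namespace, so switch back to it by URI
$viaChildren = trim((string) $object->children('http://www.w3.org/2005/Atom')->content);

// Option 2: XPath with explicitly registered prefixes. XPath 1.0 has no
// notion of a default namespace, so Atom needs a prefix of its own here
$entry->registerXPathNamespace('atom', 'http://www.w3.org/2005/Atom');
$entry->registerXPathNamespace('activity', 'http://activitystrea.ms/spec/1.0/');
$nodes = $entry->xpath('activity:object/atom:content');
$viaXpath = $nodes ? trim((string) $nodes[0]) : null;

echo $viaChildren, "\n", $viaXpath, "\n";
```

Either route ends with the same cast to string; the namespace handling only decides how the element is located.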

Community
hakre