I'm trying to parse a WordPress export with regex so I can import it into a custom blog application. I've tried a number of approaches to get at the data, but none have worked. I have:

<category domain="category" nicename="category-name"><![CDATA[Category Name]]></category>

I want to capture the text inside the CDATA section (here, "Category Name") in every place where <![CDATA[Category Name]]> appears. The match should also require the attribute domain="category"; I do not care what the "nicename" value is. This matters because other <category> elements have domain="post_tag", and I do not want those.

Michael Irigoyen
Chris G
  • possible duplicate of [How do I retrieve element text inside CDATA markup via XPath?](http://stackoverflow.com/questions/568315/how-do-i-retrieve-element-text-inside-cdata-markup-via-xpath) – Marc B Jun 05 '13 at 15:43
  • heck, maybe even truplicate! so what – Funk Forty Niner Jun 05 '13 at 15:45
  • Sorry, not really what I was looking for. I wanted a regex solution, not to use another tool. – Chris G Jun 05 '13 at 15:50
  • @ChrisG Hey man, no need to be sorry. You're `entitled` to ask for help. So what if it's a (groan) *possible* duplicate (BFD). Ask and hopefully, ye shall get answers. Cheers! – Funk Forty Niner Jun 05 '13 at 15:52
  • @ChrisG - A regex solution is really not suited for what you're trying to do. It can work, but it's not very practical. – Expedito Jun 05 '13 at 15:56
  • @PédeLeão yes, that's what it's looking like unfortunately. – Chris G Jun 05 '13 at 16:01

1 Answer

Using SimpleXML:

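// $xml is assumed to hold the XML to parse, as a string
$obj = simplexml_load_string($xml);
foreach ($obj->category as $c) {
    // Keep only <category> elements whose domain attribute is "category"
    if ((string) $c->attributes()->domain === 'category') {
        echo (string) $c; // the string cast returns the CDATA text
    }
}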
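
If the <category> elements sit deeper in the tree, as I'd expect inside the items of a full export, an XPath query is one way to reach them without walking each level by hand. This is a minimal sketch rather than a drop-in import script: the sample XML (including the "some-tag" and "news" entries) is made up for illustration and modeled on the line in the question.

```php
<?php
// Hypothetical sample modeled on the question's markup; a real export is much larger.
$xml = <<<'XML'
<item>
  <category domain="category" nicename="category-name"><![CDATA[Category Name]]></category>
  <category domain="post_tag" nicename="some-tag"><![CDATA[Some Tag]]></category>
  <category domain="category" nicename="news"><![CDATA[News]]></category>
</item>
XML;

$obj = simplexml_load_string($xml);

// Select <category> elements at any depth whose domain attribute is "category".
$names = [];
foreach ($obj->xpath('//category[@domain="category"]') as $c) {
    $names[] = (string) $c; // the string cast returns the CDATA text
}

print_r($names);
```

The attribute filter lives in the XPath expression, so the post_tag elements never enter the loop.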
MrCode
  • Thank you for this. I thought about converting it to a simplexml object, but then I would have to re-write my whole import script. I'm using one that I wrote a while back that was compatible with WP at that time...using regex. Now with the upgrades that they have done, it no longer works correctly. I may have to abandon ship and re-write it anyways if I can't find a regex solution. Thanks again. – Chris G Jun 05 '13 at 15:55
  • @ChrisG yeah as others have said, Regex is not really suited for the complexities of XML. Good luck :) This answer comes to mind: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – MrCode Jun 05 '13 at 16:05
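
For anyone who still needs a regex because of an existing import script, here is a rough sketch of what the question asks for. It is brittle by design. It assumes the attributes are double-quoted, the CDATA section starts immediately after the opening tag, and each element holds exactly one CDATA section. The sample XML and the variable names are made up for illustration.

```php
<?php
// Hypothetical sample modeled on the question's markup.
$xml = <<<'XML'
<category domain="category" nicename="category-name"><![CDATA[Category Name]]></category>
<category domain="post_tag" nicename="some-tag"><![CDATA[Some Tag]]></category>
<category domain="category" nicename="news"><![CDATA[News]]></category>
XML;

// The lookahead requires domain="category" somewhere inside the opening tag,
// whatever the attribute order; group 1 captures the CDATA text.
// The "s" modifier lets "." match newlines inside the CDATA section.
$pattern = '~<category\b(?=[^>]*\bdomain="category")[^>]*><!\[CDATA\[(.*?)\]\]></category>~s';

preg_match_all($pattern, $xml, $matches);
$names = $matches[1];

print_r($names);
```

Any change in how the export is written (single quotes, whitespace before the CDATA section, plain text instead of CDATA) will make this silently miss entries, which is the practical argument for the XML parser.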