Possible Duplicate:
How to parse and process HTML with PHP?

I'm trying to figure out how to get the word or words inside the <a> tag with a regular expression. My content is this:

<li id="menu-item-90" class="menu-item menu-item-type-post_type menu-item-object-page menu-item-90"><a href="http://example.com/">Start</a></li>
<li id="menu-item-484" class="menu-item menu-item-type-custom menu-item-object-custom current-menu-item menu-item-484"><a href="http://example.com/test/">Test</a></li>
<li id="menu-item-375" class="menu-item menu-item-type-post_type menu-item-object-page menu-item-375"><a href="http://example.com/test2/">test number two</a></li>
<li id="menu-item-171" class="menu-item menu-item-type-post_type menu-item-object-page menu-item-171"><a href="http://example.com/test3/">Test 3</a></li>

From the above code I just want to get the following:

  • Start
  • Test
  • test number two
  • Test 3

How do I accomplish that with preg_split and a regular expression on my formatted links? I have tried the following, but my regular expression skills aren't the best. It just outputs an empty array.

$tag = 'a';
$topMenuValues = preg_split('{<'.$tag.'[^>]*>(.*?)</'.$tag.'>}', $topMenuValues);
JohnSmith
  • http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html Read it, understand it, and then use a real solution. – KingCrunch Jul 24 '12 at 13:11
  • Please use an [HTML parser](http://php.net/manual/en/class.domdocument.php) for this. – PeeHaa Jul 24 '12 at 13:15
  • Thanks for closing the topic in a second. I can now see that I haven't provided a good example of the code I ran above. But since the topic is closed, there is no use in me editing it.... – JohnSmith Jul 24 '12 at 13:23

1 Answer

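You're splitting on the entire <a ...>...</a> element, so the delimiter matches the whole thing, link text included. The problem is the (.*?) in the middle: the text you want becomes part of the delimiter and is thrown away. Your delimiter regex should match only the opening and closing tags. Try instead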

'{(<'.$tag.'[^>]*>)|(</'.$tag.'>)}'
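A rough sketch of how that delimiter pattern could be used follows. The $html string is an abbreviated copy of the markup from the question (class attributes dropped), and $parts and $labels are names made up for the example. It leans on an assumption that holds only for this exact markup: because every link is opened and closed in turn, the link texts end up at the odd positions of the split result.

```php
<?php
// Abbreviated copy of the menu markup from the question (class attributes dropped).
$html = <<<HTML
<li id="menu-item-90"><a href="http://example.com/">Start</a></li>
<li id="menu-item-484"><a href="http://example.com/test/">Test</a></li>
<li id="menu-item-375"><a href="http://example.com/test2/">test number two</a></li>
<li id="menu-item-171"><a href="http://example.com/test3/">Test 3</a></li>
HTML;

$tag = 'a';

// Delimiter matches either an opening <a ...> tag or a closing </a> tag,
// but never the text between them.
$pattern = '{(<' . $tag . '[^>]*>)|(</' . $tag . '>)}';

$parts = preg_split($pattern, $html);

// Pieces alternate: markup outside a link, link text, markup outside a link, ...
// so the link texts sit at the odd indexes (assumption: links are never nested
// and no other tag name starts with "a").
$labels = [];
foreach ($parts as $index => $part) {
    if ($index % 2 === 1) {
        $labels[] = $part;
    }
}

print_r($labels);
```

Note that preg_split leaves out the captured delimiters unless PREG_SPLIT_DELIM_CAPTURE is passed, which is why only the text between the tags shows up in $parts.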

That being said, this will only work for HTML shaped exactly like your example above. You should really use an HTML parser:

Robust and Mature HTML Parser for PHP
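For comparison, here is a minimal sketch of the parser route using PHP's built-in DOMDocument (the class linked in the comments under the question). The $html string is again an abbreviated copy of the question's markup, and wrapping it in a <ul> is just an assumption to give the parser a well-formed list.

```php
<?php
// Abbreviated copy of the menu markup from the question (class attributes dropped).
$html = <<<HTML
<li id="menu-item-90"><a href="http://example.com/">Start</a></li>
<li id="menu-item-484"><a href="http://example.com/test/">Test</a></li>
<li id="menu-item-375"><a href="http://example.com/test2/">test number two</a></li>
<li id="menu-item-171"><a href="http://example.com/test3/">Test 3</a></li>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML('<ul>' . $html . '</ul>');
libxml_clear_errors();

// Read the text content of every <a> element, wherever it sits in the tree.
$labels = [];
foreach ($doc->getElementsByTagName('a') as $link) {
    $labels[] = trim($link->textContent);
}

print_r($labels);
```

Unlike the split on tags, this does not depend on the order or nesting of the surrounding markup, only on the links being <a> elements.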

Hans Z
  • Dear everybody downvoting everybody automatically who uses regex to parse html. In limited, specific cases of html where the user knows the exact format of the html, it is acceptable to parse with regex. Now shoo. – Hans Z Jul 24 '12 at 13:18
  • "where the user knows the exact format of the html" sounds just funny :) Just saying... However, everyone who recommends, or even suggests, using regular expressions to parse HTML is a cruel, cruel person :X My 2 cents – KingCrunch Jul 24 '12 at 13:20
  • I agree with Hans Z. HTML and regex can be used together if you know what you are doing. This automated closing is annoying. It is more a habit/religion than a reasoned action. Please stop that. – Erwin Moller Jul 24 '12 at 13:22