How to use a PHP regular expression to extract the URL and text of links?

Question

I do not want to use simple_html_dom. How can I use a PHP regular expression to get the URL parts (1.html, 2.html, 3.html) and the text parts (111, 222, 333) from the following HTML? Thanks.

<p>items</p>
<div>
<ul>
<li><a href="1.html">111</a></li>
<li><a href="2.html">222</a></li>
<li><a href="3.html">333</a></li>
</ul>
</div>

Why don't you want to use a dom parser? It would be the right tool for the job. — Pekka, Feb 25 '11 at 10:44
It's even listed on the index page of simple_html_dom website, under quick start. Did you even try to solve the problem yourself? — Andre Backlund, Feb 25 '11 at 10:47
@abloodywar, I already have the answer with simple_html_dom, but I still want to learn regular expressions. — cj333, Feb 25 '11 at 10:49
possible duplicate of [Best methods to parse HTML](http://stackoverflow.com/questions/3577641/best-methods-to-parse-html) — RobertPitt, Feb 25 '11 at 10:51
@cj333: If you want to **learn** the regular expressionS, then visit http://regular-expressions.info/ and http://stackoverflow.com/questions/89718/is-there-anything-like-regexbuddy-in-the-open-source-world - avoid writing `plzsendtehcodez` questions. — mario, Feb 25 '11 at 10:55
HTML should not be parsed with regex. Ever. If you want to learn regular expressions, learn them on something appropriate. — Lightness Races in Orbit, Feb 25 '11 at 10:59

Savetheinternet · Accepted Answer · 2011-02-25T10:58:06.937

6

By "PHP regular", I'm presuming you mean a PERL regular expression.

preg_match_all('/<li><a href="([^"]+)">(.+?)<\/a><\/li>/', $html, $matches);

Then $matches[1] will have a list of the linked documents and $matches[2] will have the text.
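As a minimal sketch of how this plays out on the markup from the question, the snippet below runs `preg_match_all` and reads both capture groups. It assumes the HTML is already in a string called `$html`. The variable names `$pattern`, `$urls` and `$texts` are illustrative only. The pattern uses the stricter `[^"]+` and `[^<]+` classes suggested in the comments, and `#` as the delimiter so the slashes need no escaping. It only works for markup shaped exactly like the sample.

```php
<?php
// Sample markup copied from the question.
$html = <<<HTML
<p>items</p>
<div>
<ul>
<li><a href="1.html">111</a></li>
<li><a href="2.html">222</a></li>
<li><a href="3.html">333</a></li>
</ul>
</div>
HTML;

// [^"]+ stops at the closing quote of href; [^<]+ stops at the next tag.
$pattern = '#<li><a href="([^"]+)">([^<]+)</a></li>#';

// With the default flags, $matches[0] holds the full matches,
// $matches[1] the first capture group and $matches[2] the second.
preg_match_all($pattern, $html, $matches);

$urls  = $matches[1]; // 1.html, 2.html, 3.html
$texts = $matches[2]; // 111, 222, 333

print_r($urls);
print_r($texts);
```

If the real pages have extra attributes, whitespace, or nested tags inside the links, this pattern will silently miss them, which is the usual argument for a DOM parser.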

edited Feb 25 '11 at 10:58

answered Feb 25 '11 at 10:49


So that is the name: PERL regular expression. Thanks. – cj333 Feb 25 '11 at 10:53

The matches should rather use `.+?` to avoid eating up too much. Even better would be `[^"]+` and `[^<]+` for specificity. – mario Feb 25 '11 at 10:53
By "PERL regular expression", I'm presuming you mean "Perl-Compatible Regular Expressions". – Lightness Races in Orbit Feb 25 '11 at 11:01
