
Possible Duplicate:
How to parse and process HTML with PHP?

Suggestion for a reference question. Stack Overflow has dozens of "How to parse HTML" questions coming in every day. However, it is very difficult to close them as duplicates because most questions deal with the specific scenario presented by the asker. This question is an attempt to build a generic "reference question" that covers all aspects of the issue.

This is an experiment. If such a reference question already exists, let me know and I'll happily remove this one.

My ideal vision is that each of the three questions gets answered separately, and the best answers to each bubble up to the top.

I will be awarding a 200-point bounty to the best answer in each of the three categories two weeks from now, pending discussion of this question on Meta.

Each of these questions has already been answered brilliantly elsewhere, so copy+pasting your own answer to a different question is fine with me.

How do I parse HTML with PHP?

  1. What libraries are there? Which ones use PHP's native DOM, which ones come with their own parsing engine? (Hint: SimpleHTMLDOM)

    1a. I need to find a specific element, but I find it hard to get used to the XPath syntax. Are there any DOM-based libraries that make parsing HTML easier? Please consider making generic real world examples.

  2. Is there a PHP library that enables me to query the DOM using CSS[2/3] selectors, like jQuery does? (Hint: phpQuery) Please consider making generic real world examples.

  3. Bonus question: Why shouldn't I use regular expressions? Please provide a very short answer in layman's terms.

Pekka
    @Pekka Yes, but that's just an excuse for those few people who actually *do* closevote to be as lazy as the ppl who asked the duplicate then. I understand why you feel like that. You know I closevote a lot. A reference question *would* make **my** life easier, but providing the answers here is not a good approach IMO. It's a CW. We don't get any reputation. Give people a chance to earn reputation for their answers. A link collection would be much better suited. – Gordon Sep 06 '10 at 10:09
