
Possible Duplicate:
How to parse and process HTML with PHP?

I am learning PHP. When I have to extract (parse) data from a webpage that has no API available, I use regular expressions or a function that returns the substring between two given strings.

I would like to know if there is a more "professional", easier way to do this, since regular expressions are resource-intensive and, for now, not the easiest thing for me to write.


2 Answers


You should never try to parse XML or HTML with regular expressions. Instead, get yourself a proper parser library and do it the correct way. It might sound like a harder task, but you'll thank yourself in the end.

Parsing could be done using one of the below, or similar resources.
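
As one illustration of the parser-based approach, here is a minimal sketch using PHP's bundled DOM extension (DOMDocument and DOMXPath). Choosing DOMDocument is an assumption made for this example, not necessarily one of the resources this answer originally listed; the function name extract_links and the sample markup are hypothetical.

```php
<?php
// Hypothetical helper: return the href and text of every link in an HTML string.
function extract_links(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect libxml warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Query the parsed tree rather than pattern-matching the raw text.
    $xpath = new DOMXPath($doc);
    $links = [];
    foreach ($xpath->query('//a[@href]') as $anchor) {
        $links[] = [
            'href' => $anchor->getAttribute('href'),
            'text' => trim($anchor->textContent),
        ];
    }
    return $links;
}

// Sample markup, made up for this example.
$html = '<html><body><p>See <a href="/docs">the docs</a> and <a href="/faq">FAQ</a>.</p></body></html>';
print_r(extract_links($html));
```

Because the query runs against the parsed tree, it does not depend on attribute order, quoting style, or line breaks in the source, which is where hand-written regular expressions tend to break.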


The popular and legendary answer about HTML and regular expressions, poetry worth reading:

Filip Roséen - refp

PHP ships with a built-in XML parsing library that you can use in this case. Use file_get_contents to retrieve the HTML page, then parse the result.

XML: http://php.net/manual/en/book.xml.php

file_get_contents: http://php.net/manual/en/function.file-get-contents.php
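
The two pieces linked above might be combined roughly as follows. This is only a sketch under stated assumptions: the helper name extract_hrefs and the sample file are hypothetical, and the XML Parser extension expects well-formed markup (for example XHTML), so typical real-world HTML may fail to parse with it.

```php
<?php
// Hypothetical helper: read a document and collect every <a href="..."> value.
function extract_hrefs(string $source): array
{
    // file_get_contents accepts a local path or, if allow_url_fopen is enabled, a URL.
    $markup = file_get_contents($source);
    if ($markup === false) {
        throw new RuntimeException("Could not read $source");
    }

    $parser = xml_parser_create();
    // Keep tag names in their original case (the default upper-cases them).
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    // Parse the whole document into a flat list of element records.
    $values = [];
    if (xml_parse_into_struct($parser, $markup, $values) !== 1) {
        throw new RuntimeException('Markup is not well-formed XML');
    }

    $hrefs = [];
    foreach ($values as $node) {
        // Closing-tag records carry no attributes, so each link is counted once.
        if ($node['tag'] === 'a' && isset($node['attributes']['href'])) {
            $hrefs[] = $node['attributes']['href'];
        }
    }
    return $hrefs;
}

// Demonstration with a temporary local file standing in for a fetched page.
$path = tempnam(sys_get_temp_dir(), 'page');
file_put_contents($path, '<html><body><a href="/one">One</a> <a href="/two"><b>Two</b></a></body></html>');
print_r(extract_hrefs($path));
unlink($path);
```

To fetch a live page, pass its URL instead of a file path; that requires allow_url_fopen to be enabled in the PHP configuration.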

Daniel Li