How can I screen scrape a particular website? I need to log in to the site first and then scrape the pages behind the login. How can this be done?
Please guide me.
Duplicate: How to implement a web scraper in PHP?
Use curl to log in, and once you're in, use the QueryPath PHP library (querypath.org). You can access DOM elements just like in jQuery, via CSS selectors, and it supports method chaining.
Way better than just using PHP's native XML functions.
It also works as a Drupal extension, but I suppose you could use it in any PHP project.
You want to look at the curl functions: they let you fetch a page from another website. Depending on the site you're logging in to, you can use cookies or HTTP authentication to log in first, then fetch the page you want.
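To make the cookie-based flow concrete, here is a rough sketch in Python rather than PHP, using only the standard library. The URLs and the form field names (`username`, `password`) are made up for illustration; a real site will have its own, and may also require hidden form fields or tokens. In PHP the same roles are played by curl options such as `CURLOPT_POSTFIELDS`, `CURLOPT_COOKIEJAR` and `CURLOPT_COOKIEFILE`.

```python
import http.cookiejar
import urllib.parse
import urllib.request

# Hypothetical URLs; replace with the login form target and the page you want.
LOGIN_URL = "https://example.com/login"
PAGE_URL = "https://example.com/members/data"


def build_session():
    """Return an opener that stores cookies and resends them on later requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(username, password):
    """Build the POST request that submits the (assumed) login form fields."""
    form = urllib.parse.urlencode({"username": username, "password": password})
    # Supplying a body makes urllib send the request as a POST.
    return urllib.request.Request(LOGIN_URL, data=form.encode("utf-8"))


def fetch_after_login(username, password):
    """Log in, then fetch the protected page using the same cookie jar."""
    opener, _jar = build_session()
    # The session cookie set by the login response ends up in the jar.
    opener.open(build_login_request(username, password)).close()
    with opener.open(PAGE_URL) as response:
        return response.read().decode("utf-8", errors="replace")
```

The important part is that both requests go through the same opener, so whatever session cookie the login response sets is sent along with the second request.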
Once you have the page, you're probably best off using regular expressions to scrape the data you want.
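As a small illustration of the regular-expression approach, again sketched in Python: the HTML snippet and class names below are invented, and the pattern is written against that exact markup, so treat it as an example of the idea rather than something to reuse as-is.

```python
import re

# Hypothetical markup standing in for the page you fetched.
SAMPLE_HTML = """
<table>
  <tr><td class="name">Widget</td><td class="price">$4.50</td></tr>
  <tr><td class="name">Gadget</td><td class="price">$12.00</td></tr>
</table>
"""

# One capture group for the name cell, one for the numeric part of the price.
ROW_PATTERN = re.compile(
    r'<td class="name">(.*?)</td>\s*<td class="price">\$([\d.]+)</td>',
    re.DOTALL,
)


def extract_prices(html):
    """Return a list of (name, price) tuples found in the given HTML."""
    return [(name.strip(), float(price)) for name, price in ROW_PATTERN.findall(html)]
```

Patterns like this are tied to the exact markup, so expect to adjust them whenever the site changes its layout.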
You might also want to take a look at BeautifulSoup, a Python library that is supposed to be very good at making bad HTML parseable. It is aimed at things like screen scraping.
How easy it would be to call from PHP, I don't know.