What is the best practice for finding all links on an external website? I had a look around, and it seems JavaScript alone cannot scrape external sites because of the same-origin policy. With PHP I can easily get a website's content with the file_get_contents() function, but how do I extract only the links? Do I need to write a regex, or is there another, better way of doing this?
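A regex is fragile for HTML; PHP's built-in DOMDocument can parse the page and pull out every `<a>` tag instead. Below is a minimal sketch: the URL and HTML snippet are hypothetical, and the page is inlined as a string so the example is self-contained; in practice you would replace it with `$html = file_get_contents($url);`.

```php
<?php
// In real use: $html = file_get_contents('http://example.com/');
// Inline HTML here so the sketch runs without network access.
$html = '<html><body>'
      . '<a href="https://example.com/a">First</a>'
      . '<p><a href="https://example.com/b">Second</a></p>'
      . '</body></html>';

$dom = new DOMDocument();
// Real-world HTML is rarely valid; suppress libxml parse warnings.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

// Collect the href attribute of every anchor element.
$links = [];
foreach ($dom->getElementsByTagName('a') as $anchor) {
    $links[] = $anchor->getAttribute('href');
}

print_r($links);
```

This avoids regex edge cases (attributes in any order, nested tags, unquoted values) because the parser handles the markup for you.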
Charles Ingalls
- Your question is not a duplicate. Anyway, this may help you: http://www.hashbangcode.com/blog/extract-links-html-file-php – peterpeterson Jul 31 '14 at 13:37
- Cheers, [Simple HTML Dom Parser](http://simplehtmldom.sourceforge.net/) seems like what I was looking for! – Charles Ingalls Jul 31 '14 at 13:44