I hope someone can tell me what is wrong. I wrote a parser to extract all the
<a href="blabla">Link</a>
tags. I am testing it on http://www.bbc.co.uk/. The page contains 261 of them, but I only get the first 159. I checked manually and can find every one of them in the page source, yet my resulting array has only 159 elements. What causes this limit?
preg_match_all('/<a\s[^\>]*href\=[\'"]?((?:http\:\/\/)?(?:[_\-a-zA-Z0-9\.]*[_a-zA-Z0-9\.\/]))*[\'"]/', $page, $matches);
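To tell a fixed limit apart from the match being cut short by a PCRE error, a check along these lines might help. This is only a sketch: `count_anchor_matches` is a hypothetical helper name, not part of my parser. It runs the same pattern and reads `preg_last_error()` right after the call.

```php
<?php
// Hypothetical helper: runs the pattern from the question on $page and
// reports the return value, the number of full matches, and the PCRE error code.
function count_anchor_matches(string $page): array
{
    $pattern = '/<a\s[^\>]*href\=[\'"]?((?:http\:\/\/)?(?:[_\-a-zA-Z0-9\.]*[_a-zA-Z0-9\.\/]))*[\'"]/';

    // preg_match_all() returns the number of full matches, or false on failure.
    $returned = preg_match_all($pattern, $page, $matches);

    return [
        'returned' => $returned,
        'found'    => (is_array($matches) && isset($matches[0])) ? count($matches[0]) : 0,
        // PREG_NO_ERROR means PCRE did not report a problem;
        // other values include PREG_BACKTRACK_LIMIT_ERROR.
        'error'    => preg_last_error(),
    ];
}
```

Calling it as `count_anchor_matches($page)` and comparing `error` against `PREG_NO_ERROR` would show whether PCRE reported a problem during matching.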
I checked that curl returns the whole page, from
<html>
to
</html>
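For reference, the fetch and the completeness check look roughly like the sketch below. The function names `fetch_page` and `has_html_bounds` are made up for this post, and the curl options shown are an assumption about a typical setup rather than my exact code.

```php
<?php
// Hypothetical helper: fetches a URL with curl and returns the body as a string.
function fetch_page(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects (assumed setting)

    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('curl failed: ' . curl_error($ch));
    }
    return $body;
}

// Hypothetical helper: true if the text contains an opening <html tag
// somewhere before the last closing </html> tag (case-insensitive).
function has_html_bounds(string $page): bool
{
    $open  = stripos($page, '<html');
    $close = strripos($page, '</html>');
    return $open !== false && $close !== false && $open < $close;
}
```

With these, `has_html_bounds(fetch_page('http://www.bbc.co.uk/'))` is the kind of check I mean when I say the page arrives complete.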
The requirement is to write the parser without any DOM usage, using only curl and regular expressions.