
I want to call file_get_contents on a dynamically generated URL that appears in the source of a page I have already fetched with file_get_contents. For example:

$link = $_POST['link'];
$html = file_get_contents('http://www.somesite.com/keywords='.$link."");
$output = file_get_contents(
//A URL that is in the output of $html
);

Essentially, I want PHP to load an HTML page, follow a link on that page, and grab the resulting source code so I can parse some of it into variables to use later. Any ideas?

WindowsDan
  • Not sure exactly what problem you're running into. Just take `$html` and grab the link that you need. If you don't know how to do that, you'd need to provide a sample of what `file_get_contents('http://www.somesite.com/keywords='.$link."");` returns. – Kolja Feb 25 '15 at 16:29
  • How can you "use it later" without storing anything? – Bang Feb 25 '15 at 16:34
  • @Bang The question says "parse some of it into variables to use later" – Kolja Feb 25 '15 at 19:18

1 Answer


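Use a regex. Note that preg_match() needs pattern delimiters, returns 1 or 0, and writes the matched text into its third argument:

preg_match('~https?://(?:www\.)?[a-z0-9.:][^\s"\'<>]*~i', $html, $matches);
$url = $matches[0] ?? null; // first URL found in $html, or null if there is none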

EDIT:

This is a good explanation :)

What is the best regular expression to check if a string is a valid URL?
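To tie this back to the question, here is a minimal sketch of the two-step fetch: load the first page, pull a URL out of it, then load that URL. The helper names `firstUrlIn` and `fetchLinkedPage` are made up for illustration, and the pattern is a deliberately loose one (any absolute http/https URL up to the next whitespace, quote, or angle bracket), not a full URL validator.

```php
<?php
// Hypothetical helper: returns the first absolute http(s) URL in $html, or null.
function firstUrlIn(string $html): ?string
{
    // "~" as the delimiter avoids escaping the slashes in "://".
    // The match stops at whitespace, a quote, or an angle bracket.
    if (preg_match('~https?://[^\s"\'<>]+~i', $html, $matches) === 1) {
        // URLs inside HTML attributes usually have "&" written as "&amp;".
        return html_entity_decode($matches[0]);
    }
    return null;
}

// Hypothetical helper: fetches $searchUrl, then fetches the first URL found in it.
function fetchLinkedPage(string $searchUrl): ?string
{
    $html = file_get_contents($searchUrl);
    if ($html === false) {
        return null; // first request failed
    }

    $url = firstUrlIn($html);
    if ($url === null) {
        return null; // no link found on the first page
    }

    $output = file_get_contents($url);
    return $output === false ? null : $output;
}
```

With the question's setup, the call would look something like `$output = fetchLinkedPage('http://www.somesite.com/keywords=' . urlencode($_POST['link']));`. The `urlencode()` call is an addition here, so that spaces or special characters in the posted value do not break the request URL.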

Hazonko
  • Won't this match every URL on the page? The page contains lots of them, and I only want one, which is dynamic and only generated when the page is loaded. – WindowsDan Feb 25 '15 at 16:40
  • If you know where the url is, surely you'd be able to target that. Or if you know it'll be the first url on the page just grab the first item in the $url array? – Hazonko Feb 25 '15 at 17:21
  • I know where it'll come up, but I'm not sure how to target it specifically. I know what comes before and after it. – WindowsDan Feb 25 '15 at 17:45
  • @WindowsDan can you post a sample of what this data would look like? – Kolja Feb 25 '15 at 19:23
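
Since no sample of the page was posted, the following is only a sketch of the "I know what comes before and after it" case from the comments: anchor the match on the known surrounding text and capture whatever sits in between. The helper name `textBetween`, the sample markup, and the `id="next"` marker are all invented for illustration; the real page's markers would have to be substituted.

```php
<?php
// Hypothetical helper: returns the text between two known markers, or null if not found.
function textBetween(string $html, string $before, string $after): ?string
{
    // preg_quote() escapes regex metacharacters in the markers; passing "~"
    // makes it escape the delimiter as well. The "s" flag lets "." span newlines.
    $pattern = '~' . preg_quote($before, '~') . '(.*?)' . preg_quote($after, '~') . '~s';

    if (preg_match($pattern, $html, $matches) === 1) {
        // Decode entities such as "&amp;" so the result is usable as a URL.
        return html_entity_decode($matches[1]);
    }
    return null;
}

// Made-up sample markup standing in for the page returned by the first request.
$html = '<div class="result"><a id="next" href="http://www.somesite.com/item?id=42&amp;ref=search">Next</a></div>';

// Everything between the known prefix and the closing quote of the attribute.
$url = textBetween($html, '<a id="next" href="', '"');

echo $url, PHP_EOL; // http://www.somesite.com/item?id=42&ref=search
```

The resulting `$url` can then be passed to a second `file_get_contents()` call. If the markup around the link is not stable enough for string markers, parsing the page with `DOMDocument::loadHTML()` and querying it with `DOMXPath` is the usual more robust alternative.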