-4

Say we load the source code of this question page and want to find the URL that appears alongside "childUrl".

(Alternatively, open this page's source and search for "childUrl".)

<?php
$sites_html = file_get_contents("https://stackoverflow.com/questions/46272862/how-to-find-urls-under-double-quote");
$html = new DOMDocument();
@$html->loadHTML($sites_html);
foreach() {
# now i want here to echo the link alongside "childUrl"
}
?>
Emmanuel

2 Answers

0

Try this

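<?php
// extract() is a PHP built-in and cannot be redeclared, so use another name.
function extract_urls($url)
{
    $sites_html = file_get_contents($url);
    $html = new DOMDocument();
    // @ silences the warnings that malformed real-world HTML triggers.
    @$html->loadHTML($sites_html);

    // loadHTML() only returns a bool, so loop over the parsed <a> elements.
    foreach ($html->getElementsByTagName('a') as $row)
    {
        $href = $row->getAttribute('href');
        // "wanted_url" is a placeholder for the URL you are looking for.
        if ($href == "wanted_url")
        {
            echo $href;
        }
    }
}
?>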
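If the key sits in a JSON blob inside a script element rather than in a link, something along the following lines might be closer to what the question asks for. This is only a sketch: find_json_values() is a made-up helper name, the sample markup is invented, and it assumes the script body is a plain JSON document, which may not hold for the real page.

```php
<?php
// Hypothetical helper: collect every string stored under $key in JSON
// that is embedded in the <script> elements of an HTML string.
function find_json_values($html_source, $key)
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; keep parser warnings out of the output.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html_source);
    libxml_clear_errors();

    $found = array();
    foreach ($doc->getElementsByTagName('script') as $script) {
        // Assumption: the script body is a plain JSON document.
        $data = json_decode($script->textContent, true);
        if (!is_array($data)) {
            continue; // ordinary JavaScript or empty script, skip it
        }
        // array_walk_recursive() visits every leaf value together with its key.
        array_walk_recursive($data, function ($value, $k) use (&$found, $key) {
            if ($k === $key && is_string($value)) {
                $found[] = $value;
            }
        });
    }
    return $found;
}

// Invented sample input standing in for the downloaded page.
$sample = '<html><body><script type="application/json">'
        . '{"items":[{"childUrl":"https:\/\/example.com\/a"},'
        . '{"childUrl":"https:\/\/example.com\/b"}]}'
        . '</script></body></html>';

$urls = find_json_values($sample, 'childUrl');
```

For a live page you would pass the result of file_get_contents($url) instead of $sample. Decoding the JSON avoids having to deal with escaped slashes by hand, at the cost of skipping any script that is not pure JSON.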
  • I got this error: "Catchable fatal error: Object of class DOMDocument could not be converted to string" – Emmanuel Sep 18 '17 at 08:01
  • loadHTML($sites_html); foreach ($html->loadHTML($sites_html) as $row) { if((string)$row=="wanted_url") { echo $row; } } } ?> –  Sep 18 '17 at 08:04
-1

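You can use a regex. Try this code:

// $sites_html is the page source fetched with file_get_contents(), as in the question.
// array() also parses on PHP versions older than 5.4, where [] is a syntax error.
$matches = array();
// \s* tolerates optional whitespace around the colon.
preg_match_all('/"wanted_url"\s*:\s*"([^"]*)"/', $sites_html, $matches);
// $matches[1] holds the text captured between the value's quotes.
foreach ($matches[1] as $match) {
    echo $match . "\n";
}

This prints every URL stored under the "wanted_url" key.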
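As a self-contained illustration of the same idea, here is a sketch that wraps the pattern in a function and runs it on a sample string. The name find_quoted_values() is made up for this example, and the extra json_decode() step assumes the values are JSON-style strings in which slashes may be escaped as \/.

```php
<?php
// Hypothetical helper: return every quoted value that follows "<key>": in $text.
function find_quoted_values($text, $key)
{
    // preg_quote() escapes regex metacharacters in the key.
    // The capture group accepts non-quote, non-backslash characters
    // or a backslash followed by any character (an escape sequence).
    $pattern = '/"' . preg_quote($key, '/') . '"\s*:\s*"((?:[^"\\\\]|\\\\.)*)"/';
    $matches = array();
    preg_match_all($pattern, $text, $matches);

    $values = array();
    foreach ($matches[1] as $raw) {
        // Re-wrap in quotes and decode as a JSON string so that \/ becomes /.
        $decoded = json_decode('"' . $raw . '"');
        // Fall back to the raw text if it is not a valid JSON string.
        $values[] = is_string($decoded) ? $decoded : $raw;
    }
    return $values;
}

// Sample input in the "key": "value" form discussed in the comments.
$text = '"url1": "http://notneeded.com", "wanted_url": "http://wantedurl1.com", '
      . '"wanted_url":"https:\/\/wantedurl3.com"';

$found = find_quoted_values($text, 'wanted_url');
```

To search the page from the question, the call would presumably be find_quoted_values($sites_html, 'childUrl').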

HSLM
  • Parse error: syntax error, unexpected '[', expecting ']' in C:\wamp – Emmanuel Sep 18 '17 at 07:40
  • https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – GordonM Sep 18 '17 at 08:41
  • You said that this document contains "key": "value" pairs. – HSLM Sep 18 '17 at 09:54
  • @Emmanuel try the code now. Make sure that $sites_html contains the URLs in the form you described, i.e. "key": "value". If it does, you can use this code. – HSLM Sep 18 '17 at 09:56
  • @hassan thanks in advance, but it is not working. Could you run the full code and check it? – Emmanuel Sep 18 '17 at 12:32
  • Hi @Emmanuel, this is working code from my terminal: $urls = '"url1": "http://notneeded.com", "url2": "http://notneeded.net", "wanted_url": "http://wantedurl1.com", "wanted_url": "https://wantedurl3.com"'; $matches = [[], []]; preg_match_all('/\"wanted_url\": \"([^\"]*?)\"/', $urls, $matches); foreach($matches[1] as $match) { echo $match . ', '; } Output: http://wantedurl1.com, https://wantedurl3.com, – HSLM Sep 18 '17 at 16:55