Right now I use the following PHP snippet to grab some content from a website and show it in my project, and it works well.

<?php
$lines = file ("https://the-url.com");
for ($i = 1664; $i <= 2325; $i++) {
   echo $lines[$i];
}
?>

But the problem is that the code in the middle of the site sometimes changes, and I have to update the PHP snippet with the new line numbers for the content.

How can I write the PHP code so that I can say I need the content between

<div class="results" id="result_list">

And

<h3 id="alternativen_text" style="float: left; width: 100%;padding-bottom: 6px">

Can I use something like this?

<?php
$lines = file ("https://the-url.com");
preg_match('/<div class="results" id="result_list">(.*?)<\/<h3 id="alternativen_text" style="float: left; width: 100%;padding-bottom: 6px">/s', $lines, $match[1]); {
   echo $match[1];
}
?>
bruno2000

1 Answer

PHP only executes once, at page load. This means the content in the middle of the site will only be grabbed once per page load; to refresh it after that, you would need to use JavaScript instead of PHP.

You can use this code for PHP, though. It takes the content between the <div> and the <h3> and echoes it out (but you can do anything with it).

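function grab_string_between($str, $starting_word, $ending_word)
{
    // Locate the start marker; return an empty string if it is missing.
    $sub_start = strpos($str, $starting_word);
    if ($sub_start === false) {
        return '';
    }
    // Skip past the start marker itself.
    $sub_start += strlen($starting_word);
    // Locate the end marker, searching only after the start marker.
    $sub_end = strpos($str, $ending_word, $sub_start);
    if ($sub_end === false) {
        return '';
    }
    return substr($str, $sub_start, $sub_end - $sub_start);
}

$str = file_get_contents("/to/page.php");
$substring = grab_string_between($str, '<div class="results" id="result_list">', '<h3 id="alternativen_text"');

echo $substring;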

If the content changes multiple times after the page load, though, the only way to do this is with JavaScript.
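
The regex idea from the question can also be made to work. The attempt there passes the array returned by file() to preg_match (which expects a string), has a stray `<\/` before the `<h3`, and uses `$match[1]` as the output argument instead of `$match`. Below is a minimal sketch of a corrected version. The helper name grab_between_regex and the inline sample HTML are assumptions made for illustration and are not part of the original posts; the sample string stands in for the fetched page.

```php
<?php
// Hypothetical helper (not from the original posts): returns the text between
// two literal markers using a regular expression, or null if there is no match.
function grab_between_regex(string $html, string $start, string $end): ?string
{
    // preg_quote() escapes regex metacharacters in the literal markers;
    // its second argument also escapes the '/' delimiter.
    $pattern = '/' . preg_quote($start, '/') . '(.*?)' . preg_quote($end, '/') . '/s';

    // The 's' modifier lets '.' match newlines; '.*?' stops at the first end marker.
    if (preg_match($pattern, $html, $match) === 1) {
        return $match[1];
    }
    return null;
}

// Inline sample standing in for the fetched page (an assumption for illustration).
$html = "<p>header</p>\n"
      . "<div class=\"results\" id=\"result_list\">\n<p>Result A</p>\n</div>\n"
      . "<h3 id=\"alternativen_text\" style=\"float: left\">Alternatives</h3>";

$part = grab_between_regex(
    $html,
    '<div class="results" id="result_list">',
    '<h3 id="alternativen_text"'
);

echo $part ?? 'markers not found';
```

For the real page, the sample string would be replaced by the result of file_get_contents() on the URL, which returns the page as one string rather than an array of lines. Matching only the beginning of the <h3> tag keeps the pattern from breaking if the inline style attribute changes.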

ethry
  • Are you interpreting "sometimes the code in the middle of the site change" as the DOM is updating periodically after load? I interpreted the OP's statement to mean that the HTML they are scraping changes, specifically that lines 1664 through 2325 change. But it is more of an artifact that some change, either in content or in the structure, is changing those lines, just at the HTML level. It might be good to clarify things a bit more, DOM vs HTML – Chris Haas May 11 '22 at 19:57
  • Yeah, I interpreted it as it kept changing multiple times. – ethry May 11 '22 at 19:59
  • After the page load, the content doesn't change anymore. – bruno2000 May 11 '22 at 20:49
  • If I use the code below, it loads the complete content from the site; I need only the part in between. – bruno2000 May 11 '22 at 21:00
  • @bruno2000 Are you sure you're using the right code? It gets the part in between for me – ethry May 12 '22 at 22:13
  • Now it works with your code. Thanks a lot. It solved my problem. Great – bruno2000 May 13 '22 at 11:47