How to get a tag's attribute using PHP Simple HTML DOM Parser

Question

I am using the PHP Simple HTML DOM parser to scrape website data, but I am unable to extract the data I want. I have searched Google and read the documentation but could not solve the issue. The structure of the markup I am trying to scrape looks something like this:

<div id="section1">
   <h1>Some content</h1>
   <p>Some content</p>
   ............
    <<Not fixed number of element>>
   ............
   <script> <<Some script>></script>
   <video>
     <source src="www.exmple.com/34/exmple.mp4">
   </video>
</div>

With JavaScript I can get the value like this:

document.getElementById("section1").getElementsByTagName("source")[0].getAttribute("src");

But when I try the same with the PHP DOM parser, I do not get any data. Here is what my code looks like:

require $_SERVER['DOCUMENT_ROOT'] . '/../lib/simplehtmldom/simple_html_dom.php';

$html_content = get($url); // get() is a cURL helper that returns the website content.
$obj_content = str_get_html($html_content);
$linkURL = $obj_content->getElementById('section1')->find('source', 0)->getAttribute('src');
var_dump($linkURL);

This results in an empty string. I also tried changing the code in various ways, but none of the variations worked; the result was blank every time. However, if I var_dump $obj_content, I get a lot of DOM elements.

I tried to follow these posts from Stack Overflow, which are similar to mine, but they did not help.

Can anyone please help me?

Thank you.

No, the page loads once. Nothing is added dynamically after that. — user7747472, Aug 13 '18 at 16:52
So if you var_dump whatever is returned from your cURL request, do you see this source tag with a value in the src attribute? — WillardSolutions, Aug 13 '18 at 16:55
OK then - look at the HTML from the var_dump, find the #section1 > source[0] path, and see if there's a value in the src attribute. — WillardSolutions, Aug 13 '18 at 17:11
This works: `$dom->getElementById('section1')->find('video', 0)->find('source', 0)->getAttribute('src');` The key is to find the parent ` — drew010, Aug 13 '18 at 18:04
@WillardSolutions, you were correct. The source file URL that I am trying to fetch is actually injected by the JS script above the video tag. By extracting the content of the script tag and stripping away everything else, I got the URL I wanted. — user7747472, Aug 21 '18 at 08:29

score 0 · Accepted Answer · answered Aug 21 '18 at 08:31

The code snippet is fine as it is. The problem was that the URL I was targeting was not in the HTML at the time of page load. It was added by the JavaScript in the <script> tag after the page had loaded, and a plain cURL request does not run that script.
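
For anyone hitting the same problem, here is a minimal sketch of the workaround mentioned in the comments: reading the URL out of the script text instead of the source tag. It uses PHP's built-in DOMDocument and DOMXPath rather than Simple HTML DOM so that it runs without the library. The helper name `extractMp4FromScripts`, the sample markup, and the regular expression are assumptions for illustration only, since the real script on the target page is not shown here.

```php
<?php
// Hypothetical helper (not part of any library): scans the <script> elements
// inside the element with the given id and returns the first quoted string
// ending in .mp4, or null if none is found.
// $containerId is interpolated into the XPath, so pass a trusted value.
function extractMp4FromScripts(string $html, string $containerId): ?string
{
    $dom = new DOMDocument();
    // Scraped pages are rarely valid markup; collect libxml warnings quietly.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $scripts = $xpath->query("//*[@id='$containerId']//script");

    foreach ($scripts as $script) {
        // Assumed pattern: the URL appears as a quoted string ending in .mp4.
        if (preg_match('/["\']([^"\']+\.mp4)["\']/', $script->textContent, $m)) {
            return $m[1];
        }
    }
    return null;
}

// Made-up sample page: the src attribute is only set by JavaScript.
$sample = <<<'HTML'
<div id="section1">
  <h1>Some content</h1>
  <script>document.querySelector("source").src = "www.exmple.com/34/exmple.mp4";</script>
  <video><source></video>
</div>
HTML;

var_dump(extractMp4FromScripts($sample, 'section1'));
```

The pattern has to be adapted to whatever the real script contains. If the URL is assembled at runtime rather than written as a literal string, this approach will not find it, and a headless browser that executes the JavaScript would be needed instead.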

Thank you @WillardSolutions

user7747472
