
I am trying to scrape a few images from a website using the PHP Simple HTML DOM library. Normally the image URL is found in the 'src' attribute, but in this case it is found in 'data-src'. I am having trouble accessing the data-src value and was hoping someone could help me out.

Let's say I have the following HTML:

<section id="imagesContainer">
    <div class="img">
        <img data-src="https://example.com/image1.jpg" alt="imageAlt1">
    </div>
    <div class="img">
        <img data-src="https://example.com/image2.jpg" alt="imageAlt2">
    </div>
    <div class="img">
        <img data-src="https://example.com/image3.jpg" alt="imageAlt3">
    </div>
</section>

I attempted to extract the image URLs from data-src with the following code, but it doesn't return the image URL:

foreach($html->find('#imagesContainer') as $imagesContainer) {
    foreach($imagesContainer->find('img') as $image) {
        echo $image->data-src;
    }
}

How can I extract the image URL from data-src? Is it possible with Simple HTML DOM, or do I need a regex?

user3398797
    You definitely don't need regex. Just use XPath. https://stackoverflow.com/questions/8027323/xpath-get-attribute-value-in-php – CAustin Nov 14 '17 at 20:32

1 Answer


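Many thanks for your suggestion. The original attempt fails because PHP parses $image->data-src as $image->data minus a constant named src, so the hyphenated attribute name is never looked up. Calling getAttribute on the image element works and returns the value of data-src: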

 $image->getAttribute('data-src')
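
For comparison, here is a minimal sketch of the XPath route suggested in the comment, using PHP's built-in DOMDocument and DOMXPath classes instead of Simple HTML DOM. The markup is copied from the question. The variable names ($markup, $doc, $xpath, $urls) and the exact XPath expression are illustrative assumptions, not something taken from the original site.

```php
<?php
// Sample markup from the question; a real scraper would fetch this from the page.
$markup = <<<HTML
<section id="imagesContainer">
    <div class="img">
        <img data-src="https://example.com/image1.jpg" alt="imageAlt1">
    </div>
    <div class="img">
        <img data-src="https://example.com/image2.jpg" alt="imageAlt2">
    </div>
    <div class="img">
        <img data-src="https://example.com/image3.jpg" alt="imageAlt3">
    </div>
</section>
HTML;

// Collect parser warnings internally instead of printing them;
// the libxml HTML parser may complain about HTML5 tags such as <section>.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($markup);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Select every <img> inside #imagesContainer that carries a data-src attribute.
$urls = [];
foreach ($xpath->query('//*[@id="imagesContainer"]//img[@data-src]') as $img) {
    // getAttribute takes the attribute name as a string, so the hyphen is not a problem.
    $urls[] = $img->getAttribute('data-src');
}

print_r($urls);
```

Passing the attribute name as a string is what sidesteps the hyphen issue, regardless of which parser is used.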
user3398797