
I need a little bit of help. I got an assignment for school: I need to write a regular expression script that gets an image (and later uploads it to the database, but that's not the problem). The real problem is that I get an array with all the images from the page, when I should get just one image, namely data-src-l="/WebRoot/products/8020/80203122/bilder/80203122.jpg". This is the complete markup for that image:

  <li>
    <a href="/WebRoot/products/8020/80203122/bilder/80203122.jpg">
      <img
       itemprop="image"
       alt="Jesus Remember Me - Taize Songs (2CD)"
       src="/WebRoot/AsaphNL/Shops/asaphnl/5422/8F43/62EE/D698/EF8E/4DEB/AED5/3B0E/80203122_xs.jpg"
       data-src-xs="/WebRoot/AsaphNL/Shops/asaphnl/5422/8F43/62EE/D698/EF8E/4DEB/AED5/3B0E/80203122_xs.jpg"
       data-src-s="/WebRoot/products/8020/80203122/bilder/80203122_s.jpg"

       data-src-m="/WebRoot/products/8020/80203122/bilder/80203122_m.jpg"

       data-src-l="/WebRoot/products/8020/80203122/bilder/80203122.jpg"
     />
    </a>
  </li>

</ul>

This is my PHP code:

<?php
header('Content-Type: text/html; charset=utf-8');
$url = "http://www.asaphshop.nl/epages/asaphnl.sf/nl_NL/?ObjectPath=/Shops/asaphnl/Products/80203122";
$htmlcode = file_get_contents($url);
$pattern = "/<img\s[^>]*?src\s*=\s*['\"]([^'\"]*?)['\"][^>]*?>/";
preg_match_all($pattern, $htmlcode, $matches);
//print_r ($matches);
$image = ($matches[0]);
$image = str_replace('src="/', 'src="http://www.asaphshop.nl/', $image);
print_r ($image);
?>

UPDATE: The image link must be prefixed with http://www.asaphshop.nl, so that the image is loaded from that site and not from my localhost. If anything is unclear, feel free to ask ;)

Avinash Raj
bananaman

2 Answers

(<img\s[^>]*?data-src-l\s*=\s*['\"])([^'\"]*?['\"])([^>]*?>)

Try this. It matches only the required img tag. Replace with $1http://www.asaphshop.nl$2$3. See the demo:

http://regex101.com/r/wQ1oW3/29
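
As a rough illustration of how this could look in PHP, here is a minimal sketch. It is a variation on the pattern above rather than the exact replacement: it uses preg_match with a single capture group for the data-src-l value and prepends the host by hand. The sample markup is copied from the question (on the live page the string would come from file_get_contents($url)), and the variable names are only illustrative.

```php
<?php
// Sample markup copied from the question; on the live page this string
// would come from file_get_contents($url) instead.
$htmlcode = <<<'HTML'
<img
 itemprop="image"
 alt="Jesus Remember Me - Taize Songs (2CD)"
 src="/WebRoot/AsaphNL/Shops/asaphnl/5422/8F43/62EE/D698/EF8E/4DEB/AED5/3B0E/80203122_xs.jpg"
 data-src-l="/WebRoot/products/8020/80203122/bilder/80203122.jpg"
/>
HTML;

// Capture only the value of the data-src-l attribute of an <img> tag.
$pattern = '/<img\s[^>]*?data-src-l\s*=\s*[\'"]([^\'"]*)[\'"]/';

$image = null;
if (preg_match($pattern, $htmlcode, $matches)) {
    // $matches[1] is the site-relative path; prepend the shop's host.
    $image = 'http://www.asaphshop.nl' . $matches[1];
}

echo $image, PHP_EOL;
```

Because preg_match stops at the first match, this yields a single string instead of the array of every image that preg_match_all returns.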

vks

I need a little bit of help. I got an assignment for school: I need to write a regular expression script that gets an image (and later uploads it to the database, but that's not the problem).

Tell your school that regular expressions are not the best tool for the job.

Sure, there is the argument that modern regular expressions are not strictly regular and can handle tasks such as palindrome matching. But that doesn't mean you should use them here, since doing so will cause a lot of headaches for you and for other developers who might need to work with your code later.

What you should use instead is a proper HTML/XML parser.

Fortunately, PHP has what you need: it's called DOMDocument. Take a look at its getElementsByTagName method, for example. You could use it to retrieve the images, then iterate through their attributes and process them however you want.

Not only is it safer, since you don't have to worry about edge cases, it's also more readable.
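
To make that concrete, here is a minimal sketch of the DOMDocument approach. It assumes the markup from the question (on the live page the string would come from file_get_contents($url)) and reads the data-src-l attribute directly with getAttribute rather than looping over every attribute. The variable names are only illustrative.

```php
<?php
// Sample markup copied from the question; on the live page this string
// would come from file_get_contents($url) instead.
$htmlcode = <<<'HTML'
<li>
  <a href="/WebRoot/products/8020/80203122/bilder/80203122.jpg">
    <img
     itemprop="image"
     alt="Jesus Remember Me - Taize Songs (2CD)"
     src="/WebRoot/AsaphNL/Shops/asaphnl/5422/8F43/62EE/D698/EF8E/4DEB/AED5/3B0E/80203122_xs.jpg"
     data-src-l="/WebRoot/products/8020/80203122/bilder/80203122.jpg"
    />
  </a>
</li>
HTML;

// Real-world HTML is rarely valid, so collect parser warnings
// internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($htmlcode);
libxml_clear_errors();

$image = null;
foreach ($doc->getElementsByTagName('img') as $img) {
    // Take the first <img> that carries a data-src-l attribute.
    if ($img->hasAttribute('data-src-l')) {
        $image = 'http://www.asaphshop.nl' . $img->getAttribute('data-src-l');
        break;
    }
}

echo $image, PHP_EOL;
```

If the page contains several images with a data-src-l attribute, the loop would need an extra condition (for example, checking itemprop="image") to pick the right one.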

Community
rr-