I would like to match all the URLs in a CSS file. I've been using this regex, and it was working very well:

@[^a-z_]{1}url\s*\((?:\'|"|)(.*?)(?:\'|"|)\)@im

Full match: url(https://example/product.png)
Group 1: https://example/product.png

The problem appeared when I found a URL like this:

background-image: url(/uploads/2019/03/product01-image(thumbnail_photo).jpg);


Full match: url(/uploads/2019/03/product01-image(thumbnail_photo)
Group 1: /uploads/2019/03/product01-image(thumbnail_photo
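
A minimal sketch that reproduces the truncated match, using the same pattern and the sample declaration above. The variable names are illustrative only.

```php
<?php
// The original pattern, unchanged.
$pattern = '@[^a-z_]{1}url\s*\((?:\'|"|)(.*?)(?:\'|"|)\)@im';
$css = 'background-image: url(/uploads/2019/03/product01-image(thumbnail_photo).jpg);';

preg_match($pattern, $css, $match);

// The lazy (.*?) stops at the first ")" it reaches,
// which is the one that closes "(thumbnail_photo".
echo $match[1], PHP_EOL; // prints: /uploads/2019/03/product01-image(thumbnail_photo
```

The pattern has no notion of balanced parentheses, so the first closing parenthesis always ends the match.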

I looked at the following topic and tried adapting one of the regexes from it:

preg_match to match src=, background= and url(..)

The result was this:

@(?:url\((?:\"|'|)(.*\.(?:[a-z_]{3}))(?:\"|'|)\))@im

Full match: url(/uploads/2019/03/product01-image(thumbnail_photo).jpg)
Group 1: /uploads/2019/03/product01-image(thumbnail_photo).jpg

At first it seemed to work fine, but it breaks in a situation like this:

.card-thumb__img1{display:block;width:142px;height:62px;background:url(https://example.com/product01.jpg) center center no-repeat;background-size:contain}@media (max-width:1029px).card-thumb__img2{display:block;z-index:1;background:url(https://example.com/product02.jpg) center center no-repeat #000;

Full match: url(https://example.com/product01.jpg) center center no-repeat;background-size:contain}@media (max-width:1029px).card-thumb__img2{display:block;z-index:1;background:url(https://example.com/product02.jpg)
Group 1: https://example.com/product01.jpg) center center no-repeat;background-size:contain}@media (max-width:1029px).card-thumb__img2{display:block;z-index:1;background:url(https://example.com/product02.jpg
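
A minimal sketch of the same over-matching on a shortened version of the minified CSS above. The shortened input and the variable names are assumptions made for illustration.

```php
<?php
// The second pattern, unchanged apart from escaping ' for a PHP string.
$pattern = '@(?:url\((?:\"|\'|)(.*\.(?:[a-z_]{3}))(?:\"|\'|)\))@im';
$css = '.a{background:url(https://example.com/product01.jpg) center}'
     . '.b{background:url(https://example.com/product02.jpg) #000}';

preg_match($pattern, $css, $match);

// The greedy .* runs to the end of the line and backtracks only as far as
// the LAST ".xxx)" it can find, so one match swallows both url(...) tokens.
echo substr_count($match[0], 'url('), PHP_EOL; // prints: 2
```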

How can I solve this and get the expected result for all situations?

Edit: Some types of occurrences that I have to match:

url(https://exemples.com/fonts/lato/lato/lato-regular-webfont.ttf)
src:url(https://exemples.com/fonts/lato/lato-regular-webfont.eot?#iefix)
background:url(https://exemples.com/product/header/img.png)
background:url(/product/header/img.png)
background:url("/product/header/img.png")
background:url('/product/header/img.png')
background:url(/uploads/2019/03/0002-image(thumbnail_product).jpg)
Bruno Andrade
  • Please show us all types of occurrences of URLs in your CSS code. – Tim Biegeleisen Apr 07 '19 at 12:46
  • I'm not certain a browser would even recognise that kind of URL correctly... a bare URL (which is already a potential issue - you should quote those) containing parentheses? I mean what if your filename were `())(())))()()).png`? – Niet the Dark Absol Apr 07 '19 at 12:47
  • This is why regular expressions should not be used as parsers. Use a proper library. A search turned this up: https://github.com/sabberworm/PHP-CSS-Parser – miken32 Apr 08 '19 at 19:19

1 Answer

For your example data, one option is to recurse the first subpattern with (?1) and use a second capturing group for the URL.

The url will be in capturing group 2.

url(\(((?:[^()]+|(?1))+)\))

Explanation

  • url
  • ( First capturing group
    • \( Match ( char
    • ( Second capturing group
      • (?:[^()]+|(?1))+ Match either 1+ characters other than ( or ), or recurse the first subpattern (a balanced pair of parentheses); repeat this alternation 1+ times
    • ) Close second capturing group
    • \) Match ) char
  • ) Close first capturing group
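
As a minimal sketch of the pattern explained above, the snippet below runs it on a shortened minified rule set that contains a nested parenthesis. The sample CSS and the variable names are assumptions for illustration.

```php
<?php
// Group 1 is the balanced "(...)" part that (?1) recurses into;
// group 2 is what sits between the outermost parentheses.
$re = '/url(\(((?:[^()]+|(?1))+)\))/';
$css = '.a{background:url(/uploads/2019/03/0002-image(thumbnail_product).jpg) center}'
     . '.b{background:url(https://example.com/product02.jpg) #000}';

preg_match_all($re, $css, $matches);

// $matches[2] holds the group 2 value of every match, in order.
print_r($matches[2]);
```

Because [^()]+ cannot cross a parenthesis, each match ends at the closing parenthesis that balances its own url(, so the two declarations are reported separately.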

Group 2 will also include any leading and trailing " or ' around the URL. When processing the matches, you could run a second check that uses a capturing group and a backreference to verify that the opening quote is the same as the closing quote.

For example:

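// Group 1 is the balanced "(...)" part; group 2 is its content.
$re = '/url(\(((?:[^()]+|(?1))+)\))/m';
$str = 'background:url("/product/header/img1.png") and background:url("/product/header/img2.png\' and background:url(/product/header/img3.png"))';

preg_match_all($re, $str, $matches, PREG_SET_ORDER);

foreach ($matches as $match) {
    // Keep only values whose opening quote (if any) equals the closing quote
    // and that contain no other quote characters.
    if (preg_match('/^([\'"]?)[^\'"]+\1$/', $match[2])) {
        // Strip the surrounding quotes before printing.
        echo trim($match[2], "'\"") . PHP_EOL;
    }
}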

Result:

/product/header/img1.png
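
The two steps can also be wrapped in one function and run against the kinds of occurrences listed in the question. This is only a sketch: extract_css_urls is a hypothetical helper name, and rejecting values with stray or mismatched quotes is an assumption about the desired behaviour.

```php
<?php
/**
 * Hypothetical helper: returns the URL inside each url(...) token,
 * skipping tokens whose opening and closing quotes do not agree.
 */
function extract_css_urls(string $css): array
{
    $urls = [];
    preg_match_all('/url(\(((?:[^()]+|(?1))+)\))/', $css, $matches, PREG_SET_ORDER);

    foreach ($matches as $match) {
        // Group 1 here is the optional quote, group 2 the bare URL.
        if (preg_match('/^([\'"]?)([^\'"]+)\1$/', $match[2], $parts)) {
            $urls[] = $parts[2];
        }
    }

    return $urls;
}

$css = implode(";\n", [
    'src:url(https://exemples.com/fonts/lato/lato-regular-webfont.eot?#iefix)',
    'background:url("/product/header/img.png")',
    'background:url(\'/product/header/img.png\')',
    'background:url(/uploads/2019/03/0002-image(thumbnail_product).jpg)',
]);

print_r(extract_css_urls($css));
```

For this input the function returns four URLs, with the quotes removed from the second and third.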
The fourth bird