0

I'd like to be able to return an array listing all images (the src="" values) found in an HTML string:

[0] = "images/header.jpg" [1] = "images/person.jpg"

Is there a regular expression that can do this?

Many thanks in advance!

Amien
  • possible duplicate of [Best methods to parse HTML](http://stackoverflow.com/questions/3577641/best-methods-to-parse-html) – Pekka Apr 19 '11 at 18:40

4 Answers

3

Welcome to the world of the millionth "how to extract these values using regex" question ;-) I suggest using the search tool before asking; here are just a handful of topics that provide code to do exactly what you need:

Community
Gary Green
1

Here is a more polished version of the regular expression provided by Håvard:

/(?<=src=")[^"]+(?=")/

This expression uses lookahead and lookbehind assertions so that each match contains only the attribute value, without the surrounding src=" and closing quote.

$str = '<img src="/img/001.jpg"><img src="/img/002.jpg">';

preg_match_all('/(?<=src=")[^"]+(?=")/', $str, $srcs, PREG_PATTERN_ORDER);

print_r($srcs);

The output will look like the following:

Array
(
    [0] => Array
        (
            [0] => /img/001.jpg
            [1] => /img/002.jpg
        )

)
DrupalFever
  • Using this src regex also returns JS files in my case (for example: [28] => /modules/jcarousel/assets/carousel_start.js) – Viktors May 05 '15 at 09:36
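The behaviour Viktors describes can be reproduced with a small sketch. The sample markup is made up for illustration (only the script path is taken from the comment above). It suggests that the lookaround pattern matches every src attribute, whatever tag it sits on:

```php
<?php
// Hypothetical sample markup: one <img> tag and one <script> tag.
$html = '<img src="/img/001.jpg">'
      . '<script src="/modules/jcarousel/assets/carousel_start.js"></script>';

// The lookbehind only requires that src=" precedes the value,
// so the tag name is never checked.
preg_match_all('/(?<=src=")[^"]+(?=")/', $html, $srcs);

// Both the image path and the script path end up in $srcs[0].
print_r($srcs[0]);
```

If only images are wanted, the pattern has to be anchored to the <img tag in some way, or the result filtered afterwards.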
1

I see that many people struggle with Håvard's answer and the <script> issue. Here is the same solution in a stricter form:

<img.*?src="([^"]+)".*?>

Example:

preg_match_all('/<img.*?src="([^"]+)".*?>/', '<img src="lol"><img src="wat">', $arr, PREG_PATTERN_ORDER);

Returns:

Array
(
    [0] => Array
        (
            [0] => <img src="lol">
            [1] => <img src="wat">
        )

    [1] => Array
        (
            [0] => lol
            [1] => wat
        )

)

This prevents other tags from being matched. HERE is an example.
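One caveat, shown as a minimal sketch on made-up markup: because `.*?` can run past the closing `>` of a tag, an <img> without a src that is followed by a <script src="..."> can still produce a match. The second pattern, which uses `[^>]*` to stay inside a single tag, is my own assumption and not part of the answer above.

```php
<?php
// Hypothetical markup: an <img> without src, a <script>, then a real image.
$html = '<img alt="no source"><script src="app.js"></script><img class="a" src="lol">';

// Pattern from the answer: .*? may cross the closing > of the first tag.
preg_match_all('/<img.*?src="([^"]+)".*?>/', $html, $loose);

// Variant (an assumption, not from the answer): [^>]* cannot leave the tag.
preg_match_all('/<img[^>]*?src="([^"]+)"[^>]*>/', $html, $strict);

print_r($loose[1]);  // includes app.js, picked up from the <script> tag
print_r($strict[1]); // includes only lol
```

Neither pattern handles single-quoted or unquoted attribute values, so both remain approximations.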

Ivijan Stefan Stipić
1

/src="([^"]+)"/

The image path will be in capture group 1.

Example:

preg_match_all('/src="([^"]+)"/', '<img src="lol"><img src="wat">', $arr, PREG_PATTERN_ORDER);

Returns:

Array
(
    [0] => Array
        (
            [0] => src="lol"
            [1] => src="wat"
        )

    [1] => Array
        (
            [0] => lol
            [1] => wat
        )

)
Håvard
  • 2
    `` — Regular expressions do not play well with HTML, overly simplistic regular expressions less so. – Quentin Apr 19 '11 at 18:49
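In line with Quentin's comment and the duplicate linked under the question, here is a sketch of the parser-based alternative, using PHP's bundled DOM extension instead of a regular expression. None of the answers above proposes this; the sample markup is hypothetical and the code assumes the dom extension is enabled.

```php
<?php
// Hypothetical sample markup: two images with a <script> tag between them.
$html = '<img src="/img/001.jpg"><script src="app.js"></script><img src="/img/002.jpg">';

// Collect parser warnings internally instead of printing them;
// real-world HTML is rarely well-formed.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$srcs = [];
foreach ($doc->getElementsByTagName('img') as $img) {
    // getAttribute() returns an empty string when the attribute is missing.
    $src = $img->getAttribute('src');
    if ($src !== '') {
        $srcs[] = $src;
    }
}

// Only the two image paths are collected; the <script> src is ignored.
print_r($srcs);
```

Because the tag name is checked by the parser, <script> sources are skipped, and quoting style or attribute order no longer matters.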