0

For example, I have this HTML snippet:

<a href="/sites/all/themes/code.php">some text</a>

How can I cut the text /sites/all/themes/code.php out of the href attribute with preg_replace()? Which pattern could I use?

Peter Boughton
Alexander Kim
  • Possible Dupe: http://stackoverflow.com/questions/2792900/regex-javascript-to-match-href – chown Oct 01 '11 at 19:13

3 Answers

3

I would strongly recommend against using regular expressions to parse any SGML derivative.

For HTML, use a DOM parser. For PHP specifically, there is DOMDocument.
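As a rough illustration of the DOM approach, here is a minimal sketch that reads the href from the question's snippet and then blanks it. The variable names ($html, $doc, $link, $href, $result) are only for illustration, and it assumes the first <a> element is the one you want.

```php
<?php
// Sketch only: $html holds the snippet from the question.
$html = '<a href="/sites/all/themes/code.php">some text</a>';

$doc = new DOMDocument();
$doc->loadHTML($html);                      // parse the fragment as HTML

// Assumption: the first <a> element is the link of interest.
$link = $doc->getElementsByTagName('a')->item(0);

$href = $link->getAttribute('href');        // read the current value
$link->setAttribute('href', '');            // "cut" the value out of the attribute

$result = $doc->saveHTML($link);            // serialize only the <a> element

echo $href, "\n";
echo $result, "\n";
```

Unlike a pattern, this does not depend on attribute order, quoting style, or whitespace inside the tag.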

Alexander Olsson
0

pattern:

(<a .*?href=")([^"]*)

replacement: $1 (this keeps everything up to and including href=" and drops the attribute value)
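For what it's worth, this is roughly how that pattern and replacement could be passed to preg_replace(). The ~ delimiters and the variable names are my own choice for the sketch; any delimiter that does not occur in the pattern would do.

```php
<?php
// Sketch only: $html holds the snippet from the question.
$html = '<a href="/sites/all/themes/code.php">some text</a>';

// Group 1 captures everything up to and including href=",
// group 2 captures the attribute value itself.
$pattern = '~(<a .*?href=")([^"]*)~';

// Replacing the whole match with group 1 drops the attribute value.
$result = preg_replace($pattern, '$1', $html);

echo $result, "\n"; // <a href="">some text</a>
```

Note that this only handles double-quoted href values, which is all the question's snippet needs.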

Alexey
0

You don't have to do a "replace" at all.

(?<=<a href=")[^"]*(?=">)

matches exactly the text you want.

Test with grep:

kent$ echo '<a href="/sites/all/themes/code.php">some text</a>' | grep -oP '(?<=<a href=")[^"]*(?=">)'
/sites/all/themes/code.php
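The same lookaround pattern should also work in PHP, since grep -P and preg_* both use PCRE. Here is a minimal sketch with preg_match(); the ~ delimiters and the variable names are assumptions made for the example.

```php
<?php
// Sketch only: $html holds the snippet from the question.
$html = '<a href="/sites/all/themes/code.php">some text</a>';

// The lookbehind and lookahead anchor the match without being part of it.
$pattern = '~(?<=<a href=")[^"]*(?=">)~';

$href = null;
if (preg_match($pattern, $html, $m)) {
    $href = $m[0];                 // the whole match is just the href value
}
echo $href, "\n"; // /sites/all/themes/code.php

// If the value should be removed rather than extracted,
// the same pattern can be replaced with an empty string.
$result = preg_replace($pattern, '', $html);
echo $result, "\n"; // <a href="">some text</a>
```

Because the lookbehind is the literal text <a href=", this only matches when href is the first attribute and is followed directly by ">.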
Kent