
Extract text from URL?

I am trying this pattern with preg_match:

/\<a href=([^"]*) .?\>([^\<\/a]*)\<\/a\>+/

It does not work on:

<a href="_first.asp?FileName=37479676820111216064143">        
<font size="2" face="Tahoma">
TEXT I WANT TO EXTRACT
</font>
</a>

I am sure there is something wrong with ([^\<\/a]*). I am just too bad at regex and cannot even find a good tutorial!

Oliver Charlesworth
Rami Dabain

2 Answers


At the very start, you have href=, then any number of non-quotes (which is zero in your example, since the next character is a quote), and then a space. That is where your expression fails: the next character in the input is a quote, not a space.
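
To make that failure concrete, here is a minimal sketch that runs the pattern from the question against a shortened, single-line version of the link (the shortened input and the variable names are my own, for illustration only). It also checks the part the question suspects: `[^\<\/a]` is a character class, so it excludes the individual characters `<`, `/` and `a`, not the sequence `</a`.

```php
<?php
// Pattern copied verbatim from the question.
$pattern = '/\<a href=([^"]*) .?\>([^\<\/a]*)\<\/a\>+/';

// Shortened, single-line input (an assumption made for this demo).
$input = '<a href="_first.asp?FileName=37479676820111216064143">TEXT</a>';

// After `href=` the pattern allows only non-quote characters and then
// requires a literal space, but the input has a double quote there.
$matched = preg_match($pattern, $input);
var_dump($matched); // int(0)

// The character class rejects any text containing a lowercase "a".
$classOnly = '/^[^\<\/a]*$/';
$upper = preg_match($classOnly, 'TEXT');           // no "<", "/" or "a"
$lower = preg_match($classOnly, 'text with an a'); // contains "a"
var_dump($upper, $lower); // int(1), int(0)
```

So even with the `href` part fixed, the second group would stop at the first lowercase `a` in the link text.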

In any case, while this is doable with regexps as long as the structure of the markup doesn't change, regexps are not really the right tool for pulling data out of HTML.
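
One possible alternative, as a rough sketch only: PHP's DOM extension can parse the markup and hand back the link text directly. This assumes the `dom` extension is enabled; the `$html` and `$links` variables are placeholders of mine, not something from the question.

```php
<?php
// Sample markup from the question.
$html = <<<'HTML'
<a href="_first.asp?FileName=37479676820111216064143">
<font size="2" face="Tahoma">
TEXT I WANT TO EXTRACT
</font>
</a>
HTML;

// Collect libxml parse warnings internally instead of printing them,
// since real-world HTML is often malformed.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Gather the href and the whitespace-trimmed text of every <a> element.
$links = [];
foreach ($doc->getElementsByTagName('a') as $a) {
    $links[] = [
        'href' => $a->getAttribute('href'),
        'text' => trim($a->textContent),
    ];
}

print_r($links);
```

Because `textContent` concatenates the text of all descendants, the nested `<font>` tag does not need any special handling here.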

Community
Amadan

Maybe:

/^<a[^>]+>(?:\s*<[^>]+>)*\s*([^<]+)(?:\s*<\/[^>]+>)*\s*<\/a>$/m

will work?
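
For what it's worth, here is a small sketch of how that pattern could be tried with preg_match on the markup from the question (the `$html`, `$pattern` and `$text` names are just placeholders). Two things to keep in mind: the capture `[^<]+` also picks up the newline before `</font>`, so it needs trimming, and the `^`/`$` anchors with the `m` flag assume the `<a` starts a line and the `</a>` ends one.

```php
<?php
// Sample markup from the question.
$html = <<<'HTML'
<a href="_first.asp?FileName=37479676820111216064143">
<font size="2" face="Tahoma">
TEXT I WANT TO EXTRACT
</font>
</a>
HTML;

// Pattern from this answer, unchanged.
$pattern = '/^<a[^>]+>(?:\s*<[^>]+>)*\s*([^<]+)(?:\s*<\/[^>]+>)*\s*<\/a>$/m';

$text = null;
if (preg_match($pattern, $html, $m)) {
    // Group 1 ends with the newline that precedes </font>; trim it off.
    $text = trim($m[1]);
}

var_dump($text);
```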

fge