0
$text = file_get_contents('http://www.example.com/file.php?id=name');
echo preg_replace('#<a.*?>.*?</a>#i', '', $text);

The URL returns this content:

text text text. <br><a href='http://www.example.com' target='_blank' title='title' style='text-decoration:none;'>name</a>

What is the problem with this script?

Brian Tompsett - 汤莱恩
Adrian

4 Answers

3

You can't parse HTML with regular expressions. Use an XML/HTML parser.
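As a rough sketch of what that could look like with PHP's bundled DOM extension, using the sample markup from the question. The helper name remove_links and the wrapper div are my own illustration, not something from the original post, and real pages may need extra care with character encoding:

```php
<?php
// Sample markup copied from the question; normally this would come from file_get_contents().
$html = "text text text. <br><a href='http://www.example.com' target='_blank' title='title' style='text-decoration:none;'>name</a>";

// Hypothetical helper (name is illustrative): removes every <a> element,
// including its contents, from an HTML fragment.
function remove_links(string $html): string
{
    $doc = new DOMDocument();

    // Real-world HTML is often malformed; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a div so its content is easy to find again afterwards.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Copy the live node list into an array before removing nodes from the tree.
    $links = iterator_to_array($doc->getElementsByTagName('a'));
    foreach ($links as $a) {
        $a->parentNode->removeChild($a);
    }

    // The wrapper is the first div in document order; serialize only its children.
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo remove_links($html);
```

Unlike a regex, this keeps working when the link contains nested tags or spans several lines.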

Community
Williham Totland
1

Tempted to flag your question, but there's no option for "Report user for summoning Cthulhu"

I'd recommend reading: http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html

Regex is poorly suited to parsing HTML and was never intended for it. That's why there are HTML parsing libraries. Find and use one for PHP. :)

Robbie
0

Use strip_tags this way:

$t = 'http://yoururl.com/test1.php';
$t1 = file_get_contents($t);
$text = strip_tags($t1);

This should strip every tag, links included, from the page you are reading (the text inside the tags is kept). Check the reference as well, since it may not work for complicated elements: http://php.net/manual/en/function.strip-tags.php
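To see what strip_tags actually does to the sample markup from the question, here is a small sketch (the variable names are just for illustration). It removes the tags themselves but leaves the link text in place, which may or may not be what the asker wants:

```php
<?php
// Sample markup copied from the question.
$html = "text text text. <br><a href='http://www.example.com' target='_blank' title='title' style='text-decoration:none;'>name</a>";

// strip_tags() drops the <br> and <a> tags but keeps the text between them.
$plain = strip_tags($html);

echo $plain; // text text text. name
```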

0

Use <a[^>]+>[^<]*</a> (works fine as long as there's just text and no tags inside the a element).
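A minimal sketch of that pattern dropped into preg_replace, with the same # delimiters and i flag the question uses. The second input is a made-up example showing the limitation mentioned above:

```php
<?php
// Sample markup copied from the question.
$html = "text text text. <br><a href='http://www.example.com' target='_blank' title='title' style='text-decoration:none;'>name</a>";

// Matches an opening <a ...> tag, text containing no '<', then the closing </a>.
$pattern = '#<a[^>]+>[^<]*</a>#i';

$result = preg_replace($pattern, '', $html);
echo $result, "\n"; // text text text. <br>

// Hypothetical input with a tag nested inside the link: the pattern does not match,
// so the string comes back unchanged.
$nested = '<a href="#"><b>name</b></a>';
$unchanged = preg_replace($pattern, '', $nested);
echo $unchanged, "\n";
```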

Hannes