2

Hi, I'm very new to programming. I don't know how to write a PHP regular expression that inserts something between href=" and the text that follows it. How can I make this

<a class="aaa" href="/some/file.html">

look like

<a class="aaa" href="http://www.example.com/some/file.html">

It must match only links that have the class "aaa".

Can anybody help me?

Andy Lester
em.rexhepi

2 Answers

2

You'd better not even start trying to do this with regular expressions.

You should use a DOM parser for tasks like this. Simple HTML DOM, for example, makes your life really easy.

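// Load the Simple HTML DOM library file
include "simple_html_dom.php";

$html = new simple_html_dom();
$html->load($input); // $input holds the HTML string to rewrite

// Prepend the domain to the href of every <a> whose class attribute is "aaa"
foreach ($html->find('a[class=aaa]') as $link) {
    $link->href = "http://www.example.com" . $link->href;
}

$result = $html->save(); // serialize the modified DOM back to a string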

find lets you query the DOM very nicely. The parameter has the form tagtype[attributeName=attributeValue], where the part in square brackets is an optional filter. You then iterate over every link the function finds and prepend your domain to its href attribute.

If you cannot use 3rd-party libraries for some reason, PHP comes with a built-in DOM module. The code will not be quite as short and elegant, but it is still highly preferable to trying to come up with a robust regex.
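
For comparison, here is a rough sketch of what the built-in route could look like, using DOMDocument and DOMXPath. The function name prefixLinksWithDom and its parameters are made up for illustration, and it assumes the class name contains no quotes or other XPath-special characters. Treat it as a starting point under those assumptions rather than a finished solution.

```php
<?php
// Hypothetical helper (name and parameters are illustrative, not from the answer above).
// Prepends $base to the href of every <a> whose class list contains $class.
function prefixLinksWithDom(string $html, string $class, string $base): string
{
    $doc = new DOMDocument();

    // Collect libxml parse warnings internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // XPath 1.0 idiom for "class attribute contains this word".
    // Assumption: $class holds a plain class name without quotes.
    $xpath = new DOMXPath($doc);
    $query = sprintf(
        '//a[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );

    foreach ($xpath->query($query) as $link) {
        $link->setAttribute('href', $base . $link->getAttribute('href'));
    }

    // Serializes the whole document, including the html/body wrappers libxml adds.
    return $doc->saveHTML();
}

$input  = '<a class="aaa" href="/some/file.html">x</a>'
        . '<a class="bbb" href="/other.html">y</a>';
$result = prefixLinksWithDom($input, 'aaa', 'http://www.example.com');
echo $result;
```

Unlike a[class=aaa], the XPath expression here also matches links that carry additional classes next to aaa. Note that saveHTML() returns a full document, so a fragment comes back wrapped in html and body elements.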

Martin Ender
  • I really appreciate your help, and I know it will work once I know how to use it, but I can't put my code together with yours. Could you please take a quick look at the post below? I posted my code as an answer. Thank you very much, m.buettner – em.rexhepi Dec 14 '12 at 11:57
  • PS: I don't know how to include Simple HTML DOM in my file – em.rexhepi Dec 14 '12 at 13:20
  • This is a great example. I've added it to http://htmlparsing.com/php.html so future HTML parsers can see it. – Andy Lester Dec 14 '12 at 13:59
  • @user1898399 You download the PHP file from [here](http://sourceforge.net/projects/simplehtmldom/files/), put it into your project and call `include "simple_html_dom.php";` somewhere. Then you rename the variable in your code below to `$input` and copy my code after that. By the way, you can find the `example` div using `simple_html_dom`, too. No need for regex at all – Martin Ender Dec 14 '12 at 15:27
0

You could do it this way:

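$string = '<a class="aaa" href="/some/file.html">';
// Capture the href value that directly follows class="aaa" (the U modifier makes .* lazy)
$pattern = '~class="aaa" href="(.*)"~isU';

// Guard: $matches[1] only exists when the pattern matched
if (preg_match($pattern, $string, $matches)) {
    // Note: str_replace rewrites every occurrence of the captured path in $string
    $string = str_replace($matches[1], "http://www.example.com" . $matches[1], $string);
}

echo $string;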

Edited to match class="aaa", but I'd also recommend m.buettner's approach if you do things like this a lot.
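
If the input contains several links, one possible variation is to do the insertion in a single preg_replace_callback call instead of matching first and replacing afterwards. The sketch below is only that: the function name prefixAaaLinks is made up, and it assumes double-quoted attributes, a class attribute that is exactly "aaa", and class appearing before href inside the tag. Markup that deviates from this will not be matched, which is the usual argument for a DOM parser.

```php
<?php
// Hypothetical helper (name is illustrative). Requires PHP 7.4+ for the arrow function.
// Inserts $base right after href=" in every <a> tag that has class="aaa" before its href.
function prefixAaaLinks(string $html, string $base): string
{
    // Group 1 captures the tag from "<a" up to and including href=".
    // [^>]* cannot cross a ">", so the match stays inside one opening tag.
    $pattern = '~(<a\b[^>]*\bclass="aaa"[^>]*\bhref=")~i';

    // A callback avoids any special handling of "$" or "\" inside $base.
    $result = preg_replace_callback(
        $pattern,
        fn(array $m): string => $m[1] . $base,
        $html
    );

    // preg_replace_callback returns null on a PCRE error; fall back to the input.
    return $result ?? $html;
}

$string = '<a class="aaa" href="/some/file.html">x</a> '
        . '<a class="bbb" href="/other.html">y</a>';
echo prefixAaaLinks($string, 'http://www.example.com');
```

Only the first link is rewritten here; the one with class "bbb" is left as it is.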

Evo_x
  • That, again, will apply the regex to all links (not only the ones with class `aaa`). Plus, there is no need to match and capture anything after `href="` – Martin Ender Dec 13 '12 at 23:07
  • You're absolutely right, I kind of read past the "it is necessary to match links with "aaa" class." part. My fault. I also agree with you about Simple HTML DOM, I use it a lot. I just thought this task was simple enough that he could use regex. – Evo_x Dec 13 '12 at 23:10
  • Thank you for your reply, but I'm getting this message: Warning: preg_match() expects parameter 2 to be string, array given in – em.rexhepi Dec 14 '12 at 12:07