3

I have links on some pages that use an old system such as:

<a href='/app/?query=stuff_is_here'>This is a link</a>

They need to be converted to the new system which is like:

<a href='/newapp/?q=stuff+is+here'>This is a link</a>

I can use preg_replace to change some of what I need to, but I also need to replace the underscores in the query with +'s. My current code is:

//$content is the page html
$content = preg_replace('#(href)="http://www.site.com/app/?query=([^:"]*)(?:")#','$1="http://www.site.com/newapp/?q=$2"',$content);

What I want to do is run str_replace on the $2 variable, so I tried using preg_replace_callback, and could never get it to work. What should I do?

james
    *(related)* [Best Methods to parse HTML](http://stackoverflow.com/questions/3577641/best-methods-to-parse-html/3577662#3577662) – Gordon Sep 01 '11 at 10:41

4 Answers

3

You have to pass a valid callback as the second parameter: a function name, an anonymous function, etc.

Here is an example:

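function my_replace_callback($match) {
    // $match[1] is "href", $match[2] is the old query value
    $q = str_replace('_', '+', $match[2]);
    // Re-append the closing quote, which the pattern consumes
    return $match[1] . '="http://www.site.com/newapp/?q=' . $q . '"';
}
// \? matches the literal question mark; an unescaped ? would make the preceding / optional
$content = preg_replace_callback('#(href)="http://www\.site\.com/app/\?query=([^:"]*)"#', 'my_replace_callback', $content);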

Or, with an anonymous function (PHP 5.3 and later):

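// \? matches the literal question mark in the URL
$content = preg_replace_callback('#(href)="http://www\.site\.com/app/\?query=([^:"]*)"#', function($match) {
    $q = str_replace('_', '+', $match[2]);
    // Re-append the closing quote, which the pattern consumes
    return $match[1] . '="http://www.site.com/newapp/?q=' . $q . '"';
}, $content);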

You may also want to try an HTML parser instead of a regex: How do you parse and process HTML/XML in PHP?
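The sample links in the question are relative and single-quoted, while the pattern above expects an absolute URL in double quotes. As a minimal sketch of the same callback approach applied to the question's sample, the following assumes PHP 7 or later; the helper name `convert_links` and the choice to accept either quote style are illustrative assumptions, not part of the original answer.

```php
<?php
// Hypothetical helper: rewrites old-style /app/?query= links to /newapp/?q=.
function convert_links(string $content): string
{
    // Group 1: the opening quote (single or double); group 2: the query value.
    // The \1 backreference requires the closing quote to match the opening one.
    $pattern = '#href=(["\'])/app/\?query=([^"\']*)\1#';

    $result = preg_replace_callback($pattern, function (array $match): string {
        // Swap underscores for plus signs only inside the query value.
        $q = str_replace('_', '+', $match[2]);
        return 'href=' . $match[1] . '/newapp/?q=' . $q . $match[1];
    }, $content);

    // preg_replace_callback returns null on error; fall back to the input.
    return $result ?? $content;
}

$html = "<a href='/app/?query=stuff_is_here'>This is a link</a>";
echo convert_links($html), "\n";
```

Because the replacement happens inside the callback, underscores elsewhere in the page are left alone.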

Arnaud Le Blanc
3

Parsing the document with DOM, finding all the "a" tags, and rewriting their href attributes could be a good approach. Someone already posted a link in the comments showing that regex isn't always the best way to work with HTML.

Anyway, this code should work:

<?php
$dom = new DOMDocument;
// $html contains your HTML string
$dom->loadHTML($html);
foreach ($dom->getElementsByTagName('a') as $node) {
    // look for the href attribute
    if ($node->hasAttribute('href')) {
        $href = $node->getAttribute('href');
        // change the href's value; single quotes keep $1 as a backreference
        // (in a double-quoted string, "\1" is an octal escape, not a backreference)
        $node->setAttribute('href', preg_replace('#/app/\?query=(.*)#', '/newapp/?q=$1', $href));
    }
}
// save the new HTML
$newHTML = $dom->saveHTML();
?>

Note that I did this with preg_replace, but it can also be done with str_ireplace or str_replace:

$newHref = str_ireplace("/app/?query=", "/newapp/?q=", $href);
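The DOM code above rewrites the path but leaves the underscores in the query untouched. A minimal sketch that combines the DOM walk with the underscore replacement might look like the following. It assumes PHP 7 or later and that the old links are relative and start with `/app/?query=`; the helper name `convert_links_dom` is hypothetical.

```php
<?php
// Hypothetical helper combining the DOM walk with the underscore replacement.
function convert_links_dom(string $html): string
{
    $dom = new DOMDocument;
    // Suppress the warnings loadHTML emits for imperfect markup.
    @$dom->loadHTML($html);

    $prefix = '/app/?query=';
    foreach ($dom->getElementsByTagName('a') as $node) {
        // getAttribute returns an empty string when href is missing.
        $href = $node->getAttribute('href');
        // Only touch links that start with the old prefix (an assumption).
        if (strpos($href, $prefix) === 0) {
            $query = substr($href, strlen($prefix));
            $node->setAttribute('href', '/newapp/?q=' . str_replace('_', '+', $query));
        }
    }
    return $dom->saveHTML();
}

echo convert_links_dom("<a href='/app/?query=stuff_is_here'>This is a link</a>");
```

Note that saveHTML() serializes the whole document, so a bare fragment comes back wrapped in a doctype and html/body elements.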
CaNNaDaRk
0

Or you can simply use preg_match() and collect the matched strings. Then apply str_replace() to one of the matches to replace "_" with "+".

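// Capture the link in pieces: [1] prefix, [2] app name, [3] separator, [4] query value
if (preg_match('#(href="/)([^/]+)(/\?query=)([^:"]+)#', $content, $matches)) {
    $matches[2] = 'newapp';
    $matches[3] = '/?q=';
    $matches[4] = str_replace('_', '+', $matches[4]);
    // Skip $matches[0] (the full match) and join the rewritten pieces
    $result = implode('', array_slice($matches, 1));
}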
s.webbandit
0

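Pass arrays to preg_replace as the pattern and the replacement:

// Each pattern needs delimiters; note that '|_|' replaces every underscore in $content
$content = preg_replace(array('|/app/\?query=|', '|_|'), array('/newapp/?q=', '+'), $content);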
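One caveat with the array form is that each pattern is applied to the whole string, so underscores outside the links are replaced as well. A small sketch of that behavior follows; the sample string is made up for illustration.

```php
<?php
// Made-up sample: an underscore appears both in the link and in the link text.
$content = '<a href="/app/?query=stuff_is_here">snake_case</a>';

// Each pattern is applied in turn to the entire string.
$result = preg_replace(
    array('|/app/\?query=|', '|_|'),
    array('/newapp/?q=', '+'),
    $content
);

echo $result, "\n";
// The link text is affected too: "snake_case" becomes "snake+case".
```

If only the query value should change, the callback or DOM approaches from the other answers limit the replacement to that part of the link.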
knittl